🛁 Clean Code concepts adapted for machine learning and data science
TimothySeahGit/clean-code-ml
- Introduction
- Variables
  - Variable names should reveal intent
  - Use meaningful and pronounceable variable names
  - Use the same vocabulary for the same type of variable
  - Avoid magic numbers and magic strings
  - Use variables to keep code "DRY" ("Don't Repeat Yourself")
  - Use explanatory variables
  - Avoid mental mapping
  - Don't add unneeded context
- Dispensables
  - Avoid comments
  - Remove dead code
  - Avoid print statements (even glorified print statements such as df.head(), df.describe(), df.plot())
- Functions
  - Use functions to keep code "DRY"
  - Functions should do one thing
  - Functions should only be one level of abstraction
  - Function names should say what they do
  - Use type hints to improve readability
  - Avoid side effects
  - Avoid unexpected side effects on values passed as function parameters
  - Function arguments (2 or fewer ideally)
  - Use default arguments instead of short circuiting or conditionals
  - Don't use flags as function parameters
- Design
  - Avoid exposing your internals (Keep implementation details hidden)
Clean code practices (from Clean Code and Refactoring) adapted for machine learning / data science workflows in Python. This is not a style guide. It's a guide to producing readable, reusable, and refactorable software.
If you’ve tried your hand at machine learning or data science, you would know that code can get messy, quickly.
Unclean code adds to complexity by making code difficult to read and modify. As a consequence, changing code to respond to business needs becomes increasingly difficult, and sometimes even impossible. This has been written about extensively in several languages, and even in Python (e.g. Clean Code, Refactoring, clean-code-python). In this repo, we have adapted these principles for data science / machine learning codebases.
Targets Python 3.7+.
Inspired by clean-code-javascript and forked from clean-code-python.
If you'd like to try out these practices, we've created a refactoring exercise that you can follow along with. Starting with a Jupyter notebook with many code smells, you can apply these clean code principles and refactor it to be readable and maintainable. The sample final solution can be found in src/train.py.
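As a rough illustration of the kind of change these principles lead to, here is a minimal sketch. It is not taken from the refactoring exercise or from src/train.py; the names `MIN_ADULT_AGE`, `keep_adults`, `raw_ages`, and `adult_ages` are hypothetical and chosen only for this example. It combines three items from the list above: naming a magic number, using type hints, and avoiding side effects on values passed as function parameters.

```python
from typing import List

# Hypothetical constant: naming the threshold replaces a bare "18" in the code.
MIN_ADULT_AGE = 18


def keep_adults(ages: List[int], min_age: int = MIN_ADULT_AGE) -> List[int]:
    """Return a new list of the ages at or above min_age.

    The input list is left untouched, so callers see no side effects.
    """
    return [age for age in ages if age >= min_age]


raw_ages = [12, 18, 35, 7, 64]
adult_ages = keep_adults(raw_ages)

print(adult_ages)  # the filtered copy
print(raw_ages)    # the original list, unchanged
```

Because the function returns a new list instead of filtering in place, the original data stays available for later steps, which tends to matter in notebooks where cells are re-run out of order.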