jiyeonseo/clean-code-ml

Forked from davified/clean-code-ml

🛁 Clean Code concepts adapted for machine learning and data science

License

Apache-2.0 license


clean-code-ml

Introduction
Dev Productivity
- Use Docker and stop hearing "Works on my machine!"
- Ensure reproducibility
- VS Code productivity tips (or "Know Your IDE")
Variables
- Variable names should reveal intent
- Use meaningful and pronounceable variable names
- Use the same vocabulary for the same type of variable
- Avoid magic numbers and magic strings
- Use variables to keep code "DRY" ("Don't Repeat Yourself")
- Use explanatory variables
- Avoid mental mapping
- Don't add unneeded context
Dispensables
- Avoid comments
- Remove dead code
- Avoid print statements (even glorified print statements such as df.head(), df.describe(), df.plot())
Functions
- Use functions to keep code "DRY"
- Functions should do one thing
- Functions should only be one level of abstraction
- Function names should say what they do
- Use type hints to improve readability
- Avoid side effects
- Avoid unexpected side effects on values passed as function parameters
- Function arguments (2 or fewer ideally)
- Use default arguments instead of short circuiting or conditionals
- Don't use flags as function parameters
Design
- Avoid exposing your internals (Keep implementation details hidden)

Introduction

Clean code practices (from Clean Code and Refactoring) adapted for machine learning / data science workflows in Python. This is not a style guide. It's a guide to producing readable, reusable, and refactorable software.

If you've tried your hand at machine learning or data science, you know that code can get messy quickly.

Unclean code adds to complexity by making code difficult to read and modify. As a consequence, changing code to respond to business needs becomes increasingly difficult, and sometimes even impossible. This has been written about extensively in several languages, and even in Python (e.g. Clean Code, Refactoring, clean-code-python). In this repo, we have adapted these principles for data science / machine learning codebases.

Targets Python 3.7+

Inspired by clean-code-javascript and forked from clean-code-python.

The 5 S's of Clean Code

By James O. Coplien (source: foreword of Clean Code by Robert C. Martin)

In about 1951, a quality approach called Total Productive Maintenance (TPM) came on the Japanese scene. Its focus is on maintenance rather than on production. One of the major pillars of TPM is the set of so-called 5S principles:

- Seiri, or organization (think "sort" in English). Knowing where things are, using approaches such as suitable naming, is crucial.
- Seiton, or tidiness (think "systematize" in English). There is an old American saying: A place for everything, and everything in its place. A piece of code should be where you expect to find it; if it is not, you should re-factor to get it there.
- Seiso, or cleaning (think "shine" in English): Keep the workplace free of hanging wires, grease, scraps, and waste. What do the authors here say about littering your code with comments and commented-out code lines that capture history or wishes for the future? Get rid of them.
- Seiketsu, or standardization: The group agrees about how to keep the workplace clean. Have a consistent coding style and set of practices within the group.
- Shutsuke, or discipline (self-discipline). This means having the discipline to follow the practices and to frequently reflect on one's work and be willing to change.
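
As a rough illustration of Seiri (suitable naming) and Seiso (removing commented-out code), here is a minimal before/after sketch. The function names, the age data, and the `MAX_PLAUSIBLE_AGE` threshold are all hypothetical and are not taken from this repo's exercise.

```python
from statistics import mean
from typing import Sequence


# Before: cryptic names, a magic number, and commented-out history.
def proc(d):
    # d2 = [x for x in d if x > 0]  # old filter, kept "just in case"
    r = [x for x in d if x < 120]
    return sum(r) / len(r)


# After: names reveal intent, the magic number has a name,
# and the dead code is gone (version control keeps the history).
MAX_PLAUSIBLE_AGE = 120  # hypothetical threshold, for illustration only


def mean_plausible_age(ages: Sequence[float]) -> float:
    plausible_ages = [age for age in ages if age < MAX_PLAUSIBLE_AGE]
    return mean(plausible_ages)
```

Both functions compute the same value for the same input; only the second tells the reader what that value means.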

Hands-on Exercise

If you'd like to try out these practices, we've created a refactoring exercise that you can follow along with. Starting with a Jupyter notebook with many code smells, you can apply these clean code principles and refactor it to be readable and maintainable. The sample final solution can be found in src/train.py.
