Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!
rubicon-ml is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a repeatable and searchable way. Its `git` integration associates these inputs and outputs directly with the model code that produced them to ensure full auditability and reproducibility for both developers and stakeholders alike. While experimenting, the dashboard makes it easy to explore, filter, visualize, and share recorded work.
p.s. If you're looking for Rubicon, the Java/ObjC Python bridge, visit this instead.
rubicon-ml is composed of three parts:
- A Python library for storing and retrieving model inputs, outputs, and analyses to filesystems that's powered by `fsspec`
- A dashboard for exploring, comparing, and visualizing logged data built with `dash`
- And a process for sharing a selected subset of logged data with collaborators or reviewers that leverages `intake`
Use `rubicon_ml` to capture model inputs and outputs over time. It can be easily integrated into existing Python models or pipelines and supports both concurrent logging (so multiple experiments can be logged in parallel) and asynchronous communication with S3 (so network reads and writes won't block).
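For example, because logging is powered by `fsspec`, pointing `root_dir` at an S3 path stores everything remotely instead of on local disk. A minimal sketch, assuming a bucket you control (the `my-bucket` name here is hypothetical) and the S3 extras installed:

```python
from rubicon_ml import Rubicon

# Hypothetical bucket/prefix; swap in an S3 location you can write to.
# Logs are written through fsspec, so an S3 path works anywhere a local
# path does.
rubicon = Rubicon(
    persistence="filesystem",
    root_dir="s3://my-bucket/rubicon-root",
)

project = rubicon.create_project("Hello World, S3 Edition")
```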
Meanwhile, periodically review the logged data within the Rubicon dashboard to steer the model tweaking process in the right direction. The dashboard lets you quickly spot trends by exploring and filtering your logged results and visualizes how the model inputs impacted the model outputs.
When the model is ready for review, Rubicon makes it easy to share specific subsets of the data with model reviewers and stakeholders, giving them the context necessary for a complete model review and approval.
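As a rough sketch of what sharing can look like, the snippet below assumes the `intake`-based `publish` helper and a previously logged "Hello World" project; check the docs for the exact catalog workflow and signature:

```python
from rubicon_ml import Rubicon, publish

rubicon = Rubicon(persistence="filesystem", root_dir="/rubicon-root")
project = rubicon.get_project("Hello World")

# Write an intake catalog describing the selected experiments. Reviewers
# can load the catalog to pull just this subset of the logged data.
publish(project.experiments(), output_filepath="./rubicon-ml-catalog.yml")
```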
Check out the interactive notebooks in this Binder to try `rubicon_ml` for yourself.
Here's a simple example:
```python
from rubicon_ml import Rubicon

rubicon = Rubicon(
    persistence="filesystem",
    root_dir="/rubicon-root",
    auto_git_enabled=True,
)
project = rubicon.create_project(
    "Hello World",
    description="Using rubicon to track model results over time.",
)

experiment = project.log_experiment(
    training_metadata=[SklearnTrainingMetadata("sklearn.datasets", "my-data-set")],
    model_name="My Model Name",
    tags=["my_model_name"],
)

experiment.log_parameter("n_estimators", n_estimators)
experiment.log_parameter("n_features", n_features)
experiment.log_parameter("random_state", random_state)

accuracy = rfc.score(X_test, y_test)
experiment.log_metric("accuracy", accuracy)
```
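The snippet above assumes a trained model and a few variables defined elsewhere (`SklearnTrainingMetadata`, `rfc`, `n_estimators`, and so on). A minimal, self-contained version of that surrounding setup might look like this; the namedtuple and random forest are illustrative, not part of the `rubicon_ml` API:

```python
from collections import namedtuple

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative container describing where the training data came from.
SklearnTrainingMetadata = namedtuple("SklearnTrainingMetadata", "module_name method")

n_estimators = 100
n_features = 10
random_state = 42

X, y = make_classification(n_features=n_features, random_state=random_state)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=random_state)

rfc = RandomForestClassifier(n_estimators=n_estimators, random_state=random_state)
rfc.fit(X_train, y_train)
```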
Then explore the project by running the dashboard:
```
rubicon_ml ui --root-dir /rubicon-root
```
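Logged data can also be read back programmatically. A minimal sketch, assuming the "Hello World" project from the example above has been logged to `/rubicon-root`:

```python
from rubicon_ml import Rubicon

rubicon = Rubicon(persistence="filesystem", root_dir="/rubicon-root")
project = rubicon.get_project("Hello World")

# Print each experiment's logged parameters and metrics.
for experiment in project.experiments():
    parameters = {p.name: p.value for p in experiment.parameters()}
    metrics = {m.name: m.value for m in experiment.metrics()}
    print(experiment.id, parameters, metrics)
```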
For a full overview, visit the docs. If you have suggestions or find a bug, please open an issue.
The Python library is available on Conda Forge via `conda` and PyPI via `pip`.
```
conda config --add channels conda-forge
conda install rubicon-ml
```
or
```
pip install rubicon-ml
```
The project uses conda to manage environments. First, install conda. Then use conda to set up a development environment:
```
conda env create -f environment.yml
conda activate rubicon-ml-dev
```
Finally, install `rubicon_ml` locally into the newly created environment:

```
pip install -e ".[all]"
```
The tests are separated into unit and integration tests. They can be run directly in the activated dev environment via `pytest tests/unit` or `pytest tests/integration`, or all at once by simply running `pytest`.
Note: some integration tests are intentionally marked to control when they are run (i.e., not during CI/CD). These tests include:
- Integration tests that write to physical filesystems (local and S3). Local files will be written to `./test-rubicon` relative to where the tests are run. An S3 path must also be provided to run these tests. By default, these tests are disabled. To enable them, run:

  ```
  pytest -m "write_files" --s3-path "s3://my-bucket/my-key"
  ```
- Integration tests that run Jupyter notebooks. These tests are a bit slower than the rest of the tests in the suite as they need to launch Jupyter servers. By default, they are enabled. To disable them, run:

  ```
  pytest -m "not run_notebooks and not write_files"
  ```
Note: when simply running `pytest`, `-m "not write_files"` is the default, so we need to also apply it when disabling notebook tests.
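For context, a marked integration test looks roughly like the sketch below; the test body and fixture are illustrative, only the `pytest.mark` usage matters:

```python
import pytest

# Tests carrying this mark are excluded from the default run and only
# execute when selected explicitly, e.g. pytest -m "write_files" ...
@pytest.mark.write_files
def test_logging_writes_to_local_filesystem(tmp_path):
    output_file = tmp_path / "example.txt"
    output_file.write_text("logged")

    assert output_file.read_text() == "logged"
```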
Install and configure pre-commit to automatically run `black`, `flake8`, and `isort` during commits:
- install pre-commit
- run `pre-commit install` to set up the git hook scripts
Now `pre-commit` will run automatically on git commit and will ensure consistent code format throughout the project. You can format without committing via `pre-commit run` or skip these checks with `git commit --no-verify`.