mlflow/mlflow

Open source platform for the machine learning lifecycle

License

Apache-2.0 license


MLflow: A Machine Learning Lifecycle Platform

MLflow is an open-source platform, purpose-built to assist machine learning practitioners and teams in handling the complexities of the machine learning process. MLflow focuses on the full lifecycle for machine learning projects, ensuring that each phase is manageable, traceable, and reproducible.

The core components of MLflow are:

Experiment Tracking 📝: A set of APIs to log models, params, and results in ML experiments and compare them using an interactive UI.
Model Packaging 📦: A standard format for packaging a model and its metadata, such as dependency versions, ensuring reliable deployment and strong reproducibility.
Model Registry 💾: A centralized model store, set of APIs, and UI, to collaboratively manage the full lifecycle of MLflow Models.
Serving 🚀: Tools for seamless model deployment to batch and real-time scoring on platforms like Docker, Kubernetes, Azure ML, and AWS SageMaker.
Evaluation 📊: A suite of automated model evaluation tools, seamlessly integrated with experiment tracking to record model performance and visually compare results across multiple models.
Observability 🔍: Tracing integrations with various GenAI libraries and a Python SDK for manual instrumentation, offering a smoother debugging experience and supporting online monitoring.

Installation

To install the MLflow Python package, run the following command:

pip install mlflow

Alternatively, you can install MLflow from different package hosting platforms:

PyPI
conda-forge
CRAN
Maven Central
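
After installing the Python package, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of MLflow.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("mlflow")
    if found is None:
        print("mlflow is not installed in this environment")
    else:
        print(f"mlflow {found} is installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that want to decide whether to run `pip install mlflow`.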

Documentation 📘

Official documentation for MLflow can be found here.

Running Anywhere 🌐

You can run MLflow in many different environments, including local development, Amazon SageMaker, AzureML, and Databricks. Please refer to this guidance for how to set up MLflow in your environment.

Usage

Experiment Tracking (Doc)

The following example trains a simple regression model with scikit-learn while enabling MLflow's autologging feature for experiment tracking.

import mlflow
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Enable MLflow's automatic experiment tracking for scikit-learn
mlflow.sklearn.autolog()

# Load the training dataset
db = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)

rf = RandomForestRegressor(n_estimators=100, max_depth=6, max_features=3)
# MLflow triggers logging automatically upon model fitting
rf.fit(X_train, y_train)

Once the above code finishes, run the following command in a separate terminal and access the MLflow UI via the printed URL. An MLflow Run should be automatically created, which tracks the training dataset, hyperparameters, performance metrics, the trained model, dependencies, and more.

mlflow ui

Serving Models (Doc)

You can deploy the logged model to a local inference server with a one-line command using the MLflow CLI. Visit the documentation for how to deploy models to other hosting platforms.

mlflow models serve --model-uri runs:/<run-id>/model
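
Once the server is running, a client can score data by sending it a JSON request. The sketch below builds such a request with the standard library only and does not send it. It assumes the server listens on `127.0.0.1:5000` and accepts a JSON body with an `inputs` key at the `/invocations` endpoint; the helper name `build_scoring_request` and the sample values are hypothetical, so check the serving documentation for the payload format that matches your model's signature.

```python
import json
import urllib.request

# Assumption: the server started by `mlflow models serve` listens on
# 127.0.0.1:5000 and accepts JSON scoring requests at /invocations.
SERVER_URL = "http://127.0.0.1:5000/invocations"


def build_scoring_request(rows, url=SERVER_URL):
    """Build (but do not send) a POST request that asks the server to score `rows`."""
    payload = json.dumps({"inputs": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One made-up sample with 10 features, matching the width of the diabetes dataset.
sample = [[0.04, 0.05, 0.06, 0.02, -0.04, -0.03, -0.04, 0.0, 0.02, -0.02]]
request = build_scoring_request(sample)

# To query a running server, uncomment the lines below:
# with urllib.request.urlopen(request) as resp:
#     print(json.loads(resp.read()))
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.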

Evaluating Models (Doc)

The following example runs automatic evaluation for question-answering tasks with several built-in metrics.

import mlflow
import pandas as pd

# Evaluation set contains (1) input question (2) model outputs (3) ground truth
df = pd.DataFrame(
    {
        "inputs": ["What is MLflow?", "What is Spark?"],
        "outputs": [
            "MLflow is an innovative fully self-driving airship powered by AI.",
            "Sparks is an American pop and rock duo formed in Los Angeles.",
        ],
        "ground_truth": [
            "MLflow is an open-source platform for managing the end-to-end machine learning (ML) "
            "lifecycle.",
            "Apache Spark is an open-source, distributed computing system designed for big data "
            "processing and analytics.",
        ],
    }
)
eval_dataset = mlflow.data.from_pandas(
    df, predictions="outputs", targets="ground_truth"
)

# Start an MLflow Run to record the evaluation results to
with mlflow.start_run(run_name="evaluate_qa"):
    # Run automatic evaluation with a set of built-in metrics for question-answering models
    results = mlflow.evaluate(
        data=eval_dataset,
        model_type="question-answering",
    )

print(results.tables["eval_results_table"])
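
To make concrete what a simple question-answering metric such as exact match measures, here is a plain-Python sketch applied to the same outputs and ground truth as above. It is an illustration only, not MLflow's implementation: the function name `exact_match` and the normalization (trimming whitespace and lowercasing) are assumptions, and MLflow's built-in metrics may process text differently.

```python
def exact_match(predictions, targets):
    """Fraction of predictions equal to their target after light normalization."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        p.strip().lower() == t.strip().lower()
        for p, t in zip(predictions, targets)
    )
    return hits / len(predictions)


outputs = [
    "MLflow is an innovative fully self-driving airship powered by AI.",
    "Sparks is an American pop and rock duo formed in Los Angeles.",
]
ground_truth = [
    "MLflow is an open-source platform for managing the end-to-end machine learning (ML) "
    "lifecycle.",
    "Apache Spark is an open-source, distributed computing system designed for big data "
    "processing and analytics.",
]

score = exact_match(outputs, ground_truth)
print(f"exact match: {score}")
```

Neither model output matches its reference here, so the score is 0.0, which is the kind of signal the evaluation table is meant to surface.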

Observability (Doc)

MLflow Tracing provides LLM observability for various GenAI libraries such as OpenAI, LangChain, LlamaIndex, DSPy, AutoGen, and more. To enable auto-tracing, call mlflow.xyz.autolog(), where xyz stands for the library being traced (for example, mlflow.openai.autolog()), before running your models. Refer to the documentation for customization and manual instrumentation.

import mlflow
from openai import OpenAI

# Enable tracing for OpenAI
mlflow.openai.autolog()

# Query OpenAI LLM normally
response = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi!"}],
    temperature=0.1,
)

Then navigate to the "Traces" tab in the MLflow UI to find the trace recorded for the OpenAI query.

Community

For help or questions about MLflow usage (e.g. "how do I do X?"), visit the docs or Stack Overflow.
Alternatively, you can ask our AI-powered chat bot. Visit the doc website and click on the "Ask AI" button at the bottom right to start chatting with the bot.
To report a bug, file a documentation issue, or submit a feature request, please open a GitHub issue.
For release announcements and other discussions, please subscribe to our mailing list (mlflow-users@googlegroups.com) or join us on Slack.

Contributing

We happily welcome contributions to MLflow! We are also seeking contributions to items on the MLflow Roadmap. Please see our contribution guide to learn more about contributing to MLflow.

Citation

If you use MLflow in your research, please cite it using the "Cite this repository" button at the top of the GitHub repository page, which will provide you with citation formats including APA and BibTeX.

Core Members

MLflow is currently maintained by the following core members with significant contributions from hundreds of exceptionally talented community members.