English | 中文
Starwhale is an MLOps/LLMOps platform that brings efficiency and standardization to machine learning operations. It streamlines the model development lifecycle, enabling teams to optimize their workflows around key areas like model building, evaluation, release and fine-tuning.
Starwhale meets diverse deployment needs with three flexible configurations:
- 🐥 Standalone - Deployed in a local development environment and managed by the `swcli` command-line tool, meeting development and debugging needs.
- 🦅 Server - Deployed in a private data center, relying on a Kubernetes cluster, providing centralized, web-based, and secure services.
- 🦉 Cloud - Hosted on a public cloud at https://cloud.starwhale.cn. The Starwhale team is responsible for maintenance, and no installation is required. You can start using it after registering an account.
At its core, Starwhale abstracts Model, Runtime and Dataset as first-class citizens - providing the fundamentals for streamlined operations. Starwhale further delivers tailored capabilities for common workflow scenarios including:
- 🔥 Model Evaluation - Implement robust, production-scale evaluations with minimal coding through the Python SDK.
- 🌟 Live Demo - Interactively assess model performance through user-friendly web interfaces.
- 🌊 LLM Fine-tuning - End-to-end toolchain from efficient fine-tuning to comparative benchmarking and publishing.
Starwhale is also an open-source platform, using the Apache-2.0 license. The Starwhale framework is designed for clarity and ease of use, empowering developers to build customized MLOps features tailored to their needs.
Starwhale Dataset offers efficient data storage, loading, and visualization capabilities, making it a dedicated data management tool tailored for machine learning and deep learning.
```python
import torch
from starwhale import dataset, Image

# build dataset for starwhale cloud instance
with dataset("https://cloud.starwhale.cn/project/starwhale:public/dataset/test-image", create="empty") as ds:
    for i in range(100):
        ds.append({"image": Image(f"{i}.png"), "label": i})
    ds.commit()

# load dataset
ds = dataset("https://cloud.starwhale.cn/project/starwhale:public/dataset/test-image")
print(len(ds))
print(ds[0].features.image.to_pil())
print(ds[0].features.label)

torch_ds = ds.to_pytorch()
torch_loader = torch.utils.data.DataLoader(torch_ds, batch_size=5)
print(next(iter(torch_loader)))
```
Starwhale Model is a standard format for packaging machine learning models that can be used for various purposes, like model fine-tuning, model evaluation, and online serving. A Starwhale Model contains the model file, inference codes, configuration files, and any other files required to run the model.
```bash
# model build
swcli model build . --module mnist.evaluate --runtime pytorch/version/v1 --name mnist

# model copy from standalone to cloud
swcli model cp mnist https://cloud.starwhale.cn/project/starwhale:public

# model run
swcli model run --uri mnist --runtime pytorch --dataset mnist
swcli model run --workdir . --module mnist.evaluator --handler mnist.evaluator:MNISTInference.cmp
```
Starwhale Runtime aims to provide a reproducible and shareable running environment for Python programs. You can easily share your working environment with your teammates or outsiders, and vice versa. Furthermore, you can run your programs on Starwhale Server or Starwhale Cloud without bothering with the dependencies.
```bash
# build from runtime.yaml, conda env, docker image or shell
swcli runtime build --yaml runtime.yaml
swcli runtime build --conda pytorch --name pytorch-runtime --cuda 11.4
swcli runtime build --docker pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime
swcli runtime build --shell --name pytorch-runtime

# runtime activate
swcli runtime activate pytorch

# integrated with model and dataset
swcli model run --uri test --runtime pytorch
swcli model build . --runtime pytorch
swcli dataset build --runtime pytorch
```
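Built runtimes can be shared the same way models are. A minimal sketch, assuming the `runtime cp` subcommand follows the same copy pattern as the `model cp` command shown above, with the public cloud project from the earlier examples reused as the target:

```bash
# push a locally built runtime to a Server/Cloud project so teammates can reuse it
# (target project reused from the examples above; replace it with your own)
swcli runtime cp pytorch https://cloud.starwhale.cn/project/starwhale:public
```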
Starwhale Evaluation enables users to evaluate sophisticated, production-ready distributed models by writing just a few lines of code with Starwhale Python SDK.
```python
import typing as t

import gradio

from starwhale import Image, evaluation
from starwhale.api.service import api


def model_generate(image):
    ...
    return predict_value, probability_matrix


@evaluation.predict(
    resources={"nvidia.com/gpu": 1},
    replicas=4,
)
def predict_image(data: dict, external: dict) -> None:
    return model_generate(data["image"])


@evaluation.evaluate(use_predict_auto_log=True, needs=[predict_image])
def evaluate_results(predict_result_iter: t.Iterator):
    for _data in predict_result_iter:
        ...
    evaluation.log_summary({"accuracy": 0.95, "benchmark": "test"})


@api(gradio.File(), gradio.Label())
def predict_view(file: t.Any) -> t.Any:
    with open(file.name, "rb") as f:
        data = Image(f.read(), shape=(28, 28, 1))
    _, prob = predict_image({"image": data})
    return {i: p for i, p in enumerate(prob)}
```
Starwhale Fine-tuning provides a full workflow for Large Language Model (LLM) tuning, including batch model evaluation, live demo and model release capabilities. The Starwhale fine-tuning Python SDK is simple to use.
```python
import typing as t

from starwhale import finetune, Dataset
from transformers import Trainer


@finetune(
    resources={"nvidia.com/gpu": 4, "memory": "32G"},
    require_train_datasets=True,
    require_validation_datasets=True,
    model_modules=["evaluation", "finetune"],
)
def lora_finetune(train_datasets: t.List[Dataset], val_datasets: t.List[Dataset]) -> None:
    # init model and tokenizer
    trainer = Trainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=train_datasets[0].to_pytorch(),  # convert Starwhale Dataset into Pytorch Dataset
        eval_dataset=val_datasets[0].to_pytorch(),
    )
    trainer.train()
    trainer.save_state()
    trainer.save_model()  # save weights, then Starwhale SDK will package them into Starwhale Model
```
Requirements: Python 3.7~3.11 on Linux or macOS.
```bash
python3 -m pip install starwhale
```
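To sanity-check the install, assuming the CLI exposes the standard `--version` and `--help` flags (the exact output varies by version):

```bash
# print the installed swcli version and list the available subcommands
swcli --version
swcli --help
```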
Starwhale Server is delivered as a Docker image, which can be run with Docker directly or deployed to a Kubernetes cluster. For a laptop environment, the `swcli server start` command is an appropriate choice; it depends on Docker and Docker Compose.

```bash
swcli server start
```
We use MNIST as the hello world example to show the basic Starwhale Model workflow.
- Use your own Python environment: follow the Standalone quickstart doc.
- Use the Google Colab environment: follow the Jupyter notebook example.
- Run it in your private Starwhale Server instance: please read the Server installation (minikube) and Server quickstart docs.
- Run it in the Starwhale Cloud: please read the Cloud quickstart doc.
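For a rough end-to-end sketch of the Standalone path, the commands below reuse the ones from the Model and Runtime sections; the example layout (example/mnist) and module names are assumptions, so treat the quickstart docs above as authoritative:

```bash
# fetch the example code (assumed to live under example/mnist in this repository)
git clone https://github.com/star-whale/starwhale.git
cd starwhale/example/mnist

# build the runtime and the model, reusing the commands shown earlier
swcli runtime build --yaml runtime.yaml
swcli model build . --module mnist.evaluate --runtime pytorch/version/v1 --name mnist

# build the dataset (exact options depend on the example version), then run the evaluation
swcli dataset build --name mnist
swcli model run --uri mnist --runtime pytorch --dataset mnist
```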
🚀 LLM:
- 🐊 OpenSource LLMs Leaderboard: Evaluation, Code
- 🐢 Llama2: Run llama2 chat in five minutes, Code
- 🦎 Stable Diffusion: Cloud Demo, Code
- 🦙 LLAMA: evaluation and fine-tune
- 🎹 Text-to-Music: Cloud Demo, Code
- 🍏 Code Generation: Cloud Demo, Code
🌋 Fine-tuning:
- 🐏 Baichuan2: Cloud Demo, Code
- 🐫 ChatGLM3: Cloud Demo, Code
- 🦏 Stable Diffusion: Cloud Demo, Code
🦦 Image Classification:
- 🐻❄️ MNIST: Cloud Demo, Code
- 🦫 CIFAR10
- 🦓 Vision Transformer (ViT): Cloud Demo, Code
🐃 Image Segmentation:
- Segment Anything (SAM): Cloud Demo, Code
🐦 Object Detection:
- 🦊 YOLO: Cloud Demo, Code
- 🐯 Pedestrian Detection
📽️ Video Recognition: UCF101
🦋 Machine Translation: Neural machine translation
🐜 Text Classification: AG News
🎙️ Speech Recognition: Speech Command
Visit the Starwhale HomePage.
More information in the official documentation.
For general questions and support, join the Slack.
For bug reports and feature requests, please use GitHub Issues.
To get community updates, follow @starwhaleai on Twitter.
For Starwhale artifacts, please visit:
- Python Package on PyPI.
- Helm Charts on Artifact Hub.
- Docker Images on Docker Hub, GitHub Packages and Starwhale Registry.
Additionally, you can always find us at developer@starwhale.ai.
🌼👏 PRs are always welcome 👍🍺. See Contribution to Starwhale for more details.
Starwhale is licensed under the Apache License 2.0.