A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in python
🚀 Version 2.9.0 out now! Read the release notes here.
`skpro` is a library for supervised probabilistic prediction in python. It provides `scikit-learn`-like, `scikit-base` compatible interfaces to:
- tabular supervised regressors for probabilistic prediction - interval, quantile and distribution predictions
- tabular probabilistic time-to-event and survival prediction - instance-individual survival distributions
- metrics to evaluate probabilistic predictions, e.g., pinball loss, empirical coverage, CRPS, survival losses
- reductions to turn `scikit-learn` regressors into probabilistic `skpro` regressors, such as bootstrap or conformal
- building pipelines and composite models, including tuning via probabilistic performance metrics
- symbolic probability distributions with value domain of `pandas.DataFrame`-s and `pandas`-like interface
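To make two of the metrics named in the list concrete, here is a minimal pure-Python sketch of the pinball loss (for quantile predictions) and empirical coverage (for interval predictions). The function names `pinball_loss` and `empirical_coverage` and the toy numbers are illustrative assumptions, not part of the `skpro` API; the metric classes shipped with `skpro` operate on `pandas` objects and may aggregate differently.

```python
def pinball_loss(y_true, y_quantile, alpha):
    """Average pinball (quantile) loss of quantile predictions at level alpha."""
    total = 0.0
    for y, q in zip(y_true, y_quantile):
        diff = y - q
        # under-prediction is weighted by alpha, over-prediction by (1 - alpha)
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their predicted interval."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


# toy data (hypothetical): four observations with median and interval predictions
y_true = [1.0, 2.0, 3.0, 4.0]
q50 = [1.5, 1.5, 3.0, 5.0]
lower = [0.0, 2.5, 2.0, 3.0]
upper = [2.0, 3.5, 4.0, 5.0]

loss = pinball_loss(y_true, q50, alpha=0.5)
coverage = empirical_coverage(y_true, lower, upper)
```

A well-calibrated 90% prediction interval should have empirical coverage close to 0.9 on held-out data; the pinball loss is lower for better quantile predictions.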
Documentation | |
---|---|
⭐ Tutorials | New to skpro? Here's everything you need to know! |
📋 Binder Notebooks | Example notebooks to play with in your browser. |
👩‍💻 User Guides | How to use skpro and its features. |
✂️ Extension Templates | How to build your own estimator using skpro's API. |
🎛️ API Reference | The detailed reference for skpro's API. |
🛠️ Changelog | Changes and version history. |
🌳 Roadmap | skpro's software and community development plan. |
📝 Related Software | A list of related software. |
Questions and feedback are extremely welcome! We strongly believe in the value of sharing help publicly, as it allows a wider audience to benefit from it.
`skpro` is maintained by the `sktime` community, so we use the same social channels.
Type | Platforms |
---|---|
🐛 Bug Reports | GitHub Issue Tracker |
✨ Feature Requests & Ideas | GitHub Issue Tracker |
👩‍💻 Usage Questions | GitHub Discussions · Stack Overflow |
💬 General Discussion | GitHub Discussions |
🏭 Contribution & Development | dev-chat channel · Discord |
🌐 Community collaboration session | Discord - Fridays 13 UTC, dev/meet-ups channel |
Our objective is to enhance the interoperability and usability of the AI model ecosystem:
- `skpro` is compatible with scikit-learn and sktime, e.g., an `sktime` proba forecaster can be built with an `skpro` proba regressor, which is an `sklearn` regressor with proba mode added by `skpro`
- `skpro` provides a mini-package management framework for first-party implementations, and for interfacing popular second- and third-party components, such as the cyclic-boosting, MAPIE, or ngboost packages.

`skpro` curates libraries of components of the following types:
Module | Status | Links |
---|---|---|
Probabilistic tabular regression | maturing | Tutorial · API Reference · Extension Template |
Time-to-event (survival) prediction | maturing | Tutorial · API Reference · Extension Template |
Performance metrics | maturing | API Reference |
Probability distributions | maturing | Tutorial · API Reference · Extension Template |
To install `skpro`, use `pip`:

```bash
pip install skpro
```

or, with maximum dependencies,

```bash
pip install skpro[all_extras]
```

Releases are available as source packages and binary wheels. You can see all available wheels here.
```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

from skpro.regression.residual import ResidualDouble

# step 1: data specification
X, y = load_diabetes(return_X_y=True, as_frame=True)
# y_test is kept so that predictions can be evaluated against actuals later
X_train, X_new, y_train, y_test = train_test_split(X, y)

# step 2: specifying the regressor - any compatible regressor is valid!
# example - "squaring residuals" regressor
# random forest for mean prediction
# linear regression for variance prediction
reg_mean = RandomForestRegressor()
reg_resid = LinearRegression()
reg_proba = ResidualDouble(reg_mean, reg_resid)

# step 3: fitting the model to training data
reg_proba.fit(X_train, y_train)

# step 4: predicting labels on new data

# probabilistic prediction modes - pick any or multiple

# full distribution prediction
y_pred_proba = reg_proba.predict_proba(X_new)

# interval prediction
y_pred_interval = reg_proba.predict_interval(X_new, coverage=0.9)

# quantile prediction
y_pred_quantiles = reg_proba.predict_quantiles(X_new, alpha=[0.05, 0.5, 0.95])

# variance prediction
y_pred_var = reg_proba.predict_var(X_new)

# mean prediction is same as "classical" sklearn predict, also available
y_pred_mean = reg_proba.predict(X_new)
```
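The interval and quantile calls in the example line up: a `coverage=0.9` interval and the outer levels of `alpha=[0.05, 0.5, 0.95]` describe the same two tail cut-offs, assuming the interval is the equal-tailed one. The sketch below illustrates that relationship for a single normal predictive distribution using only the standard library. The helper names and the numbers (`mean=150.0`, `var=400.0`) are hypothetical and chosen for illustration; this is not how `skpro` computes its intervals internally.

```python
from statistics import NormalDist


def coverage_to_quantile_levels(coverage):
    """Lower and upper quantile levels of an equal-tailed interval."""
    tail = (1 - coverage) / 2
    return tail, 1 - tail


def normal_interval(mean, var, coverage):
    """Equal-tailed prediction interval of a normal distribution."""
    dist = NormalDist(mu=mean, sigma=var ** 0.5)
    lo_level, hi_level = coverage_to_quantile_levels(coverage)
    return dist.inv_cdf(lo_level), dist.inv_cdf(hi_level)


# a 90% interval uses the 5% and 95% quantiles
levels = coverage_to_quantile_levels(0.9)

# hypothetical predictive mean and variance for one test instance
lower, upper = normal_interval(mean=150.0, var=400.0, coverage=0.9)
```

Under this assumption, the mean and variance returned by `predict` and `predict_var` are enough to reconstruct an interval only if the predictive distribution is normal; for other distribution families the quantiles have to come from the distribution itself.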
```python
# step 5: specifying evaluation metric
from skpro.metrics import CRPS

metric = CRPS()  # continuous ranked probability score - any skpro metric works!

# step 6: evaluate metric, compare predictions to actuals
metric(y_test, y_pred_proba)
# example output: 32.19 (varies with the random train/test split)
```
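For intuition about what the CRPS measures, the following standalone sketch evaluates the well-known closed-form CRPS of a normal predictive distribution against a single observation. It uses only the standard library; `crps_normal` and `mean_crps_normal` are hypothetical helper names, and the `skpro` `CRPS` metric works on distribution objects and may aggregate across rows and columns differently.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


def crps_normal(y, mu, sigma):
    """Closed-form CRPS of a N(mu, sigma^2) prediction for the observation y."""
    z = (y - mu) / sigma
    return sigma * (
        z * (2 * _STD_NORMAL.cdf(z) - 1)
        + 2 * _STD_NORMAL.pdf(z)
        - 1 / math.sqrt(math.pi)
    )


def mean_crps_normal(y_true, mus, sigmas):
    """Average CRPS over several observation/prediction pairs."""
    scores = [crps_normal(y, m, s) for y, m, s in zip(y_true, mus, sigmas)]
    return sum(scores) / len(scores)


# a sharp, well-centred prediction scores lower (better) than a vague one
sharp = crps_normal(y=10.0, mu=10.0, sigma=1.0)
vague = crps_normal(y=10.0, mu=10.0, sigma=5.0)
```

The CRPS is in the units of the target and reduces to the absolute error as the predictive distribution collapses to a point, which makes it directly comparable to a mean absolute error of point predictions.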
There are many ways to get involved with development of `skpro`, which is developed by the `sktime` community. We follow the all-contributors specification: all kinds of contributions are welcome - not just code.
Documentation | |
---|---|
💝 Contribute | How to contribute to skpro. |
🎒 Mentoring | New to open source? Apply to our mentoring program! |
📅 Meetings | Join our discussions, tutorials, workshops, and sprints! |
👩‍🔧 Developer Guides | How to further develop the skpro code base. |
🏅 Contributors | A list of all contributors. |
🙋 Roles | An overview of our core community roles. |
💸 Donate | Fund sktime and skpro maintenance and development. |
🏛️ Governance | How and by whom decisions are made in the sktime community. |
To cite `skpro` in a scientific publication, see citations.