A Python Package for Adaptive Spatio-Temporal Exploratory Model (AdaSTEM)
chenyangkang/stemflow
```bash
pip install stemflow
```
To install the latest beta version from GitHub:
```bash
pip install stemflow@git+https://github.com/chenyangkang/stemflow.git
```
Or using conda:
```bash
conda install -c conda-forge stemflow
```
stemflow is a toolkit for the Adaptive Spatio-Temporal Exploratory Model (AdaSTEM [1,2]) in Python. A typical use case is daily abundance estimation using eBird citizen science data (survey data).
stemflow adopts a "split-apply-combine" philosophy. It:
- Splits input data using Quadtree or Sphere Quadtree.
- Trains each spatiotemporal split (called stixel) separately.
- Aggregates the ensemble to make the prediction.
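The splitting step can be illustrated with a minimal, pure-Python sketch of adaptive quadtree gridding. This is not stemflow's implementation: the function `quadtree_split` and its parameters `upper`, `lower`, and `min_points` are hypothetical names chosen to loosely mirror the threshold options shown later in this README, and the real algorithm additionally applies jitter and rotation per ensemble and works on time slices.

```python
import random

def quadtree_split(points, x0, y0, size, upper=25.0, lower=5.0, min_points=50):
    """Recursively split a square cell; return leaves as (x0, y0, size, points)."""
    half = size / 2.0
    # Force a split while the cell is wider than `upper`.
    must_split = size > upper
    # Otherwise split only if the cell is crowded and its children
    # would still be at least `lower` wide.
    may_split = len(points) > min_points and half >= lower
    if not (must_split or may_split):
        return [(x0, y0, size, points)]
    leaves = []
    for dx in (0, 1):
        for dy in (0, 1):
            cx, cy = x0 + dx * half, y0 + dy * half
            # Half-open intervals so each point lands in exactly one child.
            inside = [(x, y) for x, y in points
                      if cx <= x < cx + half and cy <= y < cy + half]
            leaves.extend(quadtree_split(inside, cx, cy, half, upper, lower, min_points))
    return leaves

# Synthetic data: one dense cluster plus sparse background points.
rng = random.Random(42)
dense = [(rng.random() * 10, rng.random() * 10) for _ in range(500)]
sparse = [(rng.random() * 100, rng.random() * 100) for _ in range(50)]
points = dense + sparse

leaves = quadtree_split(points, 0.0, 0.0, 100.0)
sizes = sorted({size for _, _, size, _ in leaves})
print(sizes)  # cell widths present in the final grid
```

In this sketch, cells around the dense cluster end up finer than cells covering the sparse background, which is the behavior the gridding plot later in this README visualizes.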
The framework leverages the "adjacency" information of surroundings in space and time to model/predict the values of target spatiotemporal points. This framework ameliorates the long-distance/long-range prediction problem [3], and has a good spatiotemporal smoothing effect.
For more information, please see an introduction to stemflow and learning curve analysis.
| Main functionality of stemflow | Supported indexing | Supported tasks |
|---|---|---|
| ✅ Spatiotemporal modeling & prediction | ✅ User-defined 2D spatial indexing (CRS) | ✅ Binary classification task |
| ✅ Calculate overall feature importances | ✅ 3D spherical indexing | ✅ Regression task |
| ✅ Plot spatiotemporal dynamics | ✅ User-defined temporal indexing | ✅ Hurdle task (two-step regression: classify, then regress the non-zero part) |
| ✅ Spatial-only modeling | | |
| For details see AdaSTEM Demo | For details and tips see Tips for spatiotemporal indexing | For details and tips see Tips for different tasks |
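To make the hurdle task concrete, here is a minimal sketch of the two-step idea (classify zero vs. non-zero, then regress only the non-zero part). It is not stemflow's `Hurdle` class: `ToyHurdle`, `ThresholdClassifier`, and `LeastSquaresRegressor` are hypothetical toy classes written with NumPy only, and they merely imitate the sklearn-style `fit`/`predict` convention.

```python
import numpy as np

class ThresholdClassifier:
    """Toy classifier: predicts 1 when the first feature exceeds a learned cut."""
    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        # Cut at the midpoint between the two class means of the first feature.
        self.cut_ = (X[y == 1, 0].mean() + X[y == 0, 0].mean()) / 2.0
        return self

    def predict(self, X):
        return (np.asarray(X, dtype=float)[:, 0] > self.cut_).astype(int)

class LeastSquaresRegressor:
    """Toy regressor: ordinary least squares with an intercept."""
    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
        self.coef_, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
        return self

    def predict(self, X):
        A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
        return A @ self.coef_

class ToyHurdle:
    """Two-step model: classify presence, then regress the non-zero part."""
    def __init__(self, classifier, regressor):
        self.classifier = classifier
        self.regressor = regressor

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
        nonzero = y > 0
        self.classifier.fit(X, nonzero.astype(int))   # step 1: zero vs. non-zero
        self.regressor.fit(X[nonzero], y[nonzero])    # step 2: non-zero rows only
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        present = self.classifier.predict(X).astype(bool)
        out = np.zeros(len(X))
        if present.any():
            out[present] = self.regressor.predict(X[present])
        return out

# Zero-inflated toy target: zero below x = 5, linear above.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.where(X[:, 0] >= 5, 2.0 * X[:, 0], 0.0)

hurdle = ToyHurdle(ThresholdClassifier(), LeastSquaresRegressor()).fit(X, y)
pred = hurdle.predict(X)
print(pred)
```

Because the regressor never sees the zero rows, it is not pulled toward zero by the excess of absences, which is the usual motivation for a hurdle design on count data.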
| Supported data types | Supported base models |
|---|---|
| ✅ Both continuous and categorical features (prefer one-hot encoding) | ✅ sklearn-style BaseEstimator classes (you can make your own base model), for example here |
| ✅ Both static (e.g., yearly mean temperature) and dynamic features (e.g., daily temperature) | ✅ sklearn-style Maxent model. Example here. |
| For details and tips see Tips for data types | For details see Base model choices |
Use a Hurdle model as the base model of AdaSTEMRegressor:
```python
from stemflow.model.AdaSTEM import AdaSTEM, AdaSTEMClassifier, AdaSTEMRegressor
from stemflow.model.Hurdle import Hurdle
from xgboost import XGBClassifier, XGBRegressor

## "hurdle in Ada"
model = AdaSTEMRegressor(
    base_model=Hurdle(
        classifier=XGBClassifier(tree_method='hist', random_state=42, verbosity=0, n_jobs=1),
        regressor=XGBRegressor(tree_method='hist', random_state=42, verbosity=0, n_jobs=1)
    ),                                # hurdle model for zero-inflated problems (e.g., counts)
    save_gridding_plot=True,
    ensemble_fold=50,                 # data are modeled 50 times, each time with jitter and rotation in the Quadtree algorithm
    min_ensemble_required=30,         # Only points covered by > 30 ensembles will be predicted
    grid_len_upper_threshold=25,      # force splitting if the grid length exceeds 25
    grid_len_lower_threshold=5,       # stop splitting if the grid length falls below 5
    temporal_start=1,                 # The next 4 params define the temporal sliding window
    temporal_end=366,
    temporal_step=20,                 # The window takes steps of 20 DOY (see AdaSTEM demo for details)
    temporal_bin_interval=50,         # Each window will contain data of 50 DOY
    points_lower_threshold=50,        # Only stixels with more than 50 samples are trained and used for prediction
    Spatio1='longitude',              # The next three params define the names of the
    Spatio2='latitude',               # spatial and temporal coordinates in the dataframe
    Temporal1='DOY',
    use_temporal_to_train=True,       # In each stixel, whether 'DOY' should be a predictor
    n_jobs=1,
    random_state=42
)
```
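The four temporal parameters describe a sliding window over day of year (DOY). The sketch below shows one plausible reading of them, assuming half-open windows `[start, start + bin_interval)` that advance by `step`; the helper names `sliding_windows` and `windows_covering` are hypothetical, and stemflow's exact boundary handling may differ.

```python
def sliding_windows(start, end, step, bin_interval):
    """Return (window_start, window_end) pairs; window_end is exclusive."""
    windows = []
    t = start
    while t <= end:
        windows.append((t, t + bin_interval))
        t += step
    return windows

def windows_covering(day, windows):
    """Return the windows whose half-open interval contains `day`."""
    return [(lo, hi) for lo, hi in windows if lo <= day < hi]

# Values taken from the configuration above.
windows = sliding_windows(start=1, end=366, step=20, bin_interval=50)
print(len(windows), windows[:3])
print(windows_covering(100, windows))
```

Because the step (20) is smaller than the window width (50), consecutive windows overlap, so a given day falls into more than one window within a single ensemble.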
Fitting and prediction methods follow the style of the sklearn BaseEstimator class:
```python
import numpy as np

## fit
model = model.fit(X_train.reset_index(drop=True), y_train)

## predict
pred = model.predict(X_test)
pred = np.where(pred < 0, 0, pred)  # clip negative predictions to zero
eval_metrics = AdaSTEM.eval_STEM_res('hurdle', y_test, pred)
print(eval_metrics)
```
Here, pred is the mean of the predicted values across ensembles.
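The averaging across ensembles, together with a minimum-coverage rule like `min_ensemble_required`, can be sketched with NumPy as follows. This is an illustration only: `aggregate_ensembles` is a hypothetical helper, the NaN-for-uncovered convention is an assumption, and the sketch uses `>=` for the coverage check, whereas the comment in the configuration above words the rule as "> 30".

```python
import numpy as np

def aggregate_ensembles(preds, min_ensemble_required):
    """Average per-ensemble predictions; NaN marks 'point not covered'.

    preds has shape (n_ensembles, n_points).
    """
    preds = np.asarray(preds, dtype=float)
    covered = np.sum(~np.isnan(preds), axis=0)   # ensembles covering each point
    sums = np.nansum(preds, axis=0)              # NaNs are treated as 0 in the sum
    mean = np.full(preds.shape[1], np.nan)
    ok = covered >= min_ensemble_required        # assumption: inclusive threshold
    mean[ok] = sums[ok] / covered[ok]
    return mean

# Three ensembles, three points; the second point is covered only once.
ensemble_preds = [
    [1.0, np.nan, 3.0],
    [3.0, np.nan, 5.0],
    [5.0, 2.0, np.nan],
]
agg = aggregate_ensembles(ensemble_preds, min_ensemble_required=2)
print(agg)
```

Points that too few ensembles cover stay NaN instead of receiving an average based on very little evidence.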
See AdaSTEM demo for further functionality.
In addition, stemflow supports lazy loading and database queries to reduce memory load during parallel computing.
```python
model.gridding_plot  # Here, the model is an AdaSTEM class, not a Hurdle class
```
Here, each color shows an ensemble generated during model fitting. In each of the 10 ensembles, regions (in terms of space and time) with more training samples were gridded at finer resolution, while sparse regions remained coarse. Prediction results were aggregated across the ensembles (that is, in this example, data were modeled 10 times).
If you use the SphereAdaSTEM module, the gridding plot is a plotly-generated interactive object by default:
See SphereAdaSTEM demo and Interactive spherical gridding plot.
Daily Abundance Map of Barn Swallow
See the AdaSTEM demo section for how to generate this GIF.
Chen et al., (2024). stemflow: A Python Package for Adaptive Spatio-Temporal Exploratory Model. Journal of Open Source Software, 9(94), 6158, https://doi.org/10.21105/joss.06158
```bibtex
@article{Chen2024,
  doi = {10.21105/joss.06158},
  url = {https://doi.org/10.21105/joss.06158},
  year = {2024},
  publisher = {The Open Journal},
  volume = {9},
  number = {94},
  pages = {6158},
  author = {Yangkang Chen and Zhongru Gu and Xiangjiang Zhan},
  title = {stemflow: A Python Package for Adaptive Spatio-Temporal Exploratory Model},
  journal = {Journal of Open Source Software}
}
```
We welcome pull requests. Contributors should follow the contributor guidelines.
Application-level cooperation is also welcome. We recognize that stemflow may consume substantial computational resources, especially as data volumes grow in the future. We always welcome research collaboration of all kinds.
References:



