Python learning to rank (LTR) toolkit
bazvalya/pyltr
pyltr is a Python learning-to-rank toolkit with ranking models, evaluation metrics, data wrangling helpers, and more.
This software is licensed under the BSD 3-clause license (see `LICENSE.txt`).
The author may be contacted at ma127jerry <@t> gmail with general feedback, questions, or bug reports.
Import pyltr:

    import pyltr

Import a LETOR dataset (e.g. MQ2007):

    with open('train.txt') as trainfile, \
            open('vali.txt') as valifile, \
            open('test.txt') as evalfile:
        TX, Ty, Tqids, _ = pyltr.data.letor.read_dataset(trainfile)
        VX, Vy, Vqids, _ = pyltr.data.letor.read_dataset(valifile)
        EX, Ey, Eqids, _ = pyltr.data.letor.read_dataset(evalfile)
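For readers unfamiliar with the input files, the following is a minimal sketch of how one line of a LETOR-style file can be split into a label, a query id, and feature values. The function name `parse_letor_line` and the sample line are hypothetical and purely illustrative; this is not pyltr's loader, which should be used for real data.

```python
def parse_letor_line(line):
    """Parse one LETOR-style line into (label, qid, features, comment).

    Hypothetical helper for illustration only; not part of pyltr.
    """
    # Everything after '#' is a free-form comment (often a document id).
    body, _, comment = line.partition('#')
    tokens = body.split()
    # The first token is the relevance label.
    label = int(tokens[0])
    # The second token looks like 'qid:10'.
    qid = tokens[1].split(':', 1)[1]
    # Remaining tokens are 'feature_index:value' pairs.
    features = {}
    for token in tokens[2:]:
        index, value = token.split(':', 1)
        features[int(index)] = float(value)
    return label, qid, features, comment.strip()


# Made-up sample line in the LETOR layout.
sample = "2 qid:10 1:0.031 2:0.666 3:0.5 # docid = GX029"
label, qid, features, comment = parse_letor_line(sample)
print(label, qid, features, comment)
```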
Train a LambdaMART model, using the validation set for early stopping and trimming:

    metric = pyltr.metrics.NDCG(k=10)

    # Only needed if you want to perform validation (early stopping & trimming)
    monitor = pyltr.models.monitors.ValidationMonitor(
        VX, Vy, Vqids, metric=metric, stop_after=250)

    model = pyltr.models.LambdaMART(
        metric=metric,
        n_estimators=1000,
        learning_rate=0.02,
        max_features=0.5,
        query_subsample=0.5,
        max_leaf_nodes=10,
        min_samples_leaf=64,
        verbose=1,
    )

    model.fit(TX, Ty, Tqids, monitor=monitor)
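The general idea behind early stopping and trimming can be sketched in a few lines: track the best validation score seen so far, stop once it has not improved for `stop_after` iterations, and keep only the estimators up to the best iteration. The function `best_iteration` and the scores below are hypothetical; this illustrates the concept only and makes no claim about how `ValidationMonitor` is implemented.

```python
def best_iteration(scores, stop_after):
    """Return (best_index, stopped_at) for per-iteration validation scores.

    Hypothetical illustration of early stopping; not pyltr code.
    """
    best_index = 0
    for i, score in enumerate(scores):
        if score > scores[best_index]:
            # New best validation score: remember where it happened.
            best_index = i
        elif i - best_index >= stop_after:
            # No improvement for `stop_after` iterations: stop here.
            return best_index, i
    # Ran through every iteration without triggering the stop rule.
    return best_index, len(scores) - 1


# Made-up validation scores, one per boosting iteration.
scores = [0.40, 0.42, 0.45, 0.44, 0.43, 0.44, 0.46]
best, stopped = best_iteration(scores, stop_after=3)

# "Trimming" then means keeping only the first best + 1 estimators.
n_kept = best + 1
print(best, stopped, n_kept)
```

Note that in this made-up run the stop rule fires before the later, higher score at the last iteration is reached, which is the usual trade-off of a small `stop_after` value.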
Evaluate the model on test data:

    Epred = model.predict(EX)
    print('Random ranking:', metric.calc_mean_random(Eqids, Ey))
    print('Our model:', metric.calc_mean(Eqids, Ey, Epred))
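To make the reported numbers easier to interpret, here is a minimal pure-Python sketch of NDCG@k for a single query, using the common log2 discount with either a pow2 or an identity gain. The function names and the handling of queries with no relevant documents (scored as 0.0 here) are assumptions of this sketch; pyltr's own metric classes may differ in such details.

```python
import math


def dcg_at_k(relevances, k, gain='pow2'):
    """DCG over the first k relevance labels, in ranked order."""
    total = 0.0
    for rank, rel in enumerate(relevances[:k]):
        # pow2 gain: 2^rel - 1; identity gain: rel itself.
        g = 2 ** rel - 1 if gain == 'pow2' else rel
        # rank is 0-based, so the discount is log2(position + 1).
        total += g / math.log2(rank + 2)
    return total


def ndcg_at_k(labels, scores, k, gain='pow2'):
    """NDCG@k for one query: DCG of the predicted order over the ideal DCG."""
    # Order the true labels by descending predicted score.
    ranked = [l for _, l in sorted(zip(scores, labels), key=lambda p: -p[0])]
    ideal = sorted(labels, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k, gain)
    if ideal_dcg == 0:
        # Assumption of this sketch: no relevant documents scores 0.0.
        return 0.0
    return dcg_at_k(ranked, k, gain) / ideal_dcg


perfect = ndcg_at_k([2, 0, 1], [0.9, 0.2, 0.5], k=10)
swapped = ndcg_at_k([0, 1], [0.9, 0.1], k=10)
print(perfect, swapped)
```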
Below are some of the features currently implemented in pyltr.
- LambdaMART (`pyltr.models.LambdaMART`)
  - Validation & early stopping
  - Query subsampling
- (N)DCG (`pyltr.metrics.DCG`, `pyltr.metrics.NDCG`)
  - pow2 and identity gain functions
- ERR (`pyltr.metrics.ERR`)
  - pow2 and identity gain functions
- (M)AP (`pyltr.metrics.AP`)
- Kendall's Tau (`pyltr.metrics.KendallTau`)
- AUC-ROC -- Area under the ROC curve (`pyltr.metrics.AUCROC`)
- Data loaders (e.g. `pyltr.data.letor.read`)
- Query groupers and validators (`pyltr.util.group.check_qids`, `pyltr.util.group.get_groups`)
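As a rough illustration of what grouping and validating query ids involves, the sketch below walks a list of query ids, yields one `(qid, start, end)` span per contiguous run, and rejects input where a query id reappears after its run has ended. The names `iter_query_groups` and `check_contiguous` are hypothetical, and the behavior shown is an assumption of this sketch rather than a description of the helpers in `pyltr.util.group`.

```python
def iter_query_groups(qids):
    """Yield (qid, start, end) for each contiguous run of equal query ids.

    `end` is exclusive, so rows start..end-1 belong to the query.
    """
    start = 0
    for i in range(1, len(qids) + 1):
        # Close the current run at the end of input or when the qid changes.
        if i == len(qids) or qids[i] != qids[start]:
            yield qids[start], start, i
            start = i


def check_contiguous(qids):
    """Raise ValueError if any query id appears in more than one run."""
    seen = set()
    for qid, _, _ in iter_query_groups(qids):
        if qid in seen:
            raise ValueError('rows for query %r are not contiguous' % (qid,))
        seen.add(qid)


groups = list(iter_query_groups(['a', 'a', 'b', 'c', 'c', 'c']))
print(groups)
```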
Use the `run_tests.sh` script to run all unit tests.
`cd` into the `docs/` directory and run `make html`. Docs are generated in the `docs/_build` directory.
Quality contributions or bugfixes are gratefully accepted. When submitting a pull request, please update `AUTHOR.txt` so you can be recognized for your work :).
By submitting a GitHub pull request, you consent to have your submitted code released under the terms of the project's license (see `LICENSE.txt`).