Penalized Regression for Inference of Polygenic Risk Scores
shz9/penprs
PenPRS is a Python package that provides sparse, accurate penalized regression models for inferring Polygenic Risk Scores (PRS) from GWAS summary statistics and Linkage Disequilibrium (LD) reference data. The models currently available are:
- LASSO
- SSL (Spike-and-slab LASSO PRS)
- SSLAlpha (Alpha-heritability modeled SSL)
PenPRS integrates with magenpy for data harmonization and for loading and computing LD matrices. For details on model methodology and specifications, see the paper:
Song, J., Zabad, S., Yang, A., Gravel, S., & Li, Y. (2025). Sparse polygenic risk score inference with the spike-and-slab LASSO. Bioinformatics. https://doi.org/10.1093/bioinformatics/btaf578
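For intuition about what a summary-statistics LASSO does, here is a minimal sketch of coordinate descent on the objective ½ βᵀRβ − βᵀr + λ‖β‖₁, where R is an LD (correlation) matrix and r holds standardized marginal effects. This is a generic textbook formulation and not PenPRS's implementation: the function names, the toy inputs, and the scale of λ are all assumptions made for illustration, so the values of λ used here are not comparable to the `lam` argument of the `Lasso` class.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_sumstats(ld, marginal_beta, lam, n_iter=100, tol=1e-8):
    """Coordinate descent for 0.5 * b'Rb - b'r + lam * |b|_1 (illustrative only)."""
    p = len(marginal_beta)
    beta = [0.0] * p
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Marginal effect of SNP j after removing what correlated SNPs explain
            partial = marginal_beta[j] - sum(
                ld[j][k] * beta[k] for k in range(p) if k != j
            )
            updated = soft_threshold(partial, lam) / ld[j][j]
            max_change = max(max_change, abs(updated - beta[j]))
            beta[j] = updated
        if max_change < tol:
            break
    return beta


# Toy inputs (hypothetical): two independent SNPs, then two correlated SNPs
independent = lasso_sumstats([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.05], lam=0.1)
correlated = lasso_sumstats([[1.0, 0.5], [0.5, 1.0]], [0.5, 0.25], lam=0.0)
print(independent, correlated)
```

In the correlated toy case the second SNP's marginal effect is fully explained by LD with the first, so its joint estimate shrinks to zero even without a penalty.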
To install the package from GitHub, use the following command:

    pip install git+https://github.com/shz9/penprs.git
    import magenpy as mgp
    from penprs.model.Lasso import Lasso
    from penprs.model.SSL import SSL
    from penprs.model.SSLAlpha import SSLAlpha

    # Load a genotype reference and GWAS summary statistics data (chromosome 22)
    gdl = mgp.GWADataLoader(mgp.tgp_eur_data_path(),
                            sumstats_files=mgp.ukb_height_sumstats_path(),
                            sumstats_format="fastgwa")

    # Compute block LD matrix:
    ld_block_url = "https://bitbucket.org/nygcresearch/ldetect-data/raw/ac125e47bf7ff3e90be31f278a7b6a61daaba0dc/EUR/fourier_ls-all.bed"
    gdl.compute_ld('block',
                   ld_blocks_file=ld_block_url,
                   dtype='int16',
                   compute_spectral_properties=True,
                   output_dir='temp/block_ld/')

    # Initialize respective models
    lasso_model = Lasso(gdl, lam=100)
    ssl_model = SSL(gdl, l0=750, l1=10)

    # Perform single fit using model initialized arguments
    lasso_model.fit()  # fit using lam=100
    ssl_model.fit()  # fit using l0=750, l1=10
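The SSL model above takes two penalty arguments, `l0` and `l1`. As a rough illustration of how two penalties can interact, the sketch below computes the adaptive penalty from the general spike-and-slab LASSO formulation: a mixture of a sharp "spike" Laplace prior and a diffuse "slab" Laplace prior. It assumes that `l0` plays the role of the spike penalty and `l1` the slab penalty, and it introduces a hypothetical mixing weight `theta`; none of this is confirmed by the README, and the scaling PenPRS applies internally may differ.

```python
import math


def slab_probability(beta, l0, l1, theta):
    """Posterior probability that beta was drawn from the slab (assumes l0 > l1)."""
    # Log odds of spike vs. slab for Laplace densities (lam / 2) * exp(-lam * |beta|)
    log_odds_spike = (
        math.log(1.0 - theta) - math.log(theta)
        + math.log(l0) - math.log(l1)
        - (l0 - l1) * abs(beta)
    )
    if log_odds_spike > 700.0:  # guard against overflow in exp()
        return 0.0
    return 1.0 / (1.0 + math.exp(log_odds_spike))


def adaptive_penalty(beta, l0, l1, theta):
    """Effective L1 penalty at beta: a blend of l1 (slab) and l0 (spike)."""
    p_slab = slab_probability(beta, l0, l1, theta)
    return l1 * p_slab + l0 * (1.0 - p_slab)


# Hypothetical values: theta = 0.5 is chosen only for illustration
penalty_at_zero = adaptive_penalty(0.0, l0=750, l1=10, theta=0.5)
penalty_at_one = adaptive_penalty(1.0, l0=750, l1=10, theta=0.5)
print(penalty_at_zero, penalty_at_one)
```

Under these assumptions, effects near zero face a penalty close to `l0` and are pushed to exactly zero, while larger effects face a penalty close to `l1` and are shrunk only lightly.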
    # Viewing the effect size estimates (BETA) for LASSO
    lasso_model.to_table().head()
| CHR | SNP | POS | A1 | A2 | BETA |
|---|---|---|---|---|---|
| 22 | rs131538 | 16871137 | A | G | 0.001439 |
| 22 | rs9605903 | 17054720 | C | T | -0.000000 |
| 22 | rs5746647 | 17057138 | G | T | -0.001534 |
| 22 | rs16980739 | 17058616 | T | C | -0.001342 |
| 22 | rs9605923 | 17065079 | A | T | 0.006215 |
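Effect sizes like these are typically turned into a score by summing allele dosages weighted by BETA. The sketch below does this by hand for the five SNPs shown in the table. The dosages are made up, and treating A1 as the counted (effect) allele is an assumption; check the package and magenpy documentation for the actual convention and for built-in scoring utilities.

```python
# BETA values copied from the table above
betas = {
    "rs131538": 0.001439,
    "rs9605903": -0.000000,
    "rs5746647": -0.001534,
    "rs16980739": -0.001342,
    "rs9605923": 0.006215,
}


def polygenic_score(dosages, betas):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the assumed effect allele)."""
    return sum(beta * dosages.get(snp, 0.0) for snp, beta in betas.items())


# Hypothetical individual: dosages are invented for illustration
dosages = {"rs131538": 2, "rs5746647": 1, "rs9605923": 1}
score = polygenic_score(dosages, betas)

# Sparsity: SNPs whose effect was shrunk to exactly zero contribute nothing
n_nonzero = sum(1 for beta in betas.values() if beta != 0.0)
print(score, n_nonzero)
```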
    # To solve in a path-wise or warm-start fashion
    # save_intermediate=True to save effect size estimate per ladder step
    lasso_model_pw = Lasso(gdl)
    ssl_model_ws = SSL(gdl)

    # Performs path-wise fit across the default 20 step ladder
    lasso_model_pw.pathwise_fit(save_intermediate=True)

    # Performs warm-start fit across the default 20 step ladder
    ssl_model_ws.warm_start_fit(save_intermediate=True)
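As a rough picture of what a path-wise fit over a ladder can look like, the sketch below solves a toy summary-statistics LASSO over 20 decreasing penalty values, starting each step from the previous solution. The geometric ladder, the `min_ratio` parameter, the toy LD matrix, and all function names are assumptions for illustration; the ladder PenPRS uses by default and the exact difference between its `pathwise_fit` and `warm_start_fit` are not described here.

```python
def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def coordinate_descent(ld, r, lam, beta_init, n_iter=200, tol=1e-10):
    """Minimize 0.5 * b'Rb - b'r + lam * |b|_1, starting from beta_init."""
    beta = list(beta_init)
    p = len(r)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            partial = r[j] - sum(ld[j][k] * beta[k] for k in range(p) if k != j)
            updated = soft_threshold(partial, lam) / ld[j][j]
            max_change = max(max_change, abs(updated - beta[j]))
            beta[j] = updated
        if max_change < tol:
            break
    return beta


def pathwise_fit(ld, r, steps=20, min_ratio=0.01):
    """Fit over a decreasing geometric ladder, keeping every intermediate solution."""
    lam_max = max(abs(x) for x in r)  # smallest penalty that zeroes every effect
    ratio = min_ratio ** (1.0 / (steps - 1))
    beta = [0.0] * len(r)
    path = []
    for i in range(steps):
        lam = lam_max * ratio ** i
        beta = coordinate_descent(ld, r, lam, beta)  # reuse the previous solution
        path.append((lam, list(beta)))
    return path


# Toy inputs (hypothetical): three SNPs in partial LD
ld = [[1.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 1.0]]
r = [0.5, 0.25, -0.1]
path = pathwise_fit(ld, r)
for lam, beta in path[::5]:
    print(round(lam, 4), [round(b, 4) for b in beta])
```

Starting each step from the previous solution usually means only a few coordinate sweeps are needed per ladder step, which is the main appeal of solving along a path.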
You can fit the SSL model with Grid-Search (GS) from the command line using penprs_fit:
    # -m sets the model; options: LASSO, SSL, SSLAlpha
    # --hyp-search sets the hyperparameter search; options: GS (Grid Search), WS (Warm Start)
    penprs_fit -l path_to_train_ld \
               -s path_to_train_sumstats \
               --output-dir path_to_output_dir \
               -m SSL \
               --hyp-search GS \
               --validation-ld-panel path_to_validation_ld \
               --validation-sumstats path_to_validation_sumstats \
               --use-symmetric-ld \
               --grid-metric pseudo_validation \
               --threads 4
For more information on the command line arguments, see penprs_fit --help
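If you want to launch the same command from a Python script, one option is to assemble the argument list and hand it to `subprocess`. The helper below is a hypothetical convenience wrapper, not part of PenPRS; it only uses the flags shown in the example above, and the placeholder paths are carried over unchanged.

```python
import shlex


def build_penprs_fit_command(train_ld, train_sumstats, output_dir,
                             validation_ld, validation_sumstats,
                             model="SSL", hyp_search="GS", threads=4):
    """Assemble the penprs_fit argument list from the example above."""
    if model not in {"LASSO", "SSL", "SSLAlpha"}:
        raise ValueError(f"Unknown model: {model}")
    if hyp_search not in {"GS", "WS"}:
        raise ValueError(f"Unknown hyperparameter search: {hyp_search}")
    return [
        "penprs_fit",
        "-l", train_ld,
        "-s", train_sumstats,
        "--output-dir", output_dir,
        "-m", model,
        "--hyp-search", hyp_search,
        "--validation-ld-panel", validation_ld,
        "--validation-sumstats", validation_sumstats,
        "--use-symmetric-ld",
        "--grid-metric", "pseudo_validation",
        "--threads", str(threads),
    ]


command = build_penprs_fit_command(
    "path_to_train_ld", "path_to_train_sumstats", "path_to_output_dir",
    "path_to_validation_ld", "path_to_validation_sumstats",
)
print(shlex.join(command))  # inspect the command before running it

# To actually run it (requires penprs_fit on your PATH):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.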
If you use this package in your research, please cite the following paper:
    @article{10.1093/bioinformatics/btaf578,
        author = {Song, Junyi and Zabad, Shadi and Yang, Archer and Gravel, Simon and Li, Yue},
        title = {Sparse Polygenic Risk Score Inference with the Spike-and-Slab LASSO},
        journal = {Bioinformatics},
        pages = {btaf578},
        year = {2025},
        month = {10},
        doi = {10.1093/bioinformatics/btaf578},
        url = {https://doi.org/10.1093/bioinformatics/btaf578},
        eprint = {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btaf578/64738987/btaf578.pdf},
    }