Estimate PTM hotspots in protein sequence alignments

evocellnet/ptm_hotspots

Description

Method to infer PTM (currently only phosphorylation) hotspots in conserved protein alignments, based on Strumillo et al. (bioRxiv).

Dependencies

ptm_hotspot requires Python (v3) as well as the following packages:

  • numpy
  • pandas
  • scipy
  • statsmodels

How to predict hotspots

To predict hotspots in all protein domains, run:

python3 ptm_hotspots.py -o predicted_hotspots.csv

To obtain particular domain predictions:

python3 ptm_hotspots.py -o kinase_domain_hotspots.csv -d PF00069

To obtain hotspot residue predictions instead of hotspot ranges:

python3 ptm_hotspots.py -o hotspot_residues.csv --printSites
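When several domains need to be processed one at a time, the commands above can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical helper, and it only uses flags that appear in this README (`-o`, `-d`, `--printSites`). It assumes the script is run from the repository root.

```python
def build_command(out_csv, domain=None, print_sites=False, script="ptm_hotspots.py"):
    """Assemble an argument list using only flags documented in this README.

    build_command is a hypothetical helper, not part of ptm_hotspots.
    """
    cmd = ["python3", script, "-o", out_csv]
    if domain is not None:
        cmd += ["-d", domain]          # restrict the run to one Pfam domain
    if print_sites:
        cmd.append("--printSites")     # residue-level output instead of ranges
    return cmd


# One output file per domain; PF00069 is the example accession used above.
for domain in ["PF00069"]:
    cmd = build_command(f"{domain}_hotspots.csv", domain=domain)
    print(" ".join(cmd))
    # To actually launch the run from the repository root:
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if an output path contains spaces.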

You can obtain more help and options by typing:

python3 ptm_hotspots.py -h

usage: ptm_hotspots.py [-h] [--dir [PATH]] [--ptmfile [PATH]] [-d [PFXXXXX]]
                       [--iter [INTEGER]] [--threshold [FLOAT]]
                       [--foreground [FLOAT]] -o PATH [--printSites]

Estimate PTM hotspots in sequence alignments

optional arguments:
  -h, --help            show this help message and exit
  --dir [PATH]          fasta alignments dir (default: db/alignments)
  --ptmfile [PATH]      file containing PTMs (default: db/all_phosps)
  -d [PFXXXXX], --domain [PFXXXXX]
                        query single domain (i.e. PF00069)
  --iter [INTEGER]      number of permutations (default: 100)
  --threshold [FLOAT]   Corrected p-value threshold (default: 0.01)
  --foreground [FLOAT]  effect-size foreground cutoff (default: 2)
  -o PATH, --out PATH   output csv file
  --printSites          print all residue predictions

Note: Because the Bonferroni correction depends on the total number of predictions, the hotspots reported for a given domain may differ slightly depending on whether you run that domain alone or the full set of domains. Similarly, because the permutation analysis is stochastic, results may vary between runs.
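The effect of the Bonferroni correction can be illustrated with a small sketch. This is the textbook correction (each p-value multiplied by the number of tests, capped at 1), not code taken from ptm_hotspots. The raw p-value and the test counts (50 and 5000) are invented for illustration, and how the tool actually counts its tests is an assumption here.

```python
def bonferroni(p_values):
    """Textbook Bonferroni correction: multiply by the number of tests, cap at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


threshold = 0.01   # default of --threshold in the help text above
raw_p = 0.0001     # invented raw p-value for one candidate hotspot

# Invented test counts: the same candidate scored in a small single-domain
# run and in a much larger all-domains run.
single_domain = [raw_p] + [0.5] * 49
all_domains = [raw_p] + [0.5] * 4999

print(bonferroni(single_domain)[0] < threshold)  # passes in the small run
print(bonferroni(all_domains)[0] < threshold)    # fails in the large run
```

The same raw p-value clears the threshold when corrected over 50 tests but not over 5000, which is why a single-domain run can report hotspots that the full run does not.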

Customizing alignments or PTM data

By default, ptm_hotspot uses a database containing precalculated domain alignments (as described in Strumillo et al.) as well as a collection of phosphorylated residues derived from public high-throughput mass spectrometry experiments. To update the database, please consider the following points:

Alignments

Every alignment file should be in FASTA format, and each header should contain the start and end of the domain in alignment coordinates, separated by ";". For example:

>EDP05298 pep:known supercontig:v3.1:DS4 ;51;337

For full-protein predictions, use the first and last positions of the multiple sequence alignment as the start and end.
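A header in this format can be checked before running the tool. The sketch below is an illustration only: `parse_domain_header` is a hypothetical helper, not the parser used by ptm_hotspots, and it assumes the last two ";"-separated fields of the header are the domain start and end.

```python
def parse_domain_header(header):
    """Split a FASTA header into (sequence id, domain start, domain end).

    Hypothetical helper: assumes the header ends with ";start;end".
    """
    description, start, end = header.strip().rsplit(";", 2)
    seq_id = description.lstrip(">").split()[0]   # first token after ">"
    return seq_id, int(start), int(end)           # int() rejects non-numeric fields


header = ">EDP05298 pep:known supercontig:v3.1:DS4 ;51;337"
print(parse_domain_header(header))
```

A header without the two trailing coordinates raises a `ValueError`, which makes malformed alignment files easy to spot early.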

PTM database

The PTM database should be provided as a CSV file containing the id, amino acid and position of each phosphosite within the protein.
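A custom PTM file can be sanity-checked with the standard `csv` module before it is passed via `--ptmfile`. The sketch below rests on assumptions: the column order (id, amino acid, position), the absence of a header row, and the example rows are all invented for illustration, and `read_ptm_rows` is a hypothetical helper rather than part of ptm_hotspots. Compare against the bundled db/all_phosps file for the layout the tool actually expects.

```python
import csv
import io


def read_ptm_rows(handle):
    """Read (id, amino acid, position) rows; hypothetical layout, no header row."""
    rows = []
    for line_no, row in enumerate(csv.reader(handle), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(row)}")
        protein_id, residue, position = (field.strip() for field in row)
        if len(residue) != 1 or not residue.isalpha():
            raise ValueError(f"line {line_no}: amino acid must be a single letter")
        rows.append((protein_id, residue.upper(), int(position)))
    return rows


# Invented example rows, for illustration only.
example = io.StringIO("EDP05298,S,120\nEDP05298,T,245\n")
print(read_ptm_rows(example))
```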

Citation

  • Strumillo, M. J., Oplova, M., Viéitez, C., Ochoa, D., Shahraz, M., Busby, B. P., et al. Sopko, M., Studer, R. A., Perrimon, N., Panse, V. G., Beltrao, P. (2018). Conserved phosphorylation hotspots in eukaryotic protein domain families. bioRxiv. https://doi.org/10.1101/391185
