Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
biocommons/hgvs
hgvs - manipulate biological sequence variants according to Human Genome Variation Society recommendations
Important: biocommons packages require Python 3.10+.
The hgvs package provides a Python library to parse, format, validate, normalize, and map sequence variants according to Variation Nomenclature (aka Human Genome Variation Society) recommendations.
Specifically, the hgvs package focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. The package does not attempt to cover the full scope of HGVS recommendations. Please refer to issues for limitations.
- Parsing is based on a formal grammar.
- An easy-to-use object model that represents most variant types (SNVs, indels, dups, inversions, etc.) and concepts (intronic offsets, uncertain positions, intervals)
- A variant normalizer that rewrites variants in canonical forms and substitutes reference sequences (if reference and transcript sequences differ)
- Formatters that generate HGVS strings from internal representations
- Tools to map variants between genome, transcript, and protein sequences
- Reliable handling of regions with genome-transcript discrepancies
- Pluggable data providers support alternative sources of transcript mapping data
- Extensive automated tests, including those for all variant types and "problematic" transcripts
- Easily installed using remote data sources. Installation with local data sources is straightforward and completely obviates network access
- You are encouraged to browse issues. All known issues are listed there. Please report any issues you find.
- Use a pip package specification to stay within minor releases. For example, `hgvs>=1.5,<1.6`. hgvs uses Semantic Versioning.
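As a rough illustration of which releases a pin such as `hgvs>=1.5,<1.6` admits, the sketch below checks version strings against the 1.5 series. The helper name `within_minor_series` is hypothetical and not part of hgvs, and the check is deliberately simplified: it compares only the leading `major.minor` numbers and ignores the pre-release ordering rules that pip applies.

```python
import re


def within_minor_series(version: str, major: int, minor: int) -> bool:
    """Return True if `version` begins with `major.minor` (simplified check)."""
    # Hypothetical helper for illustration; not part of the hgvs package.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) == (major, minor)


if __name__ == "__main__":
    # Patch releases in the 1.5 series pass; other minor or major series do not.
    for candidate in ("1.4.9", "1.5.0", "1.5.4", "1.6.0", "2.0.0"):
        print(candidate, within_minor_series(candidate, 1, 5))
```

Under Semantic Versioning, staying within one minor series means receiving only patch-level fixes, which is the intent of the pin above.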
Important: For more detailed installation and configuration instructions, see the HGVS readthedocs.
- libpq
- python3
- postgresql
Examples for installation:
macOS:
brew install libpq
brew install python3
brew install postgresql@14
Ubuntu:
sudo apt install gcc libpq-dev python3-dev
By default, hgvs uses remote data sources, which makes installation easy. If you would like to use local instances of the data sources, see the readthedocs.
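A minimal sketch of what using the default remote data sources can look like, assuming that hgvs is installed and that the public UTA instance is reachable over the network. The function name `map_c_to_g` and its default arguments are illustrative choices, not part of the package; the hgvs imports are deferred so the file can be loaded without the package present.

```python
def map_c_to_g(hgvs_c_string, assembly_name="GRCh37"):
    """Map a transcript (c.) variant to a genomic (g.) variant.

    Illustrative helper, not part of hgvs. Requires the hgvs package and
    network access to the default remote data sources.
    """
    # Imports are deferred so this module loads even without hgvs installed.
    import hgvs.assemblymapper
    import hgvs.dataproviders.uta
    import hgvs.parser

    # With no arguments, connect() uses the package's default data source.
    hdp = hgvs.dataproviders.uta.connect()
    mapper = hgvs.assemblymapper.AssemblyMapper(
        hdp, assembly_name=assembly_name, alt_aln_method="splign"
    )
    var_c = hgvs.parser.Parser().parse_hgvs_variant(hgvs_c_string)
    return mapper.c_to_g(var_c)


if __name__ == "__main__":
    # Example input chosen for illustration only.
    print(map_c_to_g("NM_001637.3:c.1582G>A"))
```

The same call should work unchanged against local data sources once those are configured as described in the readthedocs.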
Create a virtual environment using your preferred method.
Example:
python3 -m venv venv
Run the following commands in your virtual environment:
source venv/bin/activate
pip install --upgrade setuptools
pip install hgvs
See Installation instructions for details, including instructions for installing Universal Transcript Archive (UTA) and SeqRepo locally.
See examples and readthedocs for usage.
The hgvs package is a community effort. Please see Contributing to get started in submitting source code, tests, or documentation. Thanks for getting involved!
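For orientation, here is a small sketch of parsing and re-formatting a variant, assuming hgvs is installed. Parsing is purely syntactic, so no data provider or network access should be needed for this step. The wrapper name `parse_variant` and the example variant string are illustrative and not taken from this README.

```python
def parse_variant(hgvs_string):
    """Parse an HGVS string into a variant object using the hgvs package."""
    # Deferred import so this module loads even without hgvs installed.
    import hgvs.parser

    parser = hgvs.parser.Parser()
    return parser.parse_hgvs_variant(hgvs_string)


if __name__ == "__main__":
    var = parse_variant("NM_001637.3:c.1582G>A")
    # The object model exposes the accession, variant type, and position/edit.
    print(var.ac, var.type, var.posedit)
    # str() formats the internal representation back into an HGVS string.
    print(str(var))
```

In real code you would normally construct the parser once and reuse it, since building it for every call is wasteful.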
Existing tests use a cache that is committed with the repo, so running them does not require external network access. Developing new tests requires loading the cache; to do that, install UTA and SeqRepo (and the REST service) locally:
docker compose --project-name biocommons -f $PWD/misc/docker-compose.yml up
IMPORTANT: Loading the test caches is currently hampered by #551, #760, and #761. To load reliably, use `make test-relearn-iteratively` for now.
Other packages that manipulate HGVS variants: