A tool for creating Quantitative Structure Property/Activity Relationship (QSPR/QSAR) models.
CDDLeiden/QSPRpred
QSPRpred is an open-source software library for building Quantitative Structure Property/Activity Relationship (QSPR/QSAR) models, developed by Gerard van Westen's Computational Drug Discovery group. It provides a unified interface for building QSPR models based on different types of descriptors and machine learning algorithms. We developed this package to support our research, recognizing the need to reduce repetition in our model building workflow and to improve the reproducibility and reusability of our models. In making this package available here, we hope that it may be of use to other researchers as well. QSPRpred is still in active development, and we welcome contributions and feedback from the community.
QSPRpred is designed to be modular and extensible, so that new functionality can be easily added. A command line interface is available for basic use cases, to quickly explore varying scenarios. For more advanced use cases, the Python API offers extra flexibility and control, allowing more complex workflows and additional features.
Internally, QSPRpred relies heavily on the RDKit and scikit-learn libraries. Furthermore, for scikit-learn model saving and loading, QSPRpred uses ml2json for safer and interpretable model serialization. QSPRpred is also interoperable with Papyrus, a large scale curated dataset aimed at bioactivity predictions, for data collection. Models developed with QSPRpred are compatible with the group's de novo drug design package DrugEx.
QSPRpred can be installed with pip like so (requires Python >= 3.10):
pip install qsprpred
Note that this will install the basic dependencies, but not the optional dependencies. If you want to use the optional dependencies, you can install the package with an option:
pip install qsprpred[<option>]
The following options are available:
- extra : include extra dependencies for PCM models and extra descriptor sets from packages other than RDKit
- deep : include deep learning models (torch and chemprop)
- chemprop: include the ChemProp integration (only ChemProp versions < 2.0.0 supported at the moment)
- full : include all optional dependencies (requires cupy: run pip install cupy-cudaX, replacing X with your CUDA version)
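To see which of the optional packages named above are present in the current environment, a small standard-library check can help. This is only a sketch: it tests for the top-level module names torch, chemprop and cupy, and the helper names python_ok and installed_modules are hypothetical and not part of QSPRpred.

```python
import sys
from importlib.util import find_spec

# Top-level module names of the optional packages mentioned in the list above.
OPTIONAL_MODULES = ("torch", "chemprop", "cupy")


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the minimum version (3.10 by default)."""
    return tuple(version_info[:2]) >= minimum


def installed_modules(names=OPTIONAL_MODULES):
    """Map each top-level module name to True if it can be found, without importing it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    for name, present in installed_modules().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

A module reported as missing only means the matching option was not installed; the basic installation does not need it.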
If you plan to use QSPRpred to calculate protein descriptors for PCM, make sure to also install Clustal Omega. You can get it via conda (for Linux and macOS only):
conda install -c bioconda clustalo
or install MAFFT instead:
conda install -c biocore mafft
This is needed to provide multiple sequence alignments for the PCM descriptors. If Windows is your platform of choice, these tools will need to be installed manually or a custom implementation of the MSAProvider class will have to be made.
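Before computing PCM descriptors, it can be useful to confirm that one of the alignment tools is reachable from the environment. The sketch below assumes the executables are named clustalo and mafft (matching the conda package names above) and looks them up on PATH. The helper names find_msa_tools and first_available are hypothetical and not part of QSPRpred.

```python
import shutil

# Assumed executable names for Clustal Omega and MAFFT.
MSA_TOOLS = ("clustalo", "mafft")


def find_msa_tools(names=MSA_TOOLS, which=shutil.which):
    """Map each tool name to its resolved path on PATH, or None if it is not found."""
    return {name: which(name) for name in names}


def first_available(found):
    """Return the name of the first tool that has a resolved path, or None."""
    for name, path in found.items():
        if path is not None:
            return name
    return None


if __name__ == "__main__":
    found = find_msa_tools()
    tool = first_available(found)
    if tool is None:
        print("No alignment tool found on PATH; install Clustal Omega or MAFFT.")
    else:
        print(f"Using {tool} at {found[tool]}")
```

If neither tool is found, multiple sequence alignments cannot be generated until one is installed or a custom MSAProvider is supplied.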
After installation, you will have access to various command line features and you can use the Python API directly (see Documentation). For a quick start, you can also check out the Jupyter notebook tutorials, which document the use of the Python API to build different types of models. The tutorials and the documentation are still a work in progress, and we welcome contributions wherever they are lacking.
Contributions and issue reports are more than welcome. Pull requests can be made directly to the main branch and we will transfer them to contrib when scheduled for the next release.