
Quantitative structure–activity relationship

From Wikipedia, the free encyclopedia

Predictive chemical model

Quantitative structure–activity relationship models (QSAR models) are regression or classification models used in the chemical and biological sciences and engineering. Like other regression models, QSAR regression models relate a set of "predictor" variables (X) to the potency of the response variable (Y), while classification QSAR models relate the predictor variables to a categorical value of the response variable.
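
The classification case can be illustrated with a minimal sketch, assuming a nearest-neighbour rule (one of many possible classifiers). The descriptor values, the labels, and the function name `nearest_neighbor_label` are hypothetical and serve only to show how predictor variables map to a categorical response.

```python
import math

def nearest_neighbor_label(query, training_set):
    """Return the label of the training compound closest to `query`
    in descriptor space (Euclidean distance)."""
    _, label = min(training_set, key=lambda item: math.dist(item[0], query))
    return label

# Hypothetical compounds: two descriptors, already scaled to [0, 1],
# paired with a categorical response.
training = [
    ((0.1, 0.2), "inactive"),
    ((0.2, 0.1), "inactive"),
    ((0.8, 0.9), "active"),
    ((0.9, 0.7), "active"),
]

print(nearest_neighbor_label((0.75, 0.8), training))
print(nearest_neighbor_label((0.15, 0.15), training))
```

A regression QSAR would return a numeric potency instead of a label; the predictor variables play the same role in both cases.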

In QSAR modeling, the predictors consist of physico-chemical properties or theoretical molecular descriptors^[1]^[2] of chemicals; the QSAR response variable could be a biological activity of the chemicals. QSAR models first summarize a supposed relationship between chemical structures and biological activity in a data set of chemicals. Second, QSAR models predict the activities of new chemicals.^[3]^[4]

Related terms include quantitative structure–property relationships (QSPR) when a chemical property is modeled as the response variable.^[5]^[6] "Different properties or behaviors of chemical molecules have been investigated in the field of QSPR. Some examples are quantitative structure–reactivity relationships (QSRRs), quantitative structure–chromatography relationships (QSCRs) and, quantitative structure–toxicity relationships (QSTRs), quantitative structure–electrochemistry relationships (QSERs), and quantitative structure–biodegradability relationships (QSBRs)."^[7]

As an example, biological activity can be expressed quantitatively as the concentration of a substance required to give a certain biological response. Additionally, when physicochemical properties or structures are expressed by numbers, one can find a mathematical relationship, or quantitative structure-activity relationship, between the two. The mathematical expression, if carefully validated,^[8]^[9]^[10]^[11] can then be used to predict the modeled response of other chemical structures.^[12]
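
As a small illustration of expressing activity numerically, the sketch below converts a concentration into a logarithmic potency value. It assumes the common pIC50 convention (the negative base-10 logarithm of the half-maximal inhibitory concentration in mol/L), which the text above does not name; the input values and the function name are hypothetical.

```python
import math

def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)

# Hypothetical IC50 values in nM; a lower concentration means higher potency.
pic50_values = [pic50_from_ic50_nm(v) for v in (1.0, 100.0, 10000.0)]
print(pic50_values)
```

On this scale a more potent compound gets a larger number, which is often more convenient as a regression response than the raw concentration.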

A QSAR has the form of a mathematical model:

Activity = f(physicochemical properties and/or structural properties) + error

The error includes model error (bias) and observational variability, that is, the variability in observations even on a correct model.
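
The general form can be made concrete with a minimal sketch, assuming that f is a straight line in a single descriptor and is fitted by ordinary least squares. The descriptor (`logp`), the activity values, and the function name are hypothetical; real QSAR models typically use many descriptors and other model families.

```python
from statistics import mean

def fit_linear_qsar(descriptor, activity):
    """Ordinary least squares for: activity = slope * descriptor + intercept + error."""
    x_bar, y_bar = mean(descriptor), mean(activity)
    sxx = sum((x - x_bar) ** 2 for x in descriptor)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(descriptor, activity))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data set: one descriptor value and one measured activity per compound.
logp = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 2.9, 4.2, 4.8, 6.0]

slope, intercept = fit_linear_qsar(logp, activity)

# The residuals are the "error" term: whatever f does not explain.
residuals = [y - (slope * x + intercept) for x, y in zip(logp, activity)]
print(slope, intercept, residuals)
```

The residuals mix both error sources named above: a wrong functional form (bias) and scatter in the measurements themselves. This fit alone cannot tell the two apart.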

No.	Name	Algorithms	External link
1	R	RF, SVM, Naïve Bayes, and ANN	"R: The R Project for Statistical Computing".
2	libSVM	SVM	"LIBSVM -- A Library for Support Vector Machines".
3	Orange	RF, SVM, and Naïve Bayes	"Orange Data Mining".
4	RapidMiner	SVM, RF, Naïve Bayes, DT, ANN, and k-NN	"RapidMiner | #1 Open Source Predictive Analytics Platform".
5	Weka	RF, SVM, and Naïve Bayes	"Weka 3 - Data Mining with Open Source Machine Learning Software in Java". Archived from the original on 2011-10-28. Retrieved 2016-03-24.
6	KNIME	DT, Naïve Bayes, and SVM	"KNIME | Open for Innovation".
7	AZOrange^[58]	RT, SVM, ANN, and RF	"AZCompTox/AZOrange: AstraZeneca add-ons to Orange". GitHub. 2018-09-19.
8	Tanagra	SVM, RF, Naïve Bayes, and DT	"TANAGRA - A free DATA MINING software for teaching and research". Archived from the original on 2017-12-19. Retrieved 2016-03-24.
9	ELKI	k-NN	"ELKI Data Mining Framework". Archived from the original on 2016-11-19.
10	MALLET		"MALLET homepage".
11	MOA		"MOA Massive Online Analysis | Real Time Analytics for Data Streams". Archived from the original on 2017-06-19.
12	DeepChem	Logistic Regression, Naïve Bayes, RF, ANN, and others	"DeepChem". deepchem.io. Retrieved 20 October 2017.
13	alvaModel^[59]	Regression (OLS, PLS, k-NN, SVM, and Consensus) and Classification (LDA/QDA, PLS-DA, k-NN, SVM, and Consensus)	"alvaModel: a software tool to create QSAR/QSPR models". alvascience.com.
14	scikit-learn (Python)^[60]	Logistic Regression, Naïve Bayes, k-NN, RF, SVM, GP, ANN, and others	"SciKit-Learn". scikit-learn.org. Retrieved 13 August 2023.
15	Scikit-Mol^[61]	Integration of scikit-learn models and RDKit featurization	scikit-mol on pypi.org
16	scikit-fingerprints^[62]	Molecular fingerprints, API compatible with scikit-learn models	"scikit-fingerprints". GitHub. Retrieved 29 December 2024.
17	DTC Lab Tools	Multiple Linear Regression, Partial Least Squares, Applicability Domain, Validation, and others	"DTCLab Tools". Retrieved 12 May 2025.
18	DTC Lab Supplementary Tools	Quantitative Read-across, q-RASAR, ARKA, Regression and Classification-based ML tools, and others	"DTCLab Supplementary Tools". Retrieved 12 May 2025.


Essential steps in QSAR studies

SAR and the SAR paradox

Types

Fragment based (group contribution)

3D-QSAR

Chemical descriptor based

String based

Graph based

q-RASAR

Modeling

Data mining approach

Matched molecular pair analysis

Evaluation of the quality of QSAR models

Application

Chemical

Biological

Applications

See also

References

Further reading

External links