
Nat Commun. 2023 Apr 12;14(1):1777.
doi: 10.1038/s41467-023-37236-y.

Combining data and theory for derivable scientific discovery with AI-Descartes

Cristina Cornelio et al. Nat Commun. 2023.

Abstract

Scientists aim to discover meaningful formulae that accurately describe experimental data. Mathematical models of natural phenomena can be manually created from domain knowledge and fitted to data, or, in contrast, created automatically from large datasets with machine-learning algorithms. The problem of incorporating prior knowledge expressed as constraints on the functional form of a learned model has been studied before, while finding models that are consistent with prior knowledge expressed via general logical axioms is an open problem. We develop a method to enable principled derivations of models of natural phenomena from axiomatic knowledge and experimental data by combining logical reasoning with symbolic regression. We demonstrate these concepts for Kepler's third law of planetary motion, Einstein's relativistic time-dilation law, and Langmuir's theory of adsorption. We show we can discover governing laws from few data points when logical reasoning is used to distinguish between candidate formulae having similar error on the data.

© 2023. The Author(s).

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Visualization of relevant sets and their distances.
The numerical data, background theory, and a discovered model are depicted for Kepler’s third law of planetary motion giving the orbital period of a planet in the solar system. The data consists of measurements (m1, m2, d, p) of the mass of the sun m1, the orbital period p and mass m2 for each planet, and its distance d from the sun. The background theory amounts to Newton’s laws of motion, i.e., the formulae for centrifugal force, gravitational force, and equilibrium conditions. The 4-tuples (m1, m2, d, p) are projected into (m1 + m2, d, p). The blue manifold represents solutions of f_B, which is the function derivable from the background-theory axioms that represents the variable of interest. The gray manifold represents solutions of the discovered model f. The double arrows indicate the distances β(f) and ε(f).
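As a sketch of how f_B might follow from the three ingredients the caption lists, the derivation below assumes circular orbits about the common center of mass. The symbols d_1, d_2 (distances of the sun and the planet from the center of mass), ω (angular velocity), and G (gravitational constant) are introduced here for illustration and do not appear in the caption.

```latex
\begin{align}
  m_1 d_1 &= m_2 d_2, \quad d = d_1 + d_2
    \;\Longrightarrow\; d_2 = \frac{m_1\, d}{m_1 + m_2}
    && \text{(center of mass)} \\
  F_c &= m_2\, d_2\, \omega^2
    && \text{(centrifugal force on the planet)} \\
  F_g &= \frac{G\, m_1 m_2}{d^2}
    && \text{(gravitational force)} \\
  F_c = F_g
    &\;\Longrightarrow\; \omega^2 = \frac{G\, m_1}{d_2\, d^2} = \frac{G\,(m_1 + m_2)}{d^3}
    && \text{(equilibrium)} \\
  p = \frac{2\pi}{\omega}
    &= 2\pi \sqrt{\frac{d^3}{G\,(m_1 + m_2)}}
    && \text{(orbital period)}
\end{align}
```

Under these assumptions the masses enter only through the sum m_1 + m_2, which is consistent with projecting the 4-tuples onto (m1 + m2, d, p).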
Fig. 2. An interpretation of the scientific method as implemented by our system.
The colors match the respective components of the system in Fig. 3.
Fig. 3. System overview.
Colored components correspond to our system, and gray components indicate standard techniques for scientific discovery (human-driven or artificial) that have not been integrated into the current system. The colors match the respective components of the discovery cycle of Fig. 2. The present system uses symbolic regression to generate hypotheses from data. These hypotheses are posed as conjectures to an automated deductive reasoning system, which either proves or disproves them based on the background theory or provides reasoning-based quality measures.
Fig. 4. Depiction of symbolic models for Kepler’s third law of planetary motion giving the orbital period of a planet in the solar system.
The models produced by our SR system are represented by points (ε, β), where ε represents distance to data, and β represents distance to background theory. Both distances are computed with an appropriate norm on the scaled data.
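A minimal sketch of how a point (ε, β) could be computed for a candidate model is shown below. It assumes a relative l2 norm evaluated at the data points and no scaling; the caption only says "an appropriate norm on the scaled data", so this choice is illustrative. The measurements are synthetic (periods generated from f_B with made-up noise), and `f_no_planet_mass` is a hypothetical candidate, not one of the paper's models.

```python
import math

G = 6.674e-11  # gravitational constant in SI units

def f_B(m1, m2, d):
    """Orbital period (s) from Kepler's third law in its Newtonian form."""
    return 2 * math.pi * math.sqrt(d**3 / (G * (m1 + m2)))

def f_no_planet_mass(m1, m2, d):
    """Hypothetical candidate that ignores the planet's mass m2."""
    return 2 * math.pi * math.sqrt(d**3 / (G * m1))

def relative_l2(values, reference):
    """Relative l2 distance ||values - reference|| / ||reference||."""
    num = math.sqrt(sum((v - r) ** 2 for v, r in zip(values, reference)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    return num / den

def epsilon_beta(f, data):
    """Return (epsilon, beta) for model f on 4-tuples (m1, m2, d, p)."""
    inputs = [(m1, m2, d) for m1, m2, d, _ in data]
    measured = [p for _, _, _, p in data]
    model = [f(*x) for x in inputs]
    theory = [f_B(*x) for x in inputs]
    # epsilon: distance to data; beta: distance to the background theory
    return relative_l2(model, measured), relative_l2(model, theory)

# Synthetic "measurements": approximate planet masses (kg) and distances (m),
# with periods generated from f_B and perturbed by fixed, made-up relative noise.
M_SUN = 1.989e30
bodies = [(3.30e23, 5.79e10), (5.97e24, 1.496e11), (1.898e27, 7.785e11)]
noise = [0.001, -0.002, 0.0015]
data = [(M_SUN, m2, d, f_B(M_SUN, m2, d) * (1 + n))
        for (m2, d), n in zip(bodies, noise)]

for name, f in [("f_B", f_B), ("no planet mass", f_no_planet_mass)]:
    eps, beta = epsilon_beta(f, data)
    print(f"{name}: epsilon={eps:.2e}, beta={beta:.2e}")
```

In this toy setting both candidates have a small ε, so the data alone barely separate them, while β is exactly zero only for the formula that coincides with f_B.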
Fig. 5. Symbolic regression solutions to two adsorption datasets.
Fig. 5a refers to the methane adsorption on mica at a temperature of 90 K, while Fig. 5b refers to the isobutane adsorption on silicalite at a temperature of 277 K. f2 and g2 are equivalent to the single-site Langmuir equation; g5 and g7 are equivalent to the two-site Langmuir equation.
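To illustrate what "equivalent to the Langmuir equation" can mean, the sketch below writes the standard single-site and two-site Langmuir isotherms and checks numerically that a rational expression p / (a + b·p) is a reparametrization of the single-site form. The function `rational_form` and all parameter values are hypothetical; the actual expressions f2, g2, g5, and g7 are not given in this record.

```python
import math

def langmuir_single(p, q_max, K):
    """Single-site Langmuir isotherm: loading at pressure p."""
    return q_max * K * p / (1 + K * p)

def langmuir_two_site(p, q1, K1, q2, K2):
    """Two-site Langmuir isotherm: sum of two single-site terms."""
    return langmuir_single(p, q1, K1) + langmuir_single(p, q2, K2)

def rational_form(p, a, b):
    """Hypothetical symbolic-regression output of the form p / (a + b*p)."""
    return p / (a + b * p)

# Illustrative parameter values (assumptions, not fitted to any dataset).
q_max, K = 2.0, 0.5

# Dividing numerator and denominator of the Langmuir form by q_max*K gives
# p / (1/(q_max*K) + p/q_max), so a = 1/(q_max*K) and b = 1/q_max.
a, b = 1 / (q_max * K), 1 / q_max

pressures = [0.5, 1.0, 10.0]
same = all(
    math.isclose(rational_form(p, a, b), langmuir_single(p, q_max, K))
    for p in pressures
)
print("rational form matches single-site Langmuir:", same)
```

The same idea extends to the two-site case, where a sum of two such rational terms corresponds to `langmuir_two_site`.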
