# SciMLBenchmarks.jl
SciMLBenchmarks.jl holds webpages, PDFs, and notebooks showing the benchmarks for the SciML Scientific Machine Learning Software ecosystem, including:
- Benchmarks of equation solver implementations
- Speed and robustness comparisons of methods for parameter estimation / inverse problems
- Training universal differential equations (and subsets like neural ODEs)
- Training of physics-informed neural networks (PINNs)
- Surrogate comparisons, including radial basis functions, neural operators (DeepONets, Fourier Neural Operators), and more
The SciML Bench suite is made to be a comprehensive open-source benchmark, built from the ground up to cover the methods of computational science and scientific computing all the way to AI for science.
These benchmarks are meant to represent good, optimized coding style. Benchmarks are preferred to be run on the provided open benchmarking hardware for full reproducibility (though in some cases, such as with language barriers, this can be difficult). Each benchmark is documented with the compute devices used along with the package versions necessary for reproduction. These benchmarks attempt to measure work-precision efficiency, either by timing runs at approximately matching errors or by building work-precision diagrams for direct comparison of speed at given error tolerances.
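As a rough illustration of how such a work-precision diagram can be built, here is a minimal sketch using DiffEqDevTools.jl's `WorkPrecisionSet`; the Lorenz problem, tolerance sweep, and solver choices below are illustrative placeholders, not the setup of any particular benchmark.

```julia
using OrdinaryDiffEq, DiffEqDevTools, Plots

# A small test problem (arbitrary choice for illustration): the Lorenz system
function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end
prob = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 10.0))

# Reference solution at very tight tolerances, used to measure the error of the others
test_sol = TestSolution(solve(prob, Vern9(), abstol = 1e-14, reltol = 1e-14))

# Sweep a range of tolerances for each solver and record (error, time) pairs
abstols = 1.0 ./ 10.0 .^ (6:10)
reltols = 1.0 ./ 10.0 .^ (3:7)
setups = [Dict(:alg => Tsit5()), Dict(:alg => Vern7()), Dict(:alg => DP5())]
wp = WorkPrecisionSet(prob, abstols, reltols, setups;
                      appxsol = test_sol, save_everystep = false, numruns = 10)
plot(wp)
```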
If any of the code from any of the languages can be improved, please open a pull request.
For critiques of benchmarks, please open a pull request that changes the code in the desired manner. Issues with recommended changes are generally vague and not actionable, while pull requests with code changes are exact. Thus, if there is something you think should be changed in the code, please make the recommended change in the code!
To view the results of the SciML Benchmarks, go to [benchmarks.sciml.ai](https://benchmarks.sciml.ai). By default, this will lead to the latest tagged version of the benchmarks. To see the in-development version of the benchmarks, go to [https://benchmarks.sciml.ai/dev/](https://benchmarks.sciml.ai/dev/).
Static outputs in PDF, markdown, and HTML reside in SciMLBenchmarksOutput.
To cite the SciML Benchmarks, please cite the following:
```
@article{rackauckas2019confederated,
  title={Confederated modular differential equation APIs for accelerated algorithm development and benchmarking},
  author={Rackauckas, Christopher and Nie, Qing},
  journal={Advances in Engineering Software},
  volume={132},
  pages={1--6},
  year={2019},
  publisher={Elsevier}
}

@article{DifferentialEquations.jl-2017,
  author = {Rackauckas, Christopher and Nie, Qing},
  doi = {10.5334/jors.151},
  journal = {The Journal of Open Research Software},
  keywords = {Applied Mathematics},
  note = {Exported from https://app.dimensions.ai on 2019/05/05},
  number = {1},
  pages = {},
  title = {DifferentialEquations.jl – A Performant and Feature-Rich Ecosystem for Solving Differential Equations in Julia},
  url = {https://app.dimensions.ai/details/publication/pub.1085583166 and http://openresearchsoftware.metajnl.com/articles/10.5334/jors.151/galley/245/download/},
  volume = {5},
  year = {2017}
}
```
The following is a quick summary of the benchmarks. These paint broad strokes over the set of tested equations, and some specific examples may differ.
### Non-Stiff ODEs

- OrdinaryDiffEq.jl's methods are the most efficient by a good amount.
- The `Vern` methods tend to do the best in every benchmark of this category.
- At lower tolerances, `Tsit5` does well consistently.
- ARKODE and Hairer's `dopri5`/`dop853` perform very similarly, but are both far less efficient than the `Vern` methods.
- The multistep methods, `CVODE_Adams` and `lsoda`, tend to not do very well.
- The ODEInterface multistep method `ddeabm` does not do as well as the other multistep methods.
- ODE.jl's methods are not able to consistently solve the problems.
- Fixed time step methods are less efficient than the adaptive methods.
### Stiff ODEs

- In this category, the best methods are much more problem dependent.
- For smaller problems:
  - `Rosenbrock23`, `lsoda`, and `TRBDF2` tend to be the most efficient at high tolerances.
  - `Rodas4P` and `Rodas5P` tend to be the most efficient at low tolerances.
- For larger problems (Filament PDE):
  - `FBDF` and `QNDF` do the best at all normal tolerances.
  - The ESDIRK methods like `TRBDF2` and `KenCarp4` can come close.
- `radau` is always the most efficient when tolerances go to the low extreme (`1e-13`).
- Fixed time step methods tend to diverge on every tested problem because the high stiffness results in divergence of the Newton solvers.
- ARKODE is very inconsistent and requires a lot of tweaking in order to not diverge on many of the tested problems. When it doesn't diverge, the similar algorithms in OrdinaryDiffEq.jl (`KenCarp4`) are much more efficient in most cases.
- GeometricIntegrators.jl fails to converge on any of the tested problems.
### Dynamical ODEs (2nd Order ODEs)

- Higher order (generally order >= 6) symplectic integrators are much more efficient than the lower order counterparts.
- For high accuracy, using a symplectic integrator is not preferred. Their extra cost is not necessary, since the other integrators are able to not drift simply due to having low enough error.
- In this class, the `DPRKN` methods are by far the most efficient. The `Vern` methods do well for not being specific to the domain.
### Non-Stiff SDEs

- For simple 1-dimensional SDEs at low accuracy, the `EM` and `RKMil` methods can do well. Beyond that, they are simply outclassed.
- The `SRA` and `SRI` methods are both very similar within-class on the simple SDEs.
- `SRA3` is the most efficient when applicable and the tolerances are low.
- Generally, only low accuracy is necessary to get to the sampling error of the mean.
- The adaptive method is very conservative with error estimates.
### Stiff SDEs

- The high order adaptive methods (`SRIW1`) generally do well on stiff problems.
- The "standard" low-order implicit methods, `ImplicitEM` and `ImplicitRK`, do not do well on all stiff problems. Some exceptions apply to well-behaved problems like the Stochastic Heat Equation.
- The efficiency ranking tends to match the ODE tests, but the cutoff from low to high tolerance is lower.
### Non-Stiff DDEs

- `Tsit5` does well in a large class of problems here.
- The `Vern` methods do well in low tolerance cases.
### Stiff DDEs

- The Rosenbrock methods, specifically `Rodas5P`, perform well.
### Parameter Estimation

- Broadly, two different approaches have been used: Bayesian inference and optimization algorithms.
- In general, the optimization algorithms appear to be more accurate, but that can be attributed to the larger number of data points used in the optimization cases. The Bayesian approach tends to be the slower of the two, and hence fewer data points are used; its accuracy can increase if enough data is used.
- Among the available optimization algorithms, BBO from the BlackBoxOptim package and `GN_CRS2_LM` from the NLopt package perform the best in the global case, while `LD_SLSQP`, `LN_BOBYQA`, and `LN_NELDERMEAD` from NLopt perform the best in the local case.
- Another algorithm in use is QuadDIRECT; it gives very good results on the shorter problems but does not do very well on the longer problems.
- The choice of global versus local optimization makes a huge difference in the timings. BBO tends to find the correct solution in a global optimization setup. For local optimization, most methods in NLopt, like `:LN_BOBYQA`, solve the problem very fast but require a good initial condition. (A minimal sketch of the optimization-based workflow is shown after this list.)
- The different backend options available for the Bayesian method offer tradeoffs between time, accuracy, and control. Sufficiently high accuracy can be achieved with any of the backends by fine-tuning the step size, the constraints on the parameters, the tightness of the priors, and the number of iterations.
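As a rough sketch of the optimization-based workflow described above (not the exact setup of any benchmark), ODE parameters can be fit by minimizing an L2 loss against data with a global optimizer such as BBO from BlackBoxOptim.jl. The Lotka-Volterra model, synthetic data, search ranges, and iteration counts below are illustrative assumptions.

```julia
using OrdinaryDiffEq, BlackBoxOptim

# Lotka-Volterra model with two unknown parameters p = [a, b] (illustrative choice)
function lotka!(du, u, p, t)
    du[1] = p[1] * u[1] - u[1] * u[2]
    du[2] = -p[2] * u[2] + u[1] * u[2]
end
prob = ODEProblem(lotka!, [1.0, 1.0], (0.0, 10.0), [1.5, 3.0])

# Synthetic "data" from the true parameters stands in for measurements
t = 0.0:0.5:10.0
data = Array(solve(prob, Tsit5(), saveat = t))

# L2 loss: re-solve with candidate parameters and compare against the data
function loss(p)
    sol = solve(remake(prob, p = p), Tsit5(), saveat = t, verbose = false)
    # Penalize candidates for which the solve fails or terminates early
    length(sol.t) == length(t) || return Inf
    sum(abs2, Array(sol) .- data)
end

# Global optimization with BlackBoxOptim's default adaptive differential evolution (BBO)
res = bboptimize(loss; SearchRange = (0.0, 5.0), NumDimensions = 2, MaxSteps = 20_000)
best_candidate(res)  # should land near the true parameters [1.5, 3.0]
```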
To generate the interactive notebooks, first install the SciMLBenchmarks, instantiate the environment, and then run `SciMLBenchmarks.open_notebooks()`. This looks as follows:
```julia
]add SciMLBenchmarks#master
]activate SciMLBenchmarks
]instantiate
using SciMLBenchmarks
SciMLBenchmarks.open_notebooks()
```
The benchmarks will be generated at your `pwd()` in a folder called `generated_notebooks`.
Note that when running the benchmarks, the packages are not automatically added. Thus, you will need to add the packages manually or use the internal Project/Manifest TOMLs to instantiate the correct packages. This can be done by activating the folder of the benchmarks. For example,
```julia
using Pkg
Pkg.activate(joinpath(pkgdir(SciMLBenchmarks), "benchmarks", "NonStiffODE"))
Pkg.instantiate()
```
will add all of the packages required to run any benchmark in the `NonStiffODE` folder.
All of the files are generated from the Weave.jl files in the `benchmarks` folder of the SciMLBenchmarks.jl repository. The generation process runs automatically, and thus one does not necessarily need to test the Weave process locally. Instead, simply open a PR that adds/updates a file in the `benchmarks` folder, and the PR will generate the benchmark on demand. Its artifacts can then be inspected in the Buildkite as described below before merging. Note that it will use the Project.toml and Manifest.toml of the subfolder, so any changes to dependencies require that those are updated.
Report any bugs or issues at the SciMLBenchmarks repository.
To see benchmark results before merging, click into the Buildkite, click onto Artifacts, and then investigate the generated results.
All of the files are generated from the Weave.jl files in the `benchmarks` folder. To run the generation process, do, for example:
```julia
]activate SciMLBenchmarks # Get all of the packages
using SciMLBenchmarks
SciMLBenchmarks.weave_file(joinpath(pkgdir(SciMLBenchmarks), "benchmarks", "NonStiffODE"), "linear_wpd.jmd")
```
To generate all of the files in a folder, for example, run:
```julia
SciMLBenchmarks.weave_folder(joinpath(pkgdir(SciMLBenchmarks), "benchmarks", "NonStiffODE"))
```
To generate all of the notebooks, do:
```julia
SciMLBenchmarks.weave_all()
```
Each of the benchmarks displays the computer characteristics at the bottom of the benchmark. Since performance-necessary computations are normally performed on compute clusters, the official benchmarks use a workstation with an AMD EPYC 7502 32-Core Processor @ 2.50GHz to match the performance characteristics of a standard node in a high performance computing (HPC) cluster or cloud computing setup.
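Roughly speaking, the recorded computer characteristics correspond to what the following simplified, illustrative snippet prints for the active environment (this is not the exact appendix code used by the build pipeline):

```julia
using InteractiveUtils, Pkg

versioninfo()  # Julia version, OS, CPU model, number of threads
Pkg.status()   # exact package versions from the active Project/Manifest
```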
For almost all equations, there is no analytical solution. A low tolerance reference solution is required in order to compute the error. However, there are many questions as to the potential of biasing the results via a reference computed from a given program. If we use a reference solution from Julia, does that make our errors lower?
The answer is no, because all of the equation solvers should be convergent to the same solution. Because of this, it does not matter which solver is used to generate the reference solution. However, caution is required to ensure that the reference solution is sufficiently accurate.
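As a rough sketch of what this looks like in practice (an arbitrary toy problem and illustrative solver and tolerance choices, not a specific benchmark), the reference is simply a solve at far tighter tolerances than anything being compared:

```julia
using OrdinaryDiffEq, LinearAlgebra

# Illustrative toy problem with no simple analytical solution
prob = ODEProblem((du, u, p, t) -> (du .= -u .+ sin(t)), [1.0, 2.0], (0.0, 10.0))

# Reference solution computed at far tighter tolerances than any benchmarked setting
ref = solve(prob, Vern9(), abstol = 1e-14, reltol = 1e-14)

# Error of a candidate method at a benchmark tolerance, measured at the final time
sol = solve(prob, Tsit5(), abstol = 1e-7, reltol = 1e-4)
err = norm(sol.u[end] .- ref.u[end])
```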
Thankfully, there is a very clear indicator of when a reference solution is not sufficiently accurate. Because all of the other methods will be converging to a different solution, there will be a digit of accuracy at which all of the other solutions stop converging to the reference. If this occurs, all of the solutions give a straight line in the work-precision diagram.

In the TransistorAmplifierDAE benchmark, for example, the second `Rodas5P` and `Rodas4` come from a different problem implementation, and they hit lower errors. But all of the others use the same reference solution and seem to "hit a wall" at around 1e-5. This is because the chosen reference solution was only 1e-5 accurate. Changing to a different reference solution makes them all converge.
This shows that all that truly matters is that the chosen reference is sufficiently accurate, and any walling behavior is an indicator that some method in the benchmark set is more accurate than the reference (in which case the benchmark should be updated to use the more accurate reference).