Jonathan Vandermause (1), Steven Torrisi (1), Simon Batzner (2), Alexie Kolpak (2), Boris Kozinsky (1)

(1) Harvard University, Cambridge, Massachusetts, United States
(2) Massachusetts Institute of Technology, Cambridge, Massachusetts, United States

Ab initio molecular dynamics (MD) is a powerful tool for accurately probing the dynamics of molecules and solids, but it is limited to system sizes on the order of 1000 atoms and time scales on the order of 10 ps. We present a scheme for rapidly training a machine learning (ML) model of the interatomic force field that approaches the accuracy of ab initio force calculations but can be applied to larger systems over longer time scales. Gaussian process (GP) models are trained on the fly, with density-functional theory (DFT) calculations of the atomic forces performed whenever the model encounters atomic configurations sufficiently far outside the training set. This active learning scheme includes a principled means of deciding when to run additional DFT calculations, which accelerates the model's exploration of parameter space while reducing the time spent on training. Furthermore, we demonstrate that additional ML models can be trained in parallel to predict other quantities of interest, including the ground-state energy and charge density, so that ML can efficiently capture the wealth of information provided by full DFT calculations. We demonstrate the flexibility of our approach by testing it on a range of single- and multi-component molecular and solid-state systems, including benzene, silicon, and silicon carbide.
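
The on-the-fly training loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single one-dimensional coordinate, a squared-exponential kernel with fixed hyperparameters, and a hypothetical reference_force function standing in for a DFT force calculation. The rule used here for requesting new reference data (GP predictive standard deviation above a fixed threshold) is likewise an assumption, since the abstract does not state the criterion.

```python
import numpy as np

LENGTH_SCALE = 0.5   # assumed kernel length scale (arbitrary units)
NOISE_VAR = 1e-4     # assumed noise variance, also stabilizes the solve


def rbf_kernel(a, b):
    """Squared-exponential kernel with unit signal variance."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / LENGTH_SCALE) ** 2)


def gp_predict(x_train, y_train, x_query):
    """Return the GP posterior mean and standard deviation at x_query."""
    if len(x_train) == 0:
        # No data yet: fall back to the prior (zero mean, unit variance).
        return np.zeros(len(x_query)), np.ones(len(x_query))
    K = rbf_kernel(x_train, x_train) + NOISE_VAR * np.eye(len(x_train))
    k_star = rbf_kernel(x_train, x_query)
    mean = k_star.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def reference_force(x):
    """Hypothetical stand-in for an expensive DFT force calculation."""
    return -np.sin(x)


def run_on_the_fly(positions, threshold=0.05):
    """Predict forces along a trajectory, calling the reference on demand."""
    x_train = np.empty(0)
    y_train = np.empty(0)
    predicted = []
    n_calls = 0
    for x in positions:
        query = np.array([x])
        mean, std = gp_predict(x_train, y_train, query)
        if std[0] > threshold:
            # The model is too uncertain here: run the reference
            # calculation, add it to the training set, and predict again.
            x_train = np.append(x_train, x)
            y_train = np.append(y_train, reference_force(x))
            n_calls += 1
            mean, std = gp_predict(x_train, y_train, query)
        predicted.append(mean[0])
    return np.array(predicted), n_calls


if __name__ == "__main__":
    trajectory = np.linspace(0.0, 2.0 * np.pi, 200)
    forces, calls = run_on_the_fly(trajectory)
    print(f"reference calls: {calls} of {len(trajectory)} steps")
```

In this toy setting the reference calculation is requested only where the model is uncertain, so most steps are served by the GP alone. A realistic force field would replace the scalar coordinate with descriptors of local atomic environments and would also optimize the kernel hyperparameters.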