QM7-X: A comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules

Hoja, Johannes; Medrano Sandonas, Leonardo; Ernst, Brian; Vazquez-Mayagoitia, Alvaro; DiStasio Jr., Robert A.; Tkatchenko, Alexandre

Organizations

MDF Open

Year

2023

Source Name

hoja_qm7x_comprehensive_molecules

DOI

10.18126/5y39-v72p
Here, we introduce QM7-X, a comprehensive dataset of more than 40 physicochemical properties for ~4.2 million equilibrium and non-equilibrium structures of small organic molecules with up to seven non-hydrogen (C, N, O, S, Cl) atoms. To span this fundamentally important region of chemical compound space (CCS), QM7-X includes an exhaustive sampling of (meta-)stable equilibrium structures, comprising constitutional/structural isomers and stereoisomers, e.g., enantiomers and diastereomers (including cis-trans and conformational isomers), as well as 100 non-equilibrium structural variations of each, for a total of ~4.2 million molecular structures. Computed at the tightly converged quantum-mechanical PBE0+MBD level of theory, QM7-X contains global (molecular) and local (atom-in-a-molecule) properties ranging from ground-state quantities (such as atomization energies and dipole moments) to response quantities (such as polarizability tensors and dispersion coefficients). By providing a systematic, extensive, and tightly converged dataset of quantum-mechanically computed physical and chemical properties, we expect that QM7-X will play a critical role in the development of next-generation machine-learning-based models for exploring greater swaths of CCS and performing in silico design of molecules with targeted properties.
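
As a rough consistency check on the counts quoted above, the following sketch estimates how many equilibrium structures the totals imply. It assumes that every equilibrium structure is accompanied by exactly 100 non-equilibrium variations and that "~4.2 M" can be treated as 4.2 million; both inputs are approximations, and the function name is hypothetical, so the result is only an order-of-magnitude estimate (on the order of 40 thousand).

```python
# Back-of-the-envelope check of the structure counts quoted in the abstract.
# Both inputs are approximate; the result is an estimate, not an official figure.
TOTAL_STRUCTURES = 4.2e6           # "~4.2 M" structures in total (rounded)
VARIATIONS_PER_EQUILIBRIUM = 100   # assumed: 100 variations per equilibrium structure


def implied_equilibrium_count(total: float, variations: int) -> float:
    """Return the equilibrium-structure count implied by the totals.

    Each equilibrium structure contributes itself plus `variations`
    non-equilibrium structures, so total = n_eq * (1 + variations).
    """
    return total / (1 + variations)


n_eq = implied_equilibrium_count(TOTAL_STRUCTURES, VARIATIONS_PER_EQUILIBRIUM)
print(f"Implied equilibrium structures: ~{n_eq:,.0f}")
```

Because the total is rounded, the printed value should not be read as the exact number of equilibrium structures in the dataset.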