Using data-driven methodologies, researchers have identified a “freedom of design” in molecular structures due to weak correlations in quantum-mechanical properties. This discovery, coupled with machine learning, could revolutionize molecular design and drug discovery.
The exploration of the remarkably vast space of molecules and materials with data-driven approaches has inspired countless academic and industrial initiatives to seek out the fundamental relationships between the structural signatures of molecules and their physicochemical properties. While there has been significant progress in this area, a comprehensive understanding of these complex relationships has remained lacking, even in the more manageable sector of chemical compound space (CCS) spanned by small molecules, despite the critical importance of such molecules throughout the chemical and pharmaceutical sciences.
“Unravelling complex relationships between molecular structures and properties would not only provide us with the tools needed to explore and characterize the molecular space, but it would also greatly advance our ability to rationally design molecules with a targeted array of physicochemical properties,” says Alexandre Tkatchenko, professor of Theoretical Chemical Physics in the Department of Physics and Materials Science at the University of Luxembourg.
Weak Correlations Enable “Freedom of Design”
In the paper entitled “‘Freedom of Design’ in Chemical Compound Space: Towards Rational in Silico Design of Molecules with Targeted Quantum-Mechanical Properties,” published in the journal Chemical Science, one of the key findings is that most of the quantum-mechanical properties of small molecules are only weakly correlated.
“While one might initially view this finding as a challenge for rational molecular design, we argue that our analysis highlights an intrinsic flexibility – or “freedom of design” – that exists in CCS, wherein there seems to be very few limitations preventing a molecule from simultaneously exhibiting any pair of properties or for many molecules sharing an array of properties,” says Robert DiStasio Jr., professor of Theoretical Chemistry at Cornell University.
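To make the notion of a weak correlation concrete, here is a minimal Python sketch of the Pearson correlation coefficient between two property lists. The function name `pearson` and the toy values are illustrative assumptions; they are not the authors' code, data, or necessarily the correlation measure used in the paper.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Unnormalized standard deviations; the 1/n factors cancel in the ratio.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0.0 or dev_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Hypothetical toy values (not from the paper): two made-up properties
# evaluated for four made-up molecules.
prop_a = [1.0, 2.0, 3.0, 4.0]
prop_b = [1.0, -1.0, -1.0, 1.0]
r = pearson(prop_a, prop_b)
```

A coefficient near zero, as for this toy pair, means that knowing one property says little about the other in a linear sense, which is the sense in which weak correlations leave room to vary properties independently.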
Searching for Optimal Pathways in Chemical Space
To explore how this intrinsic flexibility manifests in the molecular design process, which often involves the simultaneous optimization of multiple physicochemical properties, the authors used Pareto multi-property optimization to search for molecules with simultaneously large molecular polarisability and electronic gap, a design task relevant to identifying novel molecules for polymeric batteries. The authors found paths through chemical space consisting of several unexpected molecules connected by structural and/or compositional changes, reflecting the freedom in the rational design and discovery of molecules with targeted property values.
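The idea behind a Pareto search over two properties can be sketched in a few lines of Python. This is only an illustration of the general concept of Pareto optimality: the function `pareto_front` and the (polarisability, gap) values are hypothetical and are not taken from the paper or the authors' workflow.

```python
from typing import List, Sequence, Tuple


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[int]:
    """Return indices of points that no other point dominates.

    Both coordinates are maximized. Point j dominates point i when j is
    at least as good in both coordinates and strictly better in one.
    """
    front = []
    for i, (a_i, g_i) in enumerate(points):
        dominated = False
        for j, (a_j, g_j) in enumerate(points):
            if j == i:
                continue
            if a_j >= a_i and g_j >= g_i and (a_j > a_i or g_j > g_i):
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front


# Hypothetical (polarisability, electronic gap) pairs in arbitrary units.
molecules = [(60.0, 4.0), (75.0, 3.0), (50.0, 6.0), (55.0, 3.5), (75.0, 2.0)]
front = pareto_front(molecules)
print(front)  # prints [0, 1, 2]
```

The molecules on the front represent the best available trade-offs: improving one property of any of them would require giving up some of the other. The quadratic pairwise comparison is chosen for readability; it would be slow for very large molecular data sets.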
“A potentially interesting next step would be to use these Pareto-optimal structures in conjunction with powerful machine learning approaches to build reliable multi-objective frameworks for a systematic navigation of hitherto unexplored chemical spaces,” explains Prof. Tkatchenko.
Implications for the Molecular Design Paradigm
“By demonstrating that “freedom of design” is a fundamental and emergent property of CCS, our work has a number of important implications in the fields of rational molecular design and computational drug discovery. For one, we hope this work will challenge the chemical sciences community to consider how such intrinsic flexibility can be used to extend the dominant paradigm in the forward molecular design process. We also hope that this work will enable substantive progress towards solving the inverse molecular design problem, in which one seeks to find a molecule (or set of molecules) corresponding to a targeted array of properties,” explains Dr. Leonardo Medrano Sandonas, postdoctoral researcher in the Theoretical Chemical Physics group at the University of Luxembourg.
The combination of the insights gained from this work with advanced machine learning approaches could aid in the development of effective strategies for high-throughput screening of novel molecules tailored to a specific application, which is a prominent research direction in Prof. Tkatchenko’s group.
Reference: “‘Freedom of design’ in chemical compound space: towards rational in silico design of molecules with targeted quantum-mechanical properties” by Leonardo Medrano Sandonas, Johannes Hoja, Brian G. Ernst, Álvaro Vázquez-Mayagoitia, Robert A. DiStasio Jr. and Alexandre Tkatchenko, 18 August 2023, Chemical Science.
DOI: 10.1039/D3SC03598K
The research team used the high-performance computing resources of the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science user facility.