SMLQC seminar by Renana Gershoni-Poranne, June 8, 2023

The 8th SMLQC seminar will be given by Renana Gershoni-Poranne on June 8, 2023 (15:00 Paris | 21:00 Beijing | 09:00 New York).

Title

New Representations Enable Interpretable and Generative Deep-Learning for Polycyclic Aromatic Systems

Abstract

Polycyclic Aromatic Systems (PASs) – molecules made up of multiple aromatic rings – are highly important for a variety of functionalities, in particular organic electronics. The structure-property relationships of PASs have both conceptual and practical implications; understanding them can enable design of new functional compounds and elucidation of reactivity in a broader context.

To investigate these relationships in a data-driven manner, we generated a new database – the COMPAS Project [1] – which contains the calculated structures and properties of all polybenzenoid hydrocarbons consisting of up to 11 rings, and a sampling of ~500k PAS molecules consisting of 11 types of aromatic and antiaromatic building blocks.

We also developed and implemented two types of molecular representation to enable machine- and deep-learning models to probe the new data: a) a text-based representation [2] and b) a graph-based representation [3].

In addition to their predictive ability, we demonstrate the interpretability of the models that is achieved when using these representations. The extracted insight in some cases confirms well-known “rules of thumb” and in other cases disproves common wisdom and sheds new light on this classical family of compounds. In addition to corroborating domain-experts’ interpretation, the different models also highlight additional relationships that are harder for the human eye to discern.  

Finally, we designed and implemented a generative model, GaUDI [4], that uses our new data and representations to successfully design new PASs with targeted properties. GaUDI achieves exceptionally high validity and manages to generate molecules with properties beyond the distribution of the original data set.

References

  1. Wahab, A.; Pfuderer, L.; Paenurk, E.; Gershoni-Poranne, R. The COMPAS Project: A Computational Database of Polycyclic Aromatic Systems. Phase 1: Cata-Condensed Polybenzenoid Hydrocarbons. J. Chem. Inf. Model. 2022, 62 (16), 3704. https://doi.org/10.1021/acs.jcim.2c00503.
  2. Fite, S.; Wahab, A.; Paenurk, E.; Gross, Z.; Gershoni-Poranne, R. Text-Based Representations with Interpretable Machine Learning Reveal Structure-Property Relationships of Polybenzenoid Hydrocarbons. J. Phys. Org. Chem. 2022, e4458. https://doi.org/10.1002/poc.4458.
  3. Weiss, T.; Wahab, A.; Bronstein, A. M.; Gershoni-Poranne, R. Interpretable Deep-Learning Unveils Structure-Property Relationships in Polybenzenoid Hydrocarbons. ChemRxiv 2022. https://doi.org/10.26434/chemrxiv-2022-krng1.
  4. Weiss, T.; Cosmo, L.; Yanes, E. M.; Chakraborty, S.; Bronstein, A. M.; Gershoni-Poranne, R. Guided Diffusion for Inverse Molecular Design. ChemRxiv April 5, 2023. https://doi.org/10.26434/chemrxiv-2023-z8ltp.

Introduction to the speaker

Renana Gershoni-Poranne is an Assistant Professor of Computational Chemistry at the Schulich Faculty of Chemistry at the Technion-Israel Institute of Technology, where she is a Branco Weiss Fellow, Horev Fellow, and Alon Scholarship recipient. Her appointment began in October 2021.

Before joining the faculty at the Technion, Renana was a Senior Scientist (Group Leader) in the group of Prof. Dr. Peter Chen at the Laboratorium für Organische Chemie at the ETH Zürich. Her promotion to Senior Scientist and Lecturer in July 2017 followed a two-year post-doctoral period (as a VATAT postdoctoral fellow) in the same group. She completed her PhD studies under the supervision of Prof. Amnon Stanger in the Schulich Faculty of Chemistry at the Technion, working on elucidation of the properties of aromatic compounds and developing methodologies for the identification and quantification of aromaticity in polycyclic aromatic hydrocarbons. Prior to that, she received her MSc Summa cum Laude for her work on functionalization of corannulene in the group of Prof. Ehud Keinan.

Renana’s research interests lie in the field of computational physical organic chemistry, with particular emphasis on development of methods and tools for better understanding of the physical properties and reactivity of organic and organo-metallic compounds. The work in her group ranges from investigation of fundamental molecular properties and concepts—such as aromaticity, dispersion, metallophilic interactions, catalysis, and mechanism elucidation—to application of machine-learning and deep-learning models for molecular design of novel polycyclic aromatic systems and discovery of structure-property relationships.

How to join

Join Zoom Meeting 
https://zoom.us/j/86004422973?pwd=WjNKQlEydmdFL3hJbUx4NjByYjVJZz09

Meeting ID: 860 0442 2973 

Passcode: 703098

Recordings of the 4th SMLQC seminar are now available, Speaker Daniel Schwalbe-Koda 

The 4th seminar was given by Daniel Schwalbe-Koda on Adversarial Sampling and Extrapolation Trends in NN Potentials. The 1st part of the seminar was a Lecture (https://youtu.be/K9yfi_qZ2OU) followed by a 2nd part with hands-on Tutorial (https://youtu.be/Yn8N34cBeLg). Recordings are also embedded below.

SMLQC seminar by Johannes Margraf, May 4, 2023

The 6th SMLQC seminar will be given by Johannes Margraf on May 4, 2023 (22:00 Beijing | 16:00 Paris | 10:00 New York).

Title

Physical Description of Long-Range Interactions in Atomistic Machine Learning Models

Abstract

The dominating paradigm of state-of-the-art machine learning (ML) interatomic potentials is the use of local representations of atomic environments. While this locality has many computational and practical advantages, it ultimately also limits the achievable accuracy of a potential, since information beyond the cutoff radius is not taken into account. Indeed, long-range interactions can be substantial in bulk systems, most prominently due to the Coulomb interaction, which decays slowly (~ 1/r) with the interatomic distance. These electrostatic interactions are often screened in practice, so that local potentials can still effectively describe polar solids and liquids with surprising accuracy. Unfortunately, this cannot always be relied upon. The inclusion of long-range interactions in ML potentials has therefore been an active field of study in recent years with many different approaches.

In this seminar, I will focus on approaches that tackle the problem of long-range electrostatics by describing charge distributions via partial charges within the ML model itself. In particular, our recently reported Kernel Charge Equilibration (kQEq) method uses sparse Gaussian Processes to learn atomic electronegativities as a function of the chemical environment. This allows predicting partial charges with a charge equilibration model, including full long-range interactions and non-local charge transfer. Applications of kQEq in predicting molecular dipole moments and developing long-ranged interatomic potentials will be discussed.
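
As a rough illustration of the charge equilibration step mentioned in the abstract, the sketch below solves the classical QEq problem for point charges: it minimizes E(q) = Σ χ_i q_i + ½ Σ J_i q_i² + ½ Σ_{i≠j} q_i q_j / r_ij subject to a fixed total charge, which reduces to one linear system with a Lagrange multiplier. This is not the kQEq implementation. kQEq predicts the electronegativities χ_i with sparse Gaussian Processes, whereas here they are hand-picked numbers; the function name solve_qeq, the bare 1/r point-charge coupling, and all parameter values (in arbitrary atomic-like units) are assumptions made only for this example.

```python
import numpy as np


def solve_qeq(positions, chi, hardness, total_charge=0.0):
    """Partial charges that minimize a point-charge QEq energy.

    positions    : (n, 3) atomic coordinates
    chi          : (n,) electronegativities (hand-picked here, ML-predicted in kQEq)
    hardness     : (n,) atomic hardness values J_i (illustrative assumption)
    total_charge : constraint on the sum of all partial charges
    """
    positions = np.asarray(positions, dtype=float)
    chi = np.asarray(chi, dtype=float)
    n = len(chi)

    # Pairwise interatomic distances r_ij.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Interaction matrix: hardness on the diagonal, Coulomb 1/r elsewhere.
    a = np.zeros((n, n))
    off_diag = ~np.eye(n, dtype=bool)
    a[off_diag] = 1.0 / dist[off_diag]
    a[np.diag_indices(n)] = hardness

    # Augment with a Lagrange multiplier row/column enforcing sum(q) = total_charge.
    m = np.zeros((n + 1, n + 1))
    m[:n, :n] = a
    m[:n, n] = 1.0
    m[n, :n] = 1.0
    rhs = np.concatenate([-chi, [total_charge]])

    solution = np.linalg.solve(m, rhs)
    return solution[:n]  # drop the multiplier, keep the charges


# Hypothetical neutral diatomic: atom 1 is more electronegative than atom 0.
charges = solve_qeq(
    positions=[[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]],
    chi=[0.1, 0.3],
    hardness=[0.5, 0.5],
)
print(charges)
```

In this toy case the more electronegative atom ends up with the negative partial charge, and because the 1/r coupling is never cut off, every atom in a larger system would influence every other one, which is the non-local charge transfer the abstract refers to.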

Introduction to the speaker

Johannes T. Margraf studied chemistry at the University of Erlangen, where he also obtained his Ph.D. Subsequently he joined the Quantum Theory Project at the University of Florida as a postdoc, funded by a Feodor-Lynen fellowship. This was followed by another postdoctoral fellowship at the Technical University of Munich. Since 2021, he has been a group leader at the Theory Department of the Fritz-Haber-Institute in Berlin. His group focuses on using and developing machine-learning and electronic structure methods to study chemical reactions and discover new functional materials.

How to join

Join Zoom Meeting
https://zoom.us/j/86004422973?pwd=WjNKQlEydmdFL3hJbUx4NjByYjVJZz09

Meeting ID: 860 0442 2973

Passcode: 703098

Recordings of the 3rd SMLQC seminar are now available, Speaker Pascal Friederich

The 3rd seminar was given by Pascal Friederich on ML for Simulation, Understanding, and Design of Molecules and Materials. The 1st part of the seminar was a Lecture (https://youtu.be/WWxdZjXeK7w) followed by the 2nd part with hands-on Tutorial (https://youtu.be/8c8KB0V1bC4). Recordings are also embedded below.

Lecture
Tutorial

Recordings of the 2nd SMLQC seminar are now available, Speaker Max Pinheiro Jr

The 2nd seminar was given by Max Pinheiro Jr about his work on nonadiabatic molecular dynamics with machine learning. The 1st part of the seminar was a Lecture (https://youtu.be/9jRxeMzpkLg) followed by the 2nd part with hands-on Tutorial (https://youtu.be/yMDUKhzipj0). Recordings are also embedded below.

Lecture
Tutorial

Recordings of the 1st SMLQC seminar are now available, Speaker Arif Ullah

The 1st seminar was given by Arif Ullah about his work on Quantum Dissipative Dynamics with Machine Learning. The 1st part of the seminar was a Lecture (https://youtu.be/Nx0mSPUaof8) followed by the 2nd part with hands-on Tutorial (https://youtu.be/AuWGiK53P6Y). Recordings are also embedded below.

Lecture
Tutorial