Publications

Delocalized, Asynchronous, Closed-Loop Discovery of Organic Laser Emitters

Contemporary materials discovery requires intricate sequences of synthesis, formulation and characterization that often span multiple locations with specialized expertise or instrumentation. To accelerate these workflows, we present a cloud-based strategy that enables delocalized and asynchronous design–make–test–analyze cycles. We showcase this approach through the exploration of molecular gain materials for organic solid-state lasers as a frontier application in molecular optoelectronics. Distributed robotic synthesis and in-line property characterization, orchestrated by a cloud-based AI experiment planner, resulted in the discovery of 21 new state-of-the-art materials. Automated gram-scale synthesis ultimately allowed for the verification of best-in-class stimulated emission in a thin-film device. Demonstrating the asynchronous integration of five laboratories across the globe, this workflow provides a blueprint for delocalizing – and democratizing – scientific discovery.

A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?

Automation is one of the cornerstones of contemporary material discovery. Bayesian optimization (BO) is an essential part of such workflows, enabling scientists to leverage prior domain knowledge for efficient exploration of a large molecular space. While such prior knowledge can take many forms, there has been significant fanfare around the ancillary scientific knowledge encapsulated in large language models (LLMs). However, existing work has so far only explored LLMs for heuristic materials searches. Indeed, recent work obtains the uncertainty estimate – an integral part of BO – from point-estimated, non-Bayesian LLMs. In this work, we study the question of whether LLMs are actually useful to accelerate principled Bayesian optimization in the molecular space. We take a sober, dispassionate stance in answering this question. This is done by carefully (i) viewing LLMs as fixed feature extractors for standard but principled BO surrogate models and by (ii) leveraging parameter-efficient finetuning methods and Bayesian neural networks to obtain the posterior of the LLM surrogate. Our extensive experiments with real-world chemistry problems show that LLMs can be useful for BO over molecules, but only if they have been pretrained or finetuned with domain-specific data.

SELFIES and the future of molecular string representations

Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols yield strings with no valid chemical interpretation. To overcome this issue, a new language for molecules that guarantees 100% robustness was introduced in 2020: SELF-referencing embedded strings (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.

Direct Dearomatization of Pyridines via an Energy-Transfer-Catalyzed Intramolecular [4+2] Cycloaddition

The catalytic dearomatization of pyridines, accessing medicinally relevant N-heterocycles, is of high interest. Current direct dearomative strategies generally rely on reduction or nucleophilic addition, thus limiting the architecture of the dearomatized products to a six-membered ring. We herein introduce a catalytic, dearomative cycloaddition reaction with pyridines using photoinduced energy transfer catalysis, thereby advancing dearomatization methodology and expanding the topological diversity of pyridine dearomatization products. This unprecedented method features high yields, broad substrate scope (44 examples), excellent functional group tolerance, and facile scalability. Furthermore, a recyclable and sustainable polymer-immobilized photocatalyst was employed. Computational and experimental investigations support a mechanism in which a cinnamyl moiety is promoted to its corresponding excited triplet state through visible-light-mediated energy transfer catalysis, followed by a regioselective and dearomative [4+2] cycloaddition to pyridines. This work demonstrates the contribution of visible light catalysis toward enabling thermally challenging organic transformations.

Discovery of Unforeseen Energy-Transfer-Based Transformations Using a Combined Screening Approach

The discovery of novel (catalytic) transformations and mechanisms is commonly based on rational design. However, many discoveries have resulted directly from experimental serendipity. Building on this, we report a two-dimensional screening protocol, combining “mechanism-based” and “reaction-based” screening, and its application to the field of visible light photocatalysis. Using this protocol, two energy-transfer-based cycloaddition reactions were realized. A notably endergonic energy transfer process allows for the dearomative cycloaddition of benzothiophenes and related heterocycles. Moreover, by sensitization of enone moieties, a [2+2]-cycloaddition to alkynes and an unexpected cycloaddition-rearrangement cascade were discovered. Advanced spectroscopic techniques (in particular transient absorption spectroscopy and pulse radiolysis) were utilized to investigate the underlying photophysical processes and gain insight into reaction kinetics. Combined with further mechanistic analysis, these results can inform the knowledge-driven development of improved systems. Such screening approaches can thus provide complementary access toward novel and more efficient catalytic protocols.

The energy-transfer-enabled biocompatible disulfide–ene reaction

Sulfur-containing molecules participate in many essential biological processes. Of utmost importance is the methylthioether moiety, present in the proteinogenic amino acid methionine and installed in tRNA by radical-S-adenosylmethionine methylthiotransferases. Although the thiol–ene reaction for carbon–sulfur bond formation has found widespread applications in materials or medicinal science, a biocompatible chemo- and regioselective hydrothiolation of unactivated alkenes and alkynes remains elusive. Here, we describe the design of a general chemoselective anti-Markovnikov hydroalkyl/aryl thiolation of alkenes and alkynes—also allowing the biologically important hydromethylthiolation—by triplet–triplet energy transfer activation of disulfides. This fast disulfide–ene reaction shows extraordinary functional group tolerance and biocompatibility. Transient absorption spectroscopy was used to study the sensitization process in detail. The hereby gained mechanistic insights were successfully employed for optimization of the catalytic system. This photosensitized transformation should stimulate bioimaging applications and carbon–sulfur bond-forming late-stage functionalization chemistry, especially in the context of metabolic labelling.
