OpenAI’s “AI Chemist” Improved a Reaction Drug Makers Had Nearly Given Up On
In a project detailed June 17, OpenAI and Molecule.one paired GPT-5.4 with an autonomous lab to improve a stubborn Chan-Lam coupling, lifting yields for 88% of boronic acids and 83% of sulfonamides tested, with 8 of 14 validated reactions more than doubling.
OpenAI and the chemistry-automation startup Molecule.one say they have run a research project in which an AI system did most of the work of a medicinal chemist — reading the literature, dreaming up experiments, ranking them, and then driving the lab robots that carried them out. Detailed on June 17, the collaboration paired OpenAI's GPT-5.4 with Molecule.one's autonomous lab platform, nicknamed Maria, to tackle a reaction that has frustrated drug chemists for years.
The target was a stubborn version of the Chan-Lam coupling, a workhorse method for stitching together pharmaceutically relevant molecules. The specific variant, which couples primary sulfonamides with boronic acids, has historically delivered such low yields that chemists often avoid it, even though it would open up useful chemical space. Improving it is exactly the kind of tedious, many-variable optimization problem where it is hard to know in advance which knob to turn.
According to the writeup, GPT-5.4 reviewed prior studies, generated and scored a slate of research proposals, helped design the experiments, interpreted the data coming back from the bench, and suggested follow-ups, while human chemists stayed in the loop to choose which proposals to test and to validate the final result. The combined system worked through the problem over roughly two and a half months, and the human team took another half month to write everything up.
The numbers were encouraging. Across the broader screen, yields improved for 88% of the boronic acids and 83% of the sulfonamides tested. When human chemists hand-validated 14 representative reactions, 11 came back with higher yields — and 8 of those more than doubled. For a reaction that medicinal chemists had largely written off as unreliable, that is a meaningful jump, and the kind of incremental win that quietly widens what is synthesizable.
Not everyone was dazzled. On Hacker News, working chemists noted that the setup looks a lot like classic high-throughput screening with a smart optimization engine bolted on, and pushed back on the "AI chemist" branding — pointing out that the model proposes and ranks but does not, on its own, understand chemistry the way a trained scientist does. Even granting the skepticism, the project is a concrete data point in a larger story: frontier labs are trying to fold their models into the full scientific loop — hypothesis, experiment, analysis, iteration — rather than treating them as glorified search boxes.
It also lands as OpenAI leans harder into science as a proving ground for its models, a theme running through its recent work on life-sciences benchmarks and lab automation. The pitch is no longer just that a model can pass a chemistry exam, but that it can sit in a real lab, run a real campaign, and leave behind a result a human chemist would be glad to publish.