Air conditioners, refrigerators and cooling systems in data centers all rely on refrigerants to transfer heat. But many of these chemicals are also potent greenhouse gases. Our AI system explores the chemical space to generate new molecules and speed up the search for alternatives.
A single kilogram of some hydrofluorocarbons (HFCs) used as refrigerants today can have the same warming effect as 1,000 kg of CO2. Refrigerants account for around 2% of global greenhouse gas (GHG) emissions, roughly equivalent to the aviation sector, and demand for them is growing rapidly worldwide, so finding alternatives is urgent.
RefGen is a physics-informed AI system that ventures beyond known chemical databases to discover previously unknown refrigerant molecules that are efficient, safe, and environmentally friendly. By combining machine learning with thermodynamic principles, RefGen quickly generates hundreds of candidates that match or exceed the performance of current refrigerants while having a far lower environmental impact.
Finding a suitable refrigerant means solving a puzzle with competing constraints. The ideal molecule must be thermodynamically efficient (high cooling power), safe (non-flammable, non-toxic), and environmentally benign (low global warming potential, or GWP).
Despite screening hundreds of thousands of known molecules, researchers have identified only about 300 viable refrigerants that have been, or still are, in use, which is too little data to train an AI model on. A landmark study screened 460 million structures and found that applying all constraints simultaneously yields only 27 candidates, all of which are still flammable.
Why so few candidates? Because existing approaches only search through molecules that we already know.
AI Meets Physics
RefGen innovates on three fronts to accelerate molecule discovery.
- We fine-tuned an open-source language model on millions of molecular structures, teaching it to generate valid molecules as text representations.
- RefGen uses established physics on top of machine learning: neural networks predict fundamental properties like critical temperature, pressure, and acentric factor, which then feed into rigorous thermodynamic models. The Peng-Robinson equation of state simulates vapor compression cycles, NASA polynomials calculate real thermodynamic properties, and chemical kinetics estimate environmental impact.
- Reinforcement learning guides the generator toward molecules that simultaneously balance efficiency, ideal critical temperature, appropriate size, low global warming potential, and safety. All of these predicted properties are then used to simulate the candidate's efficiency in standard refrigeration equipment.
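To make the physics layer from the second point concrete, here is a minimal sketch of the Peng-Robinson equation of state, which turns the three predicted properties (critical temperature, critical pressure, acentric factor) into a pressure at a given temperature and molar volume. This is not RefGen's code: the function name `peng_robinson_pressure` is hypothetical, and the example inputs are approximate literature values for the existing refrigerant R-32, chosen only for illustration. In a pipeline like the one described, these inputs would come from the neural network predictors instead.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)


def peng_robinson_pressure(T, Vm, Tc, Pc, omega):
    """Pressure in Pa from the Peng-Robinson equation of state.

    T: temperature (K); Vm: molar volume (m^3/mol);
    Tc: critical temperature (K); Pc: critical pressure (Pa);
    omega: acentric factor (dimensionless).
    """
    a = 0.45724 * R**2 * Tc**2 / Pc          # attraction parameter
    b = 0.07780 * R * Tc / Pc                # co-volume
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega**2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(T / Tc))) ** 2
    repulsive = R * T / (Vm - b)
    attractive = a * alpha / (Vm**2 + 2.0 * b * Vm - b**2)
    return repulsive - attractive


# Assumed, approximate inputs for R-32 (illustration only).
TC, PC, OMEGA = 351.26, 5.782e6, 0.277

# Sanity check: Peng-Robinson fixes the critical compressibility near 0.3074,
# so evaluating at (Tc, Vm_c) should give back roughly Pc.
vm_critical = 0.3074 * R * TC / PC
p_at_critical = peng_robinson_pressure(TC, vm_critical, TC, PC, OMEGA)
```

Because the equation needs only three scalar inputs per molecule, a learned predictor for those inputs is enough to evaluate a candidate with an established thermodynamic model, which is the division of labor the second point describes.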
A diversity reward prevents the model from repeatedly generating the same molecules. This hybrid approach works with limited data by embedding thermodynamic knowledge into the AI system, enabling reliable predictions beyond the training distribution.
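One simple way to picture a multi-objective reward with a diversity term is sketched below. It is an assumption for illustration, not the reward RefGen actually uses: the property names, the equal weights, the `combined_reward` function, and the `1 / (1 + times_seen)` penalty are all hypothetical. Each property score is assumed to be already normalized to the range 0 to 1.

```python
from collections import Counter


def combined_reward(scores, weights, molecule_text, seen_counts):
    """Weighted average of property scores, discounted for repeated molecules.

    scores, weights: dicts keyed by property name (hypothetical names).
    molecule_text: the text representation of the generated molecule.
    seen_counts: Counter of how often each molecule was generated so far;
                 it is updated in place.
    """
    total_weight = sum(weights.values())
    base = sum(weights[name] * scores[name] for name in weights) / total_weight
    # Diversity term: the n-th repeat of a molecule earns 1/(1+n) of the base.
    diversity = 1.0 / (1 + seen_counts[molecule_text])
    seen_counts[molecule_text] += 1
    return base * diversity


# Illustrative values only.
weights = {"efficiency": 1.0, "low_gwp": 1.0, "safety": 1.0}
scores = {"efficiency": 0.8, "low_gwp": 0.6, "safety": 1.0}
seen = Counter()

first = combined_reward(scores, weights, "molecule-A", seen)
repeat = combined_reward(scores, weights, "molecule-A", seen)
other = combined_reward(scores, weights, "molecule-B", seen)
```

In this sketch the second generation of the same molecule earns half the reward of the first, while a new molecule with identical scores earns the full amount, so the generator is nudged toward unexplored structures.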
Discovering Novel Refrigerants
After generating over 1 million molecules and filtering them against expert-defined thresholds, RefGen identified more than 800 viable candidates, a dramatic increase over previous efforts.
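The filtering step can be pictured as a set of hard cutoffs applied to each generated molecule's predicted properties. The sketch below shows that shape only: the `Candidate` fields, the threshold values, and the example molecules are placeholders invented for illustration, not the expert thresholds or results from RefGen.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str         # placeholder label, not a real molecule
    cop: float        # predicted coefficient of performance
    gwp: float        # predicted global warming potential
    flammable: bool   # predicted flammability flag


# Placeholder cutoffs (assumptions, not the thresholds used by RefGen).
MIN_COP = 3.0
MAX_GWP = 150.0


def passes_filters(candidate):
    """True only if the candidate meets every constraint at once."""
    return (
        candidate.cop >= MIN_COP
        and candidate.gwp <= MAX_GWP
        and not candidate.flammable
    )


generated = [
    Candidate("candidate-A", cop=3.4, gwp=20.0, flammable=False),
    Candidate("candidate-B", cop=3.6, gwp=900.0, flammable=False),
    Candidate("candidate-C", cop=2.1, gwp=5.0, flammable=False),
    Candidate("candidate-D", cop=3.5, gwp=10.0, flammable=True),
]
viable = [c.name for c in generated if passes_filters(c)]
```

Requiring all constraints to hold simultaneously is what makes the surviving set small: each of the rejected placeholders above fails on only one property.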
Many candidates show competitive performance with R-410A, one of today's most efficient refrigerants, while achieving a hundred-fold reduction in global warming potential. Remarkably, the model correctly discovered fundamental thermodynamic tradeoffs without being explicitly taught them, validating that it learned underlying physics, not just memorized patterns.
RefGen also successfully explored unexpected chemical spaces. Novel nitrogen-fluorine compounds showed excellent properties, representing genuinely new refrigerant classes not found in existing databases.
From Discovery to Experience
The critical next step is experimental validation: we're actively seeking collaborations with chemists and heating, ventilation, and air conditioning (HVAC) engineers to synthesize and test top candidates in real refrigeration systems.
Future work will integrate additional reward functions for molecular stability, synthesizability, and toxicity, properties essential for practical deployment.
The path from computational discovery to commercial use takes years, but RefGen dramatically accelerates the discovery phase, giving researchers hundreds of optimized candidates up front rather than requiring months of manual screening.
With cooling demands rising alongside global temperatures, RefGen demonstrates how AI can extend scientific understanding by embedding physics into machine learning to accelerate materials discovery, one molecule at a time.