20 Dec 2023

GFlowNets: AI at the service of scientific discovery

From November 8 to 10, 2023, around 100 researchers from multiple disciplines gathered at Mila for the GFlowNet Workshop, the first workshop on Generative Flow Networks (GFlowNets), a novel machine learning method that generates objects sequentially and aims to accelerate scientific research and improve the transparency of artificial intelligence (AI) models.

Videos of the event are available on Mila’s YouTube channel.

Accelerating scientific discovery

This machine learning method, introduced in 2021 in the research paper Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation by Emmanuel Bengio and colleagues, has since been explored by several researchers at Mila and other labs around the world.

Tristan Deleu, a PhD student under the supervision of Yoshua Bengio and one of the workshop organizers, recently presented his paper Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network at the NeurIPS 2023 main conference.

He explains that GFlowNets have many potential applications, from the discovery of molecules, drugs and materials, to biological sequences and language models. All these fields of research involve combinatorial problems with a vast range of potential solutions, which GFlowNets make easier to navigate.

Tristan Deleu takes the study of interactions between human genes (Gene Regulatory Networks) as an example to illustrate these problems: “With the human genome comprising 20,000 genes, the number of all potential causal relationships would be far greater than the combined number of all atoms in the universe!”
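One way to make this comparison concrete is to count candidate networks rather than individual relationships. The back-of-the-envelope sketch below assumes that every ordered pair of distinct genes is a possible "A regulates B" link and that any subset of links forms a candidate network; the figure of roughly 10^80 atoms in the observable universe is a commonly quoted order-of-magnitude estimate, not a number from the article.

```python
import math

n_genes = 20_000  # figure quoted in the article

# Each ordered pair (A, B) with A != B is one potential link "A regulates B".
n_possible_edges = n_genes * (n_genes - 1)

# Every subset of those links is a distinct directed graph, so there are
# 2 ** n_possible_edges candidate networks. That number is far too large to
# print, so we work with its base-10 logarithm instead.
log10_n_graphs = n_possible_edges * math.log10(2)

# Assumption: about 10**80 atoms in the observable universe.
log10_atoms = 80

print(f"possible links:    {n_possible_edges:,}")
print(f"possible networks: about 10^{log10_n_graphs:,.0f}")
print(f"atoms (estimate):  about 10^{log10_atoms}")
```

Restricting the count to networks without cycles lowers the total, but it still grows faster than exponentially with the number of genes, so exhaustive search remains out of reach.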

Generating molecules

GFlowNets make it possible to generate molecules from scratch, atom by atom, favouring some molecules over others according to chosen criteria, which accelerates the process of scientific discovery.

“With GFlowNets, we get this generative way of constructing various types of objects with similar properties (molecules, for instance), and each of these molecules has an associated reward, for example its relevance in curing a certain disease,” Tristan Deleu explains.
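The idea of building an object step by step, with a reward attached to each finished object, can be illustrated on a toy problem. The sketch below is not a trained GFlowNet: it assumes four hypothetical two-letter "molecules" with made-up rewards, and it computes the flows exactly instead of learning them with a neural network. It only shows the property a GFlowNet aims for, namely that each object ends up generated with probability proportional to its reward.

```python
from fractions import Fraction

# Hypothetical rewards for four toy "molecules", each built in two steps
# by appending a symbol from the alphabet {A, B}.
REWARD = {"AA": 1, "AB": 3, "BA": 4, "BB": 2}
ALPHABET = "AB"
LENGTH = 2


def flow(state: str) -> int:
    """Flow through a partial object: total reward of all finished objects reachable from it."""
    if len(state) == LENGTH:
        return REWARD[state]
    return sum(flow(state + symbol) for symbol in ALPHABET)


def forward_policy(state: str) -> dict[str, Fraction]:
    """Probability of appending each symbol: the child's share of the flow through `state`."""
    return {symbol: Fraction(flow(state + symbol), flow(state)) for symbol in ALPHABET}


def generation_probability(obj: str) -> Fraction:
    """Probability that step-by-step construction with the forward policy ends at `obj`."""
    prob = Fraction(1)
    for i, symbol in enumerate(obj):
        prob *= forward_policy(obj[:i])[symbol]
    return prob


total_reward = flow("")  # flow through the empty starting state
for obj, reward in REWARD.items():
    # The two printed fractions agree: generation probability == reward / total reward.
    print(obj, generation_probability(obj), Fraction(reward, total_reward))
```

In this toy case every object has exactly one construction path. In realistic settings the same object can usually be reached in several ways and the state space is far too large to enumerate, which is why the flows are approximated by a learned model.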

Traditionally, reinforcement learning (RL) methods seek the single best solution, i.e., the one that maximizes the expected reward.

But in the case of molecule generation for scientific discovery, this process risks favouring molecules whose properties are already known.

“What we’re interested in are other molecules that are at least as good, but which aren’t really the ones we discovered in the first place. GFlowNets enable us to find this broad diversity in the molecules we can generate,” Tristan Deleu explains.
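The contrast between looking for the single best candidate and looking for a diverse set of good ones can be sketched in a few lines. The candidate names and rewards below are hypothetical, and the sampler simply draws from a known table of rewards; a GFlowNet would instead learn to produce such samples over a space too large to list. The sketch only illustrates the difference in what the two strategies return.

```python
import random
from collections import Counter

# Hypothetical candidates and rewards; two are nearly as good as the best one.
REWARD = {"mol_a": 10.0, "mol_b": 9.5, "mol_c": 9.0, "mol_d": 0.5}


def pick_best(rewards):
    """Reward maximization: always return the single highest-scoring candidate."""
    return max(rewards, key=rewards.get)


def sample_proportional(rewards, k, rng):
    """Draw k candidates, each with probability proportional to its reward."""
    names = list(rewards)
    weights = [rewards[name] for name in names]
    return rng.choices(names, weights=weights, k=k)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = sample_proportional(REWARD, k=1000, rng=rng)
counts = Counter(samples)

print("maximizer proposes:", pick_best(REWARD))
print("sampler proposes:  ", dict(counts))
```

The maximizer returns the same candidate every time, while the sampler proposes the three strong candidates at similar rates and the weak one only rarely.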

Making AI models more transparent

In his view, the success of the workshop hosted at Mila is proof of the relevance of this method to scientific research at large: half of the researchers who attended it were specialized in machine learning, while the other half were scientists from other fields like biology, physics and chemistry.

Unlike the traditional approach to machine learning (a model making predictions from data and then making a decision), GFlowNets make the decision-making process more explainable and thus more transparent, which is crucial in scientific research, but also in the case of AI applications in critical fields.

“The probabilistic approach behind GFlowNets enables us to take uncertainty into account in the decisions we make, which is important in the case of safety-critical applications,” Tristan Deleu explains.

For instance, rather than making an arbitrary decision without explaining it, an autonomous car could report how uncertain it is about a potential direction, thus preserving human control over the final decision.

“It’s important to move on from predictive models to more probabilistic models that include this notion of uncertainty in order to promote the safety of AI systems,” Tristan Deleu concludes.