Publications

The World Health Organization as an engine of ideational robustness

Jean-Louis Denis

Gaelle Foucault

Pierre Larouche

Catherine Régis

Miriam Cohen

Marie-Andree Girard

2024-03-05

Policy and Society (published)

doi.org

Enhancing and Evaluating Logical Reasoning Abilities of Large Language Models

Shujie Deng

Honghua Dong

Xujie Si

2024-03-04

ICLR.cc/2024/Workshop/SeT_LLM (published)

openreview.net

A Generative Model of Symmetry Transformations

James U. Allingham

Bruno Mlodozeniec

Shreyas Padhy

Javier Antorán

David Scott Krueger

Richard E. Turner

Eric T. Nalisnick

José Miguel Hernández-Lobato

Correctly capturing the symmetry transformations of data can lead to efficient models with strong generalization capabilities, though methods incorporating symmetries often require prior knowledge. While recent advancements have been made in learning those symmetries directly from the dataset, most of this work has focused on the discriminative setting. In this paper, we construct a generative model that explicitly aims to capture symmetries in the data, resulting in a model that learns which symmetries are present in an interpretable way. We provide a simple algorithm for efficiently learning our generative model and demonstrate its ability to capture symmetries under affine and color transformations. Combining our symmetry model with existing generative models results in higher marginal test-log-likelihoods and robustness to data sparsification.

2024-03-04

ArXiv (preprint)

doi.org

arxiv.org

MagicClay: Sculpting Meshes With Generative Neural Fields

Amir Barda

Vladimir Kim

Noam Aigerman

Amit H. Bermano

Thibault Groueix

The recent developments in neural fields have brought phenomenal capabilities to the field of shape generation, but they lack crucial properties, such as incremental control, a fundamental requirement for artistic work. Triangular meshes, on the other hand, are the representation of choice for most geometry-related tasks, offering efficiency and intuitive control, but do not lend themselves to neural optimization. To support downstream tasks, previous art typically proposes a two-step approach, where first a shape is generated using neural fields, and then a mesh is extracted for further processing. Instead, in this paper we introduce a hybrid approach that maintains both a mesh and a Signed Distance Field (SDF) representation consistently. Using this representation, we introduce MagicClay, an artist-friendly tool for sculpting regions of a mesh according to textual prompts while keeping other regions untouched. Our framework carefully and efficiently balances consistency between the representations and regularizations in every step of the shape optimization. Relying on the mesh representation, we show how to render the SDF at higher resolutions and faster. In addition, we employ recent work in differentiable mesh reconstruction to adaptively allocate triangles in the mesh where required, as indicated by the SDF. Using an implemented prototype, we demonstrate superior generated geometry compared to the state-of-the-art, and novel consistent control, allowing sequential prompt-based edits to the same mesh for the first time.

2024-03-04

ArXiv (preprint)

doi.org

arxiv.org

Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok

Tikeng Notsawo Pascal Junior

Pascal Notsawo

Hattie Zhou

Mohammad Pezeshki

Irina Rish

Guillaume Dumas

2024-03-04

ICLR.cc/2024/Workshop/ME-FoMo (poster)

doi.org

openreview.net

Self-evaluation and self-prompting to improve the reliability of LLMs

Alexandre Piché

Aristides Milios

Dzmitry Bahdanau

Chris Pal

In order to safely deploy Large Language Models (LLMs), they must be capable of dynamically adapting their behavior based on their level of knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-restraint, is non-trivial to teach since it depends on the internal knowledge of an LLM. By default, LLMs are trained to maximize the next-token likelihood, which does not teach the model to modulate its answer based on its level of uncertainty. In order to learn self-restraint, we devise a simple objective that can encourage the model to produce generations that the model is confident in. To optimize this objective, we introduce ReSearch, an iterative search algorithm based on self-evaluation and self-prompting. Our method results in fewer hallucinations overall, both for known and unknown topics, as the model learns to selectively restrain itself. In addition, our method elegantly incorporates the ability to decline when the model assesses that it cannot provide a response without a high proportion of hallucination.

2024-03-04

ICLR.cc/2024/Workshop/SeT_LLM (published)

openreview.net

On the Scalability of GNNs for Molecular Graphs

Maciej Sypetkowski

Frederik Wenkel

Farimah Poursafaei

Nia Dickson

Karush Suri

Philip Fradkin

Dominique Beaini

Scaling deep learning models has been at the heart of recent revolutions in language modelling and image generation. Practitioners have observed a strong relationship between model size, dataset size, and performance. However, structure-based architectures such as Graph Neural Networks (GNNs) are yet to show the benefits of scale, mainly due to the lower efficiency of sparse operations, large data requirements, and lack of clarity about the effectiveness of various architectures. We address this drawback of GNNs by studying their scaling behavior. Specifically, we analyze message-passing networks, graph Transformers, and hybrid architectures on the largest public collection of 2D molecular graphs. For the first time, we observe that GNNs benefit tremendously from the increasing scale of depth, width, number of molecules, number of labels, and the diversity in the pretraining datasets, resulting in a 30.25% improvement when scaling to 1 billion parameters and a 28.98% improvement when increasing the dataset size eightfold. We further demonstrate strong finetuning scaling behavior on 38 tasks, outclassing previous large models. We hope that our work paves the way for an era where foundational GNNs drive pharmaceutical drug discovery.

2024-03-04

ICLR.cc/2024/Workshop/DPFM (poster)

openreview.net

Towards DNA-Encoded Library Generation with GFlowNets

Michał Koziarski

Mohammed Abukalam

Vedant Shah

Louis Vaillancourt

Doris Alexandra Schuetz

Moksh J. Jain

Almer M. van der Sloot

Mathieu Bourgey

Anne Marinier

Yoshua Bengio

2024-03-04

ICLR.cc/2024/Workshop/GEM (poster)

doi.org

openreview.net

Communicating Study Design Trade-offs in Software Engineering

Martin P. Robillard

Deeksha M. Arya

Neil A. Ernst

Jin Guo

Maxime Lamothe

Mathieu Nassif

Nicole Novielli

Alexander Serebrenik

Igor Steinmacher

Klaas-Jan Stol

2024-03-02

ACM Transactions on Software Engineering and Methodology (published)

doi.org

A Compositional Typed Semantics for Universal Dependencies

Laurestine Bradford

Timothy John O'Donnell

Siva Reddy

2024-03-02

ArXiv (preprint)

doi.org

arxiv.org

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Stefan Horoi

Albert Manuel Orozco Camacho

Eugene Belilovsky

Guy Wolf

Ensembling multiple models enhances predictive performance by utilizing the varied learned features of the different models but incurs significant computational and storage costs. Model fusion, which combines parameters from multiple models into one, aims to mitigate these costs but faces practical challenges due to the complex, non-convex nature of neural network loss landscapes, where learned minima are often separated by high loss barriers. Recent works have explored using permutations to align network features, reducing the loss barrier in parameter space. However, permutations are restrictive since they assume that a one-to-one mapping exists between the different models' neurons. We propose a new model merging algorithm, CCA Merge, which is based on Canonical Correlation Analysis and aims to maximize the correlations between linear combinations of the model features. We show that our method of aligning models leads to better performance than past methods when averaging models trained on the same or differing data splits. We also extend this analysis to the harder many-models setting, where more than 2 models are merged, and we find that CCA Merge works significantly better in this setting than past methods.

2024-03-02

ICLR.cc/2024/Workshop/Re-Align (poster)

openreview.net
