From July 21 to July 27, 2024, Mila researchers will attend the Forty-first International Conference on Machine Learning (ICML 2024) in Vienna, Austria. This year, they will share 55 scientific papers at the main conference, showcasing their groundbreaking artificial intelligence (AI) research to peers from all around the world.
Here is a list of papers accepted at the ICML 2024 conference that have at least one Mila-affiliated author:
Main Conference
Randomized Confidence Bounds for Stochastic Partial Monitoring : Maxime Heuillet, Ola Ahmad, Audrey Durand
Implicit meta-learning may lead language models to trust more reliable sources : Dmitrii Krasheninnikov, Egor Krasheninnikov, Bruno Mlodozeniec, Tegan Maharaj, David Krueger
EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time : Shengyao Lu, Bang Liu, Keith G Mills, Jiao He, Di Niu
Successor Features for Efficient Multi-Subject Controlled Text Generation : Meng Cao, Mehdi Fatemi, Jackie C. K. Cheung, Samira Shabanian
Interacting Diffusion Processes for Event Sequence Forecasting : Mai Zeng, Florence Regol, Mark Coates
Stealing part of a production language model : Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Eric Wallace, David Rolnick, Florian Tramèr
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? : Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam Hadj Laradji, Manuel Del Verme, Tom Marty, Léo Boisvert, Megh Thakkar, Quentin Cappart, David Vazquez, Nicolas Chapados, Alexandre Lacoste
Position Paper: On the Societal Impact of Open Foundation Models : Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen K Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan
Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning : Jinsoo Yoo, Yunpeng Liu, Frank Wood, Geoff Pleiss
Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization : Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, Roi Livni, Daniel M. Roy
A Persuasive Approach to Combating Misinformation : Safwan Hossain, Andjela Mladenovic, Yiling Chen, Gauthier Gidel
High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise : Eduard Gorbunov, Abdurakhmon Sadiev, Marina Danilova, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik
Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs : Daniel D. Johnson, Danny Tarlow, David Duvenaud, Chris J. Maddison
Faithfulness Measurable Masked Language Models : Andreas Madsen, Siva Reddy, Sarath Chandar
Listenable Maps for Audio Classifiers : Francesco Paissan, Mirco Ravanelli, Cem Subakan
Lookbehind-SAM: k steps back, 1 step forward : Goncalo Mordido, Pranshu Malviya, Aristide Baratin, Sarath Chandar
Position: Application-Driven Innovation in Machine Learning : David Rolnick, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, Claire Monteleoni, Esther Rolf, Milind Tambe, Adam White
Code as Reward: Empowering Reinforcement Learning with VLMs : David Venuto, Mohammad Sami Nur Islam, Martin Klissarov, Doina Precup, Sherry Yang, Ankit Anand
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths : Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas
Don't be so negative! Score-based Generative Modeling with Oracle-assisted Guidance : Saeid Naderiparizi, Xiaoxuan Liang, Setareh Cohan, Berend Zwartsenberg, Frank Wood
Modeling Caption Diversity in Contrastive Vision-Language Pretraining : Samuel Lavoie, Polina Kirichenko, Mark Ibrahim, Mahmoud Assran, Andrew Gordon Wilson, Aaron Courville, Nicolas Ballas
Nash Learning from Human Feedback : Remi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J Mankowitz, Doina Precup, Bilal Piot
Do Transformer World Models Give Better Policy Gradients? : Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon
Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities : Golnoosh Farnadi, Mohammad Havaei, Negar Rostamzadeh
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities : Tara Akhound-Sadegh, Jarrid Rector-Brooks, Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong
Towards Modular LLMs by Building and Reusing a Library of LoRAs : Oleksiy Ostapenko, Zhan Su, Edoardo Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni
Unsupervised Concept Discovery Mitigates Spurious Correlations : Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi
Discovering environments with XRM : Mohammad Pezeshki, Diane Bouchacourt, Mark Ibrahim, Nicolas Ballas, Pascal Vincent, David Lopez-Paz
Beyond the Norms: Detecting Prediction Errors in Regression Models : Andres Altieri, Marco Romanelli, Georg Pichler, Florence Alberge, Pablo Piantanida
In value-based deep reinforcement learning, a pruned network is a good network : Johan Samir Obando Ceron, Aaron Courville, Pablo Samuel Castro
Learning to Reach Goals via Diffusion : Vineet Jain, Siamak Ravanbakhsh
Nearest Neighbour Score Estimators for Diffusion Generative Models : Matthew Niedoba, Dylan Green, Saeid Naderiparizi, Vasileios Lioutas, Jonathan Wilder Lavington, Xiaoxuan Liang, Yunpeng Liu, Ke Zhang, Setareh Dabiri, Adam Ścibior, Berend Zwartsenberg, Frank Wood
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization : Motahareh Sohrabi, Juan Ramirez, Tianyue H. Zhang, Simon Lacoste-Julien, Jose Gallego-Posada
Robust Data-driven Prescriptiveness Optimization : Mehran Poursoltani, Érick Delage, Angelos Georghiou
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL : Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taiga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal
A Distributional Analogue to the Successor Representation : Harley Wiltzer, Jesse Farebrother, Arthur Gretton, Yunhao Tang, Andre Barreto, Will Dabney, Marc G. Bellemare, Mark Rowland
CKGConv: General Graph Convolution with Continuous Kernels : Liheng Ma, Soumyasundar Pal, Yitian Zhang, Jiaming Zhou, Yingxue Zhang, Mark Coates
Stochastic positional embeddings improve masked image modeling : Amir Bar, Florian Bordes, Assaf Shocher, Mahmoud Assran, Pascal Vincent, Nicolas Ballas, Trevor Darrell, Amir Globerson, Yann LeCun
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis : Stefan Horoi, Albert Manuel Orozco Camacho, Eugene Belilovsky, Guy Wolf
Learning to Scale Logits for Temperature-Conditional GFlowNets : Minsu Kim, Joohwan Ko, Dinghuai Zhang, Ling Pan, Taeyoung Yun, Woo Chang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio
All-in-one simulation-based inference : Manuel Gloeckler, Michael Deistler, Christian Dietrich Weilbach, Frank Wood, Jakob H. Macke
Autoformalizing Euclidean Geometry : Logan Murphy, Kaiyu Yang, Jialiang Sun, Zhaoyu Li, Anima Anandkumar, Xujie Si
Graph Positional and Structural Encoder : Semih Cantürk, Renming Liu, Olivier Lapointe-Gagné, Vincent Létourneau, Guy Wolf, Dominique Beaini, Ladislav Rampášek
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features : Aleksandr Beznosikov, David Dobre, Gauthier Gidel
Mixtures of Experts Unlock Parameter Scaling for Deep RL : Johan Samir Obando Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Nicolaus Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro
SelfIE: Self-Interpretation of Large Language Model Embeddings : Haozhe Chen, Carl Vondrick, Chengzhi Mao
Learning to Play Atari in a World of Tokens : Pranav Agarwal, Sheldon Andrews, Samira Ebrahimi Kahou
Memory Efficient Neural Processes via Constant Memory Attention Block : Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed
Adaptive Accompaniment with ReaLchords : Yusong Wu, Tim Cooijmans, Kyle Kastner, Adam Roberts, Ian Simon, Alexander Scarlatos, Chris Donahue, Cassie Tarakajian, Shayegan Omidshafiei, Aaron Courville, Pablo Samuel Castro, Natasha Jaques, Cheng-Zhi Anna Huang
Improving Gradient-Guided Nested Sampling for Posterior Inference : Pablo Lemos, Will Handley, Nikolay Malkin, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur
A Tensor Decomposition Perspective on Second-order RNNs : Maude Lizaire, Michael Rizvi-Martel, Marawan Gamal, Guillaume Rabusseau
PcLast: Discovering Plannable Continuous Latent States : Anurag Koul, Shivakanth Sujit, Shaoru Chen, Ben Evans, Lili Wu, Byron Xu, Rajan Chari, Riashat Islam, Raihan Seraj, Yonathan Efroni, Lekan Molu, Miro Dudik, John Langford, Alex Lamb
A Computational Framework for Solving Wasserstein Lagrangian Flows : Kirill Neklyudov, Rob Brekelmans, Alexander Tong, Lazar Atanackovic, Qiang Liu, Alireza Makhzani
Estimating Unknown Population Sizes Using the Hypergeometric Distribution : Liam Hodgson, Danilo Bzdok
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue : Xing Han Lù, Zdeněk Kasner, Siva Reddy
Workshops
- Learning Generative Population Models From Multiple Clinical Datasets Via Probabilistic Programming : João Loula, Katherine M. Collins, Ulrich Schaechtle, Joshua B. Tenenbaum, Adrian Weller, Feras Saad, Timothy J. O’Donnell, Vikash Mansinghka
- The Butterfly Effect: Tiny Perturbations Cause Neural Network Training to Diverge : Gül Sena Altıntaş, Devin Kwok, David Rolnick
- Linear Weight Interpolation Leads to Transient Performance Gains : Gaurav Iyer, Gintare Karolina Dziugaite, David Rolnick
- Expressivity of Neural Networks with Fixed Weights and Learned Biases : Ezekiel Williams, Avery Hee-Woon Ryoo, Thomas Jiralerspong, Alexandre Payeur, Matthew Perich, Luca Mazzucato, Guillaume Lajoie
- Gradient Dissent in Language Model Training and Saturation : Andrei Mircea, Ekaterina Lobacheva, Irina Rish
- Gradient descent induces alignment between weights and the pre-activation tangents for deep non-linear networks : Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala
- AI-Assisted Generation of Difficult Math Questions : Vedant Shah, Dingli Yu, Kaifeng Lyu, Simon Park, Nan Rosemary Ke, Michael Mozer, Yoshua Bengio, Sanjeev Arora, Anirudh Goyal
- Exploring Scaling Trends in LLM Robustness : Nikolaus H. R. Howe, Michał Zając, Ian R. McKenzie, Oskar John Hollinsworth, Pierre-Luc Bacon, Adam Gleave
- Revisiting Successor Features for Inverse Reinforcement Learning : Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother, Irina Rish, Glen Berseth, Sanjiban Choudhury
- Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent : Karolis Jucys, George Adamopoulos, Mehrab Hamidi, Stephanie Milani, Mohammad Reza Samsami, Artem Zholus, Sonia Joseph, Blake Richards, Irina Rish, Ozgur Simsek
- Learning to Design Data-structures: A Case Study of Nearest Neighbor Search : Omar Salemohamed, Vatsal Sharan, Shivam Garg, Laurent Charlin, Gregory Valiant
- Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold : Lazar Atanackovic, Xi Zhang, Brandon Amos, Mathieu Blanchette, Leo J Lee, Yoshua Bengio, Alexander Tong, Kirill Neklyudov
- Fine-tuned network relies on generic representation to solve unseen cognitive task : Dongyan Lin
- Evaluating the transferability potential of deep learning models for climate downscaling : Ayush Prasad, Paula Harder, Qidong Yang, Prasanna Sattegeri, Daniela Szwarcman, Campbell Watson, David Rolnick
- On provable length and compositional generalization : Kartik Ahuja, Amin Mansouri
- Cell Morphology-Guided Small Molecule Generation with GFlowNets : Stephen Zhewen Lu, Ziqing Lu, Ehsan Hajiramezanali, Tommaso Biancalani, Yoshua Bengio, Gabriele Scalia, Michał Koziarski
- Changepoint Detection in Highly-Attributed Dynamic Graphs : Emiliano Penaloza, Nathaniel Stevens
- Controlling Large Language Model Agents with Entropic Activation Steering : Nate Rahn, Pierluca D'Oro, Marc G. Bellemare
- Improving Molecular Modeling with Geometric GNNs: an Empirical Study : Ali Ramlaoui, Théo Saulus, Basile Terver, Victor Schmidt, David Rolnick, Fragkiskos Malliaros, Alexandre Duval
- Doob's Lagrangian: A Sample-Efficient Variational Approach to Transition Path Sampling : Yuanqi Du, Michael Plainer, Rob Brekelmans, Chenru Duan, Frank Noe, Carla P Gomes, Alan Aspuru-Guzik, Kirill Neklyudov
- Randomized Confidence Bounds for Stochastic Partial Monitoring : Maxime Heuillet, Ola Ahmad, Audrey Durand
- Lost in Translation: The Algorithmic Gap Between LMs and the Brain : Tommaso Tosato, Pascal Junior Tikeng Notsawo, Saskia Helbling, Irina Rish, Guillaume Dumas
- Realtime Reinforcement Learning: Towards Rapid Asynchronous Deployment of Large Models : Gopeshh Subbaraj, Matthew Riemer, Glen Berseth, Irina Rish
- Performance Control in Early Exiting to Deploy Large Models at the Same Cost of Smaller Ones : Mehrnaz Mofakhami, Reza Bayat, Ioannis Mitliagkas, Joao Monteiro, Valentina Zantedeschi
- VFA: Vision Frequency Analysis of Foundation Models and Human : Mohammad-Javad Darvishi-Bayazi, Md Rifat Arefin, Jocelyn Faubert, Irina Rish
- Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques : Rishika Bhagwatkar, Shravan Nayak, Reza Bayat, Alexis Roger, Daniel Z Kaplan, Pouya Bashivan, Irina Rish
- Handling Delay in Reinforcement Learning Caused by Parallel Computations of Neurons : Ivan Anokhin, Rishav, Stephen Chung, Irina Rish, Samira Ebrahimi Kahou
- Crystal-GFN: sampling crystals with desirable properties and constraints : Mila Ai4Science, Alex Hernández-García, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Yasmine Benabed, Michał Koziarski, Victor Schmidt, Gian-Marco Rignanese, Pierre-Paul De Breuck, Paulette Clancy
- Multimodal foundation world models for generalist embodied agents : Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Aaron Courville, Sai Rajeswar
- QGFN: Controllable Greediness with Action Values : Elaine Lau, Stephen Zhewen Lu, Ling Pan, Doina Precup, Emmanuel Bengio
- MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning : Kerstin Kläser, Błażej Banaszewski, Samuel Maddrell-Mander, Callum McLean, Luis Müller, Ali Parviz, Shenyang Huang, Andrew Fitzgibbon
- iWISDM: Assessing instruction following in multimodal models at scale : Xiaoxuan Lei, Lucas Gomez, Hao Yuan Bai, Pouya Bashivan
- Many-Shot In-Context Learning : Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Stephanie Chan, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal M. P. Behbahani, Aleksandra Faust, Hugo Larochelle
- Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving : Aniket Rajiv Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Curtis Mozer, Sanjeev Arora