 
  From July 21 to July 27, 2024, Mila researchers will attend the Forty-first International Conference on Machine Learning (ICML 2024) in Vienna, Austria. This year, they will share 55 scientific papers at the main conference, showcasing their groundbreaking artificial intelligence (AI) research to peers from all around the world.
Here is a list of papers accepted at ICML 2024 with at least one Mila-affiliated author:
Main Conference
- Randomized Confidence Bounds for Stochastic Partial Monitoring : Maxime Heuillet, Ola Ahmad, Audrey Durand 
- Implicit meta-learning may lead language models to trust more reliable sources : Dmitrii Krasheninnikov, Egor Krasheninnikov, Bruno Mlodozeniec, Tegan Maharaj, David Krueger 
- EiG-Search: Generating Edge-Induced Subgraphs for GNN Explanation in Linear Time : Shengyao Lu, Bang Liu, Keith G Mills, Jiao He, Di Niu 
- Successor Features for Efficient Multi-Subject Controlled Text Generation : Meng Cao, Mehdi Fatemi, Jackie C. K. Cheung, Samira Shabanian 
- Interacting Diffusion Processes for Event Sequence Forecasting : Mai Zeng, Florence Regol, Mark Coates 
- Stealing part of a production language model : Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Eric Wallace, David Rolnick, Florian Tramèr 
- WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? : Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam Hadj Laradji, Manuel Del Verme, Tom Marty, Léo Boisvert, Megh Thakkar, Quentin Cappart, David Vazquez, Nicolas Chapados, Alexandre Lacoste 
- Position Paper: On the Societal Impact of Open Foundation Models : Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen K Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan 
- Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning : Jinsoo Yoo, Yunpeng Liu, Frank Wood, Geoff Pleiss 
- Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization : Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, Roi Livni, Daniel M. Roy
- A Persuasive Approach to Combating Misinformation : Safwan Hossain, Andjela Mladenovic, Yiling Chen, Gauthier Gidel 
- High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise : Eduard Gorbunov, Abdurakhmon Sadiev, Marina Danilova, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik 
- Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs : Daniel D. Johnson, Danny Tarlow, David Duvenaud, Chris J. Maddison 
- Faithfulness Measurable Masked Language Models : Andreas Madsen, Siva Reddy, Sarath Chandar 
- Listenable Maps for Audio Classifiers : Francesco Paissan, Mirco Ravanelli, Cem Subakan 
- Lookbehind-SAM: k steps back, 1 step forward : Goncalo Mordido, Pranshu Malviya, Aristide Baratin, Sarath Chandar 
- Position: Application-Driven Innovation in Machine Learning : David Rolnick, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, Claire Monteleoni, Esther Rolf, Milind Tambe, Adam White 
- Code as Reward: Empowering Reinforcement Learning with VLMs : David Venuto, Mohammad Sami Nur Islam, Martin Klissarov, Doina Precup, Sherry Yang, Ankit Anand 
- No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths : Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas
- Don't be so negative! Score-based Generative Modeling with Oracle-assisted Guidance : Saeid Naderiparizi, Xiaoxuan Liang, Setareh Cohan, Berend Zwartsenberg, Frank Wood 
- Modeling Caption Diversity in Contrastive Vision-Language Pretraining : Samuel Lavoie, Polina Kirichenko, Mark Ibrahim, Mahmoud Assran, Andrew Gordon Wilson, Aaron Courville, Nicolas Ballas 
- Nash Learning from Human Feedback : Remi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J Mankowitz, Doina Precup, Bilal Piot 
- Do Transformer World Models Give Better Policy Gradients? : Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon
- Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities : Golnoosh Farnadi, Mohammad Havaei, Negar Rostamzadeh 
- Iterated Denoising Energy Matching for Sampling from Boltzmann Densities : Tara Akhound-Sadegh, Jarrid Rector-Brooks, Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong 
- Towards Modular LLMs by Building and Reusing a Library of LoRAs : Oleksiy Ostapenko, Zhan Su, Edoardo Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni 
- Unsupervised Concept Discovery Mitigates Spurious Correlations : Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi 
- Discovering environments with XRM : Mohammad Pezeshki, Diane Bouchacourt, Mark Ibrahim, Nicolas Ballas, Pascal Vincent, David Lopez-Paz 
- Beyond the Norms: Detecting Prediction Errors in Regression Models : Andres Altieri, Marco Romanelli, Georg Pichler, Florence Alberge, Pablo Piantanida 
- In value-based deep reinforcement learning, a pruned network is a good network : Johan Samir Obando Ceron, Aaron Courville, Pablo Samuel Castro 
- Learning to Reach Goals via Diffusion : Vineet Jain, Siamak Ravanbakhsh 
- Nearest Neighbour Score Estimators for Diffusion Generative Models : Matthew Niedoba, Dylan Green, Saeid Naderiparizi, Vasileios Lioutas, Jonathan Wilder Lavington, Xiaoxuan Liang, Yunpeng Liu, Ke Zhang, Setareh Dabiri, Adam Ścibior, Berend Zwartsenberg, Frank Wood 
- On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization : Motahareh Sohrabi, Juan Ramirez, Tianyue H. Zhang, Simon Lacoste-Julien, Jose Gallego-Posada 
- Robust Data-driven Prescriptiveness Optimization : Mehran Poursoltani, Érick Delage, Angelos Georghiou 
- Stop Regressing: Training Value Functions via Classification for Scalable Deep RL : Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taiga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal 
- A Distributional Analogue to the Successor Representation : Harley Wiltzer, Jesse Farebrother, Arthur Gretton, Yunhao Tang, Andre Barreto, Will Dabney, Marc G. Bellemare, Mark Rowland 
- CKGConv: General Graph Convolution with Continuous Kernels : Liheng Ma, Soumyasundar Pal, Yitian Zhang, Jiaming Zhou, Yingxue Zhang, Mark Coates 
- Stochastic positional embeddings improve masked image modeling : Amir Bar, Florian Bordes, Assaf Shocher, Mahmoud Assran, Pascal Vincent, Nicolas Ballas, Trevor Darrell, Amir Globerson, Yann LeCun 
- Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis : Stefan Horoi, Albert Manuel Orozco Camacho, Eugene Belilovsky, Guy Wolf 
- Learning to Scale Logits for Temperature-Conditional GFlowNets : Minsu Kim, Joohwan Ko, Dinghuai Zhang, Ling Pan, Taeyoung Yun, Woo Chang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio 
- All-in-one simulation-based inference : Manuel Gloeckler, Michael Deistler, Christian Dietrich Weilbach, Frank Wood, Jakob H. Macke 
- Autoformalizing Euclidean Geometry : Logan Murphy, Kaiyu Yang, Jialiang Sun, Zhaoyu Li, Anima Anandkumar, Xujie Si 
- Graph Positional and Structural Encoder : Semih Cantürk, Renming Liu, Olivier Lapointe-Gagné, Vincent Létourneau, Guy Wolf, Dominique Beaini, Ladislav Rampášek
- Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features : Aleksandr Beznosikov, David Dobre, Gauthier Gidel 
- Mixtures of Experts Unlock Parameter Scaling for Deep RL : Johan Samir Obando Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Nicolaus Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro 
- SelfIE: Self-Interpretation of Large Language Model Embeddings : Haozhe Chen, Carl Vondrick, Chengzhi Mao 
- Learning to Play Atari in a World of Tokens : Pranav Agarwal, Sheldon Andrews, Samira Ebrahimi Kahou 
- Memory Efficient Neural Processes via Constant Memory Attention Block : Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed 
- Adaptive Accompaniment with ReaLchords : Yusong Wu, Tim Cooijmans, Kyle Kastner, Adam Roberts, Ian Simon, Alexander Scarlatos, Chris Donahue, Cassie Tarakajian, Shayegan Omidshafiei, Aaron Courville, Pablo Samuel Castro, Natasha Jaques, Cheng-Zhi Anna Huang 
- Improving Gradient-Guided Nested Sampling for Posterior Inference : Pablo Lemos, Will Handley, Nikolay Malkin, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur 
- A Tensor Decomposition Perspective on Second-order RNNs : Maude Lizaire, Michael Rizvi-Martel, Marawan Gamal, Guillaume Rabusseau 
- PcLast: Discovering Plannable Continuous Latent States : Anurag Koul, Shivakanth Sujit, Shaoru Chen, Ben Evans, Lili Wu, Byron Xu, Rajan Chari, Riashat Islam, Raihan Seraj, Yonathan Efroni, Lekan Molu, Miro Dudik, John Langford, Alex Lamb 
- A Computational Framework for Solving Wasserstein Lagrangian Flows : Kirill Neklyudov, Rob Brekelmans, Alexander Tong, Lazar Atanackovic, Qiang Liu, Alireza Makhzani
- Estimating Unknown Population Sizes Using the Hypergeometric Distribution : Liam Hodgson, Danilo Bzdok 
- WebLINX: Real-World Website Navigation with Multi-Turn Dialogue : Xing Han Lù, Zdeněk Kasner, Siva Reddy 
Workshops
- Learning Generative Population Models From Multiple Clinical Datasets Via Probabilistic Programming : João Loula, Katherine M. Collins, Ulrich Schaechtle, Joshua B. Tenenbaum, Adrian Weller, Feras Saad, Timothy J. O’Donnell, Vikash Mansinghka
- The Butterfly Effect: Tiny Perturbations Cause Neural Network Training to Diverge : Gül Sena Altıntaş, Devin Kwok, David Rolnick
- Linear Weight Interpolation Leads to Transient Performance Gains : Gaurav Iyer, Gintare Karolina Dziugaite, David Rolnick
- Expressivity of Neural Networks with Fixed Weights and Learned Biases : Ezekiel Williams, Avery Hee-Woon Ryoo, Thomas Jiralerspong, Alexandre Payeur, Matthew Perich, Luca Mazzucato, Guillaume Lajoie
- Gradient Dissent in Language Model Training and Saturation : Andrei Mircea, Ekaterina Lobacheva, Irina Rish
- Gradient descent induces alignment between weights and the pre-activation tangents for deep non-linear networks : Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala
- AI-Assisted Generation of Difficult Math Questions : Vedant Shah, Dingli Yu, Kaifeng Lyu, Simon Park, Nan Rosemary Ke, Michael Mozer, Yoshua Bengio, Sanjeev Arora, Anirudh Goyal
- Exploring Scaling Trends in LLM Robustness : Nikolaus H. R. Howe, Michał Zając, Ian R. McKenzie, Oskar John Hollinsworth, Pierre-Luc Bacon, Adam Gleave
- Revisiting Successor Features for Inverse Reinforcement Learning : Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother, Irina Rish, Glen Berseth, Sanjiban Choudhury
- Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent : Karolis Jucys, George Adamopoulos, Mehrab Hamidi, Stephanie Milani, Mohammad Reza Samsami, Artem Zholus, Sonia Joseph, Blake Richards, Irina Rish, Ozgur Simsek
- Learning to Design Data-structures: A Case Study of Nearest Neighbor Search : Omar Salemohamed, Vatsal Sharan, Shivam Garg, Laurent Charlin, Gregory Valiant
- Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold : Lazar Atanackovic, Xi Zhang, Brandon Amos, Mathieu Blanchette, Leo J Lee, Yoshua Bengio, Alexander Tong, Kirill Neklyudov
- Fine-tuned network relies on generic representation to solve unseen cognitive task : Dongyan Lin
- Evaluating the transferability potential of deep learning models for climate downscaling : Ayush Prasad, Paula Harder, Qidong Yang, Prasanna Sattegeri, Daniela Szwarcman, Campbell Watson, David Rolnick
- On provable length and compositional generalization : Kartik Ahuja, Amin Mansouri
- Cell Morphology-Guided Small Molecule Generation with GFlowNets : Stephen Zhewen Lu, Ziqing Lu, Ehsan Hajiramezanali, Tommaso Biancalani, Yoshua Bengio, Gabriele Scalia, Michał Koziarski
- Changepoint Detection in Highly-Attributed Dynamic Graphs : Emiliano Penaloza, Nathaniel Stevens
- Controlling Large Language Model Agents with Entropic Activation Steering : Nate Rahn, Pierluca D'Oro, Marc G. Bellemare
- Improving Molecular Modeling with Geometric GNNs: an Empirical Study : Ali Ramlaoui, Théo Saulus, Basile Terver, Victor Schmidt, David Rolnick, Fragkiskos Malliaros, Alexandre Duval
- Doob's Lagrangian: A Sample-Efficient Variational Approach to Transition Path Sampling : Yuanqi Du, Michael Plainer, Rob Brekelmans, Chenru Duan, Frank Noe, Carla P Gomes, Alan Aspuru-Guzik, Kirill Neklyudov
- Randomized Confidence Bounds for Stochastic Partial Monitoring : Maxime Heuillet, Ola Ahmad, Audrey Durand
- Lost in Translation: The Algorithmic Gap Between LMs and the Brain : Tommaso Tosato, Pascal Junior Tikeng Notsawo, Saskia Helbling, Irina Rish, Guillaume Dumas
- Realtime Reinforcement Learning: Towards Rapid Asynchronous Deployment of Large Models : Gopeshh Subbaraj, Matthew Riemer, Glen Berseth, Irina Rish
- Performance Control in Early Exiting to Deploy Large Models at the Same Cost of Smaller Ones : Mehrnaz Mofakhami, Reza Bayat, Ioannis Mitliagkas, Joao Monteiro, Valentina Zantedeschi
- VFA: Vision Frequency Analysis of Foundation Models and Human : Mohammad-Javad Darvishi-Bayazi, Md Rifat Arefin, Jocelyn Faubert, Irina Rish
- Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques : Rishika Bhagwatkar, Shravan Nayak, Reza Bayat, Alexis Roger, Daniel Z Kaplan, Pouya Bashivan, Irina Rish
- Handling Delay in Reinforcement Learning Caused by Parallel Computations of Neurons : Ivan Anokhin, Rishav, Stephen Chung, Irina Rish, Samira Ebrahimi Kahou
- Crystal-GFN: sampling crystals with desirable properties and constraints : Mila Ai4Science, Alex Hernández-García, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Yasmine Benabed, Michał Koziarski, Victor Schmidt, Gian-Marco Rignanese, Pierre-Paul De Breuck, Paulette Clancy
- Multimodal foundation world models for generalist embodied agents : Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Aaron Courville, Sai Rajeswar
- QGFN: Controllable Greediness with Action Values : Elaine Lau, Stephen Zhewen Lu, Ling Pan, Doina Precup, Emmanuel Bengio
- MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning : Kerstin Kläser, Błażej Banaszewski, Samuel Maddrell-Mander, Callum McLean, Luis Müller, Ali Parviz, Shenyang Huang, Andrew Fitzgibbon
- iWISDM: Assessing instruction following in multimodal models at scale : Xiaoxuan Lei, Lucas Gomez, Hao Yuan Bai, Pouya Bashivan
- Many-Shot In-Context Learning : Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Stephanie Chan, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal M. P. Behbahani, Aleksandra Faust, Hugo Larochelle
- Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving : Aniket Rajiv Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Curtis Mozer, Sanjeev Arora
 
 
 
 
 
 
