
From April 24 to April 28, 2025, dozens of Mila researchers will attend the Thirteenth International Conference on Learning Representations (ICLR 2025) in Singapore. This year, they will share 87 scientific papers at the main conference and dozens of papers at workshops, showcasing their groundbreaking artificial intelligence (AI) research to peers from all around the world.
Here is a list of papers accepted at ICLR 2025 that have at least one Mila-affiliated author.
Main Conference
Paper | Authors | Link |
Pitfalls of Evidence-Based AI Policy | Stephen Casper, David Krueger, Dylan Hadfield-Menell | https://openreview.net/pdf?id=8nyIAanfST |
Advantage Alignment Algorithms | Juan Agustin Duque, Milad Aghajohari, Tim Cooijmans, Razvan Ciuca, Tianyu Zhang, Gauthier Gidel, Aaron Courville | https://openreview.net/pdf?id=QFO1asgas2 |
Solving Hidden Monotone Variational Inequalities with Surrogate Losses | Ryan D'Orazio, Danilo Vucetic, Zichu Liu, Junhyung Lyle Kim, Ioannis Mitliagkas, Gauthier Gidel | https://openreview.net/pdf?id=4ZX2a3OKEV |
Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery | Amin Abyaneh, Mahrokh Ghoddousi Boroujeni, Hsiu-Chin Lin, Giancarlo Ferrari-Trecate | https://openreview.net/pdf?id=lILEtkWOXD |
AdaFisher: Adaptive Second Order Optimization via Fisher Information | Damien Martins Gomes, Yanlei Zhang, Eugene Belilovsky, Guy Wolf, Mahdi S. Hosseini | https://openreview.net/pdf?id=puTxuiK2qO |
Expressivity of Neural Networks with Random Weights and Learned Biases | Ezekiel Williams, Alexandre Payeur, Avery Hee-Woon Ryoo, Thomas Jiralerspong, Matthew G Perich, Luca Mazzucato, Guillaume Lajoie | https://openreview.net/pdf?id=5xwx1Myosu |
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning | Haque Ishfaq, Guangyuan Wang, Mohammad Sami Nur Islam, Doina Precup | https://openreview.net/pdf?id=FvQsk3la17 |
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL | Ghada Sokar, Johan Samir Obando Ceron, Aaron Courville, Hugo Larochelle, Pablo Castro | https://openreview.net/pdf?id=8oCrlOaYcc |
Boosting Latent Diffusion with Perceptual Objectives | Tariq Berrada, Pietro Astolfi, Melissa Hall, Marton Havasi, Yohann Benchetrit, Adriana Romero, Karteek Alahari, Michal Drozdzal, Jakob Verbeek | https://openreview.net/pdf?id=y4DtzADzd1 |
The Pitfalls of Memorization: When Memorization Hurts Generalization | Reza Bayat, Mohammad Pezeshki, Elvis Dohmatob, David Lopez-Paz, Pascal Vincent | https://openreview.net/pdf?id=vVhZh9ZpIM |
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces | DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, Yuandong Tian, Qinqing Zheng | https://openreview.net/pdf?id=bmbRCRiNDu |
Training Language Models to Self-Correct via Reinforcement Learning | Aviral Kumar, Vincent Zhuang, Rishabh Agarwal, Yi Su, John D Co-Reyes, Avi Singh, Kate Baumli, Shariq Iqbal, Colton Bishop, Rebecca Roelofs, Lei M Zhang, Kay McKinney, Disha Shrivastava, Cosmin Paduraru, George Tucker, Doina Precup, Feryal Behbahani, Aleksandra Faust | https://openreview.net/pdf?id=CjwERcAU7w |
PETRA: Parallel End-to-end Training with Reversible Architectures | Stephane Rivaud, Louis Fournier, Thomas Pumir, Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon | https://openreview.net/pdf?id=0fhzSFsGUT |
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws | Tian Jin, Ahmed Imtiaz Humayun, Utku Evci, Suvinay Subramanian, Amir Yazdanbakhsh, Dan Alistarh, Gintare Karolina Dziugaite | https://openreview.net/pdf?id=ud8FtE1N4N |
SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models | Daniel Levy, Siba Smarak Panigrahi, Sékou-Oumar Kaba, Qiang Zhu, Kin Long Kelvin Lee, Mikhail Galkin, Santiago Miret, Siamak Ravanbakhsh | https://openreview.net/pdf?id=V7x2KZQn2v |
Input Space Mode Connectivity in Deep Neural Networks | Jakub Vrabel, Ori Shem-Ur, Yaron Oz, David Krueger | https://openreview.net/pdf?id=3qeOy7HwUT |
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation | Lu Li, Tianyu Zhang, Zhiqi Bu, Suyuchen Wang, Huan He, Jie Fu, Yonghui Wu, Jiang Bian, Yong Chen, Yoshua Bengio | https://openreview.net/pdf?id=1v7SRWsYve |
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold | Lazar Atanackovic, Xi Zhang, Brandon Amos, Mathieu Blanchette, Leo J Lee, Yoshua Bengio, Alexander Tong, Kirill Neklyudov | https://openreview.net/pdf?id=9SYczU3Qgm |
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models | Seanie Lee, Haebin Seong, Dong Bok Lee, Minki Kang, Xiaoyin Chen, Dominik Wagner, Yoshua Bengio, Juho Lee, Sung Ju Hwang | https://openreview.net/pdf?id=y3zswp3gek |
Action abstractions for amortized sampling | Oussama Boussif, Lena Nehale Ezzine, Joseph D Viviano, Michał Koziarski, Moksh J. Jain, Nikolay Malkin, Emmanuel Bengio, Rim Assouel, Yoshua Bengio | https://openreview.net/pdf?id=ispjankYab |
Towards Interpreting Visual Information Processing in Vision-Language Models | Clement Neo, Luke Ong, Philip Torr, Mor Geva, David Krueger, Fazl Barez | https://openreview.net/pdf?id=chanJGoa7f |
Neuroplastic Expansion in Deep Reinforcement Learning | Jiashun Liu, Johan Samir Obando Ceron, Aaron Courville, Ling Pan | https://openreview.net/pdf?id=20qZK2T7fa |
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models | Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron Courville | https://openreview.net/pdf?id=FhTAG591Ve |
Multi-agent cooperation through learning-aware policy gradients | Alexander Meulemans, Seijin Kobayashi, Johannes Von Oswald, Nino Scherrer, Eric Elmoznino, Blake Aaron Richards, Guillaume Lajoie, Blaise Aguera y Arcas, João Sacramento | https://openreview.net/pdf?id=GkWA6NjePN |
Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning | Md Rifat Arefin, Gopeshh Subbaraj, Nicolas Gontier, Yann LeCun, Irina Rish, Ravid Shwartz-Ziv, Christopher Pal | https://openreview.net/pdf?id=30oIfmrcFO |
Accelerating Training with Neuron Interaction and Nowcasting Networks | Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie, Eugene Belilovsky, Simon Lacoste-Julien | https://openreview.net/pdf?id=cUFIil6hEG |
MaestroMotif: Skill Design from Artificial Intelligence Feedback | Martin Klissarov, Mikael Henaff, Roberta Raileanu, Shagun Sodhani, Pascal Vincent, Amy Zhang, Pierre-Luc Bacon, Doina Precup, Marlos C. Machado, Pierluca D'Oro | https://openreview.net/pdf?id=or8mMhmyRV |
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching | Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother, Irina Rish, Glen Berseth, Sanjiban Choudhury | https://openreview.net/pdf?id=LvRQgsvd5V |
Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference | Matthew Riemer, Gopeshh Subbaraj, Glen Berseth, Irina Rish | https://openreview.net/pdf?id=fXb9BbuyAD |
Towards General-Purpose Model-Free Reinforcement Learning | Scott Fujimoto, Pierluca D'Oro, Amy Zhang, Yuandong Tian, Michael Rabbat | https://openreview.net/pdf?id=R1hIXdST22 |
Structure Language Models for Protein Conformation Generation | Jiarui Lu, Xiaoyin Chen, Stephen Zhewen Lu, Chence Shi, Hongyu Guo, Yoshua Bengio, Jian Tang | https://openreview.net/pdf?id=15AkNhFX1R |
Adaptive teachers for amortized samplers | Minsu Kim, Sanghyeok Choi, Taeyoung Yun, Emmanuel Bengio, Leo Feng, Jarrid Rector-Brooks, Sungsoo Ahn, Jinkyoo Park, Nikolay Malkin, Yoshua Bengio | https://openreview.net/pdf?id=BdmVgLMvaf |
ParetoFlow: Guided Flows in Multi-Objective Optimization | Ye Yuan, Can Chen, Christopher Pal, Xue Liu | https://openreview.net/pdf?id=mLyyB4le5u |
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets | Zhen Liu, Tim Z. Xiao, Weiyang Liu, Yoshua Bengio, Dinghuai Zhang | https://openreview.net/pdf?id=Aye5wL6TCn |
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation | Pablo Lemos, Sammy Nasser Sharief, Nikolay Malkin, Salma Salhi, Connor Stone, Laurence Perreault-Levasseur, Yashar Hezaveh | https://openreview.net/pdf?id=n7qGCmluZr |
Fully-inductive Node Classification on Arbitrary Graphs | Jianan Zhao, Zhaocheng Zhu, Mikhail Galkin, Hesham Mostafa, Michael M. Bronstein, Jian Tang | https://openreview.net/pdf?id=1Qpt43cqhg |
Learning diverse attacks on large language models for robust red-teaming and safety tuning | Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh J. Jain | https://openreview.net/pdf?id=1mXufFuv95 |
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation | Gaurav Sahu, Abhay Puri, Juan A. Rodriguez, Amirhossein Abaskohi, Mohammad Chegini, Alexandre Drouin, Perouz Taslakian, Valentina Zantedeschi, Alexandre Lacoste, David Vazquez, Nicolas Chapados, Christopher Pal, Sai Rajeswar, Issam Hadj Laradji | https://openreview.net/pdf?id=ZGqd0cbBvm |
MatExpert: Decomposing Materials Discovery By Mimicking Human Experts | Qianggang Ding, Santiago Miret, Bang Liu | https://openreview.net/pdf?id=AUBvo4sxVL |
On the Modeling Capabilities of Large Language Models for Sequential Decision Making | Martin Klissarov, R Devon Hjelm, Alexander T Toshev, Bogdan Mazoure | https://openreview.net/pdf?id=vodsIF3o7N |
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo | João Loula, Benjamin LeBrun, Li Du, Ben Lipkin, Clemente Pasti, Gabriel Grand, Tianyu Liu, Yahya Emara, Marjorie Freedman, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Alexander K. Lew, Tim Vieira, Timothy J. O'Donnell | https://openreview.net/pdf?id=xoXn62FzD0 |
Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning | Zenan Li, Zhaoyu Li, Wen Tang, Xian Zhang, Yuan Yao, Xujie Si, Fan Yang, Kaiyu Yang, Xiaoxing Ma | https://openreview.net/pdf?id=FiyS0ecSm0 |
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks | Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte Suresh, François Savard, Ahmed Masry, Shravan Nayak, Rabiul Awal, Mahsa Massoud, Amirhossein Abaskohi, Zichao Li, Suyuchen Wang, Pierre-Andre Noel, Mats Leon Richter, Saverio Vadacchino, Shubham Agarwal, Sanket Biswas, Sara Shanian, Ying Zhang, Sathwik Tejaswi Madhusudhan, Joao Monteiro, Krishnamurthy Dj Dvijotham, Torsten Scholak, Nicolas Chapados, Sepideh Kharaghani, Sean Hughes, M. Özsu, Siva Reddy, Marco Pedersoli, Yoshua Bengio, Christopher Pal, Issam Hadj Laradji, Spandana Gella, Perouz Taslakian, David Vazquez, Sai Rajeswar | https://openreview.net/pdf?id=b1ivBPLb1n |
Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning | Samuel Garcin, Trevor McInroe, Pablo Castro, Christopher G. Lucas, David Abel, Prakash Panangaden, Stefano V Albrecht | https://openreview.net/pdf?id=tErHYBGlWc |
What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models | Ahmed Imtiaz Humayun, Ibtihel Amara, Cristina Nader Vasconcelos, Deepak Ramachandran, Candice Schumann, Junfeng He, Katherine A Heller, Golnoosh Farnadi, Negar Rostamzadeh, Mohammad Havaei | https://openreview.net/pdf?id=etif9j1CnG |
On the Transfer of Object-Centric Representation Learning | Aniket Rajiv Didolkar, Andrii Zadaianchuk, Anirudh Goyal, Michael Curtis Mozer, Yoshua Bengio, Georg Martius, Maximilian Seitzer | https://openreview.net/pdf?id=bSq0XGS3kW |
Forgetting Transformer: Softmax Attention with a Forget Gate | Zhixuan Lin, Evgenii Nikishin, Xu He, Aaron Courville | https://openreview.net/pdf?id=q2Lnyegkr8 |
Influence Functions for Scalable Data Attribution in Diffusion Models | Bruno Mlodozeniec, Runa Eschenhagen, Juhan Bae, Alexander Immer, David Krueger, Richard E. Turner | https://openreview.net/pdf?id=esYrEndGsr |
GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning | Minghao Xu, Yunteng Geng, Yihang Zhang, Ling Yang, Jian Tang, Wentao Zhang | https://openreview.net/pdf?id=owEQ0FTfVj |
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection | Yun Zhu, Jia-Chen Gu, Caitlin Sikora, Ho Ko, Yinxiao Liu, Chu-Cheng Lin, Lei Shu, Liangchen Luo, Lei Meng, Bang Liu, Jindong Chen | https://openreview.net/pdf?id=HE6pJoNnFp |
Towards Improving Exploration through Sibling Augmented GFlowNets | Kanika Madan, Alex Lamb, Emmanuel Bengio, Glen Berseth, Yoshua Bengio | https://openreview.net/pdf?id=HH4KWP8RP5 |
Protecting against simultaneous data poisoning attacks | Neel Alex, Shoaib Ahmed Siddiqui, Amartya Sanyal, David Krueger | https://openreview.net/pdf?id=rK0YJwL69S |
AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly | Hongyu Guo, Yoshua Bengio, Shengchao Liu | https://openreview.net/pdf?id=jckKNzYYA6 |
VCR: Pixel-Level Complex Reasoning by Restoring Occluded Text | Tianyu Zhang, Suyuchen Wang, Lu Li, Ge Zhang, Perouz Taslakian, Sai Rajeswar, Jie Fu, Bang Liu, Yoshua Bengio | https://openreview.net/pdf?id=s0Z4csHOoE |
CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling | Matthew Fortier, Mats Leon Richter, Oliver Sonnentag, Christopher Pal | https://openreview.net/pdf?id=l8zRnvD95l |
Handling Delay in Real-Time Reinforcement Learning | Ivan Anokhin, Rishav, Matthew Riemer, Stephen Chung, Irina Rish, Samira Ebrahimi Kahou | https://openreview.net/pdf?id=YOc5t8PHf2 |
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality | Ge Ya Luo, Gian Mario Favero, Zhi Hao Luo, Alexia Jolicoeur-Martineau, Christopher Pal | https://openreview.net/pdf?id=cC3LxGZasH |
Multi-session, multi-task neural decoding from distinct cell-types and brain regions | Mehdi Azabou, Krystal Xuejing Pan, Vinam Arora, Ian Jarratt Knight, Eva L Dyer, Blake Aaron Richards | https://openreview.net/pdf?id=IuU0wcO0mo |
Credit-based self organizing maps: training deep topographic networks with minimal performance degradation | Amir Ozhan Dehghani, Xinyu Qian, Asa Farahani, Pouya Bashivan | https://openreview.net/pdf?id=wMgr7wBuUo |
Accelerating neural network training: An analysis of the AlgoPerf competition | Priya Kasimbeg, Frank Schneider, Runa Eschenhagen, Juhan Bae, Chandramouli Shama Sastry, Mark Saroufim, Boyuan Feng, Less Wright, Edward Z. Yang, Zachary Nado, Sourabh Medapati, Philipp Hennig, Michael Rabbat, George E. Dahl | https://openreview.net/pdf?id=CtM5xjRSfm |
Interpreting Emergent Planning in Model-Free Reinforcement Learning | Thomas Bush, Stephen Chung, Usman Anwar, Adrià Garriga-Alonso, David Krueger | https://openreview.net/pdf?id=DzGe40glxs |
A Generalist Hanabi Agent | Arjun V Sudhakar, Hadi Nekoei, Mathieu Reymond, Miao Liu, Janarthanan Rajendran, Sarath Chandar | https://openreview.net/pdf?id=pCj2sLNoJq |
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study | Shawn Tan, Songlin Yang, Aaron Courville, Rameswar Panda, Yikang Shen | https://openreview.net/pdf?id=r8J3DSD5kF |
Selective Unlearning via Representation Erasure Using Domain Adversarial Training | Nazanin Mohammadi Sepahvand, Eleni Triantafillou, Hugo Larochelle, Doina Precup, James J. Clark, Daniel M. Roy, Gintare Karolina Dziugaite | https://openreview.net/pdf?id=KzSGJy1PIf |
OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning | Xiaoqiang Wang, Bang Liu | https://openreview.net/pdf?id=VuTrZzrPfn |
AFlow: Automating Agentic Workflow Generation | Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu, Fengwei Teng, Xiong-Hui Chen, Jiaqi Chen, Mingchen Zhuge, Xin Cheng, Sirui Hong, Jinlin Wang, Bingnan Zheng, Bang Liu, Yuyu Luo, Chenglin Wu | https://openreview.net/pdf?id=z5uVAKwmjf |
MMTEB: Massive Multilingual Text Embedding Benchmark | Kenneth Enevoldsen, Isaac Chung, Imene Kerboua, Márton Kardos, Ashwin Mathur, David Stap, Jay Gala, Wissam Siblini, Dominik Krzemiński, Genta Indra Winata, Saba Sturua, Saiteja Utpala, Mathieu Ciancone, Marion Schaeffer, Diganta Misra, Shreeya Dhakal, Jonathan Rystrøm, Roman Solomatin, Ömer Veysel Çağatan, Akash Kundu, Martin Bernstorff, Shitao Xiao, Akshita Sukhlecha, Bhavish Pahwa, Rafał Poświata, Kranthi Kiran GV, Shawon Ashraf, Daniel Auras, Björn Plüster, Jan Philipp Harries, Loïc Magne, Isabelle Mohr, Dawei Zhu, Hippolyte Gisserot-Boukhlef, Tom Aarsen, Jan Kostkan, Konrad Wojtasik, Taemin Lee, Marek Suppa, Crystina Zhang, Roberta Rocca, Mohammed Hamdy, Andrianos Michail, John Yang, Manuel Faysse, Aleksei Vatolin, Nandan Thakur, Manan Dey, Dipam Vasani, Pranjal A Chitale, Simone Tedeschi, Nguyen Tai, Artem Snegirev, Mariya Hendriksen, Michael Günther, Mengzhou Xia, Weijia Shi, Xing Han Lu, Jordan Clive, Gayatri K, Maksimova Anna, Silvan Wehrli, Maria Tikhonova, Henil Shalin Panchal, Aleksandr Abramov, Malte Ostendorff, Zheng Liu, Simon Clematide, Lester James Validad Miranda, Alena Fenogenova, Guangyu Song, Ruqiya Bin Safi, Wen-Ding Li, Alessia Borghini, Federico Cassano, Lasse Hansen, Sara Hooker, Chenghao Xiao, Vaibhav Adlakha, Orion Weller, Siva Reddy, Niklas Muennighoff | https://openreview.net/pdf?id=zl3pfz4VCV |
Safety Representations for Safer Policy Learning | Kaustubh Mani, Vincent Mai, Charlie Gauthier, Annie S Chen, Samer B. Nashed, Liam Paull | https://openreview.net/pdf?id=gJG4IPwg6l |
Mastering Task Arithmetic: τJp as a Key Indicator for Weight Disentanglement | Kotaro Yoshida, Yuji Naraki, Takafumi Horie, Ryosuke Yamaki, Ryotaro Shimizu, Yuki Saito, Julian McAuley, Hiroki Naganuma | https://openreview.net/pdf?id=1VwWi6zbxs |
3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery | Xiuyuan Hu, Guoqing Liu, Can Chen, Yang Zhao, Hao Zhang, Xue Liu | https://openreview.net/pdf?id=RgE1qiO2ek |
An Auditing Test to Detect Behavioral Shift in Language Models | Leo Richter, Xuanli He, Pasquale Minervini, Matt J. Kusner | |
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models | Andrea Tirinzoni, Ahmed Touati, Jesse Farebrother, Mateusz Guzek, Anssi Kanervisto, Yingchen Xu, Alessandro Lazaric, Matteo Pirotta | |
AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements | Adriana Eufrosina Bora, Pierre-Luc St-Charles, Mirko Bronzi, Arsene Fansi Tchango, Bruno Rousseau, Kerrie Mengersen | |
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction | Jarrid Rector-Brooks, Mohsin Hasan, Zhangzhi Peng, Zachary Quinn, Chenghao Liu, Sarthak Mittal, Nouha Dziri, Michael Bronstein, Yoshua Bengio, Pranam Chatterjee, Alexander Tong, Avishek Joey Bose | https://openreview.net/pdf?id=Ombm8S40zN |
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL | Claas A Voelcker, Marcel Hussing, Eric Eaton, Amir-massoud Farahmand, Igor Gilitschenski | |
A Truncated Newton Method for Optimal Transport | Mete Kemertas, Amir-massoud Farahmand, Allan Douglas Jepson | |
MuPT: A Generative Symbolic Music Pretrained Transformer | Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xeron Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, Wenhao Huang, Jie Fu, Ge Zhang | |
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale | Ayush Kaushal, Tejas Vaidhya, Arnab Kumar Mondal, Tejas Pandey, Aaryan Bhagat, Irina Rish | |
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge | Angelika Romanou, Negar Foroutan, Anna Sotnikova, Sree Harsha Nelaturu, Shivalika Singh, Rishabh Maheshwary, Micol Altomare, Zeming Chen, Mohamed A. Haggag, Snegha A, Alfonso Amayuelas, Azril Hafizi Amirudin, Danylo Boiko, Michael Chang, Jenny Chim, Gal Cohen, Aditya Kumar Dalmia, Abraham Diress, Sharad Duwal, Daniil Dzenhaliou, Daniel Fernando Erazo Florez, Fabian Farestam, Joseph Marvin Imperial, Shayekh Bin Islam, Perttu Isotalo, Maral Jabbarishiviari, Börje F. Karlsson, Eldar Khalilov, Christopher Klamm, Fajri Koto, Dominik Krzemiński, Gabriel Adriano de Melo, Syrielle Montariol, Yiyang Nan, Joel Niklaus, Jekaterina Novikova, Johan Samir Obando Ceron, Debjit Paul, Esther Ploeger, Jebish Purbey, Swati Rajwal, Selvan Sunitha Ravi, Sara Rydell, Roshan Santhosh, Drishti Sharma, Marjana Prifti Skenduli, Arshia Soltani Moakhar, Bardia Soltani Moakhar, Ayush Kumar Tarun, Azmine Toushik Wasi, Thenuka Ovin Weerasinghe, Serhan Yilmaz, Mike Zhang, Imanol Schlag, Marzieh Fadaee, Sara Hooker, Antoine Bosselut | |
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References | Qiyuan Zhang, Yufei Wang, Tiezheng Yu, Yuxin Jiang, Chuhan Wu, Liangyou Li, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma | |
Let Your Features Tell The Differences: Understanding Graph Convolution By Feature Splitting | Yilun Zheng, Xiang Li, Sitao Luan, Xiaojiang Peng, Lihui Chen | https://openreview.net/pdf?id=I9omfcWfMp |
ZETA: Leveraging Z-order Curves for Efficient Top-K Attention | Qiuhuao Zeng, Jerry Huang, Peng Lu, Gezheng Xu, Boxing Chen, Charles Ling, Boyu Wang | |
Improving Equivariant Networks with Probabilistic Symmetry Breaking | Hannah Lawrence, Vasco Portilheiro, Yan Zhang, Sékou-Oumar Kaba | |
Bridging the Data Provenance Gap Across Text, Speech, and Video | Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Naana Obeng-Marnu, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N. Lee, Campbell S. Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester James Validad Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara | |
Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen | Alessandro Palma, Till Richter, Hanyi Zhang, Manuel Lubetzki, Alexander Tong, Andrea Dittadi, Fabian J Theis | |
The Superposition of Diffusion Models Using the Itô Density Estimator | Marta Skreta, Lazar Atanackovic, Joey Bose, Alexander Tong, Kirill Neklyudov | |
Efficient Evolutionary Search Over Chemical Space with Large Language Models | Haorui Wang, Marta Skreta, Cher-Tian Ser, Wenhao Gao, Lingkai Kong, Felix Strieth-Kalthoff, Chenru Duan, Yuchen Zhuang, Yue Yu, Yanqiao Zhu, Yuanqi Du, Alán Aspuru-Guzik, Kirill Neklyudov, Chao Zhang | |
Workshops
Paper | Authors | Link |
Performative Prediction on Games and Mechanism Design | António Góis, Mehrnaz Mofakhami, Fernando P. Santos, Simon Lacoste-Julien, Gauthier Gidel | |
Preference Optimization for Concept Bottleneck Models | Emiliano Penaloza, Tianyue H. Zhang, Laurent Charlin, Mateo Espinosa Zarlenga | https://openreview.net/pdf?id=Bz92EvEeD1 |
Design Editing for Offline Model-based Optimization | Ye Yuan, Youyuan Zhang, Can Chen, Haolun Wu, Melody Zixuan Li, Jianmo Li, James J. Clark, Xue Liu | |
Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles | Luca Scimeca, Alexander Rubinstein, Damien Teney, Seong Joon Oh, Yoshua Bengio | https://openreview.net/pdf?id=fF1KXgAhKN |
Solving Bayesian inverse problems with diffusion priors and off-policy RL | Luca Scimeca, Siddarth Venkatraman, Moksh Jain, Minsu Kim, Marcin Sendera, Mohsin Hasan, Alexandre Adam, Yashar Hezaveh, Laurence Perreault-Levasseur, Yoshua Bengio, Glen Berseth, Nikolay Malkin | |
Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models | Siddarth Venkatraman, Mohsin Hasan, Minsu Kim, Luca Scimeca, Marcin Sendera, Yoshua Bengio, Glen Berseth, Nikolay Malkin | |
Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control | Thomas Jiralerspong, Berton Earnshaw, Jason Hartford, Yoshua Bengio, Luca Scimeca | |
Societal Alignment Frameworks Can Improve LLM Alignment | Karolina Stańczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, Jeremy Barnes, Jason Stanley, Jessica Montgomery, Richard Zemel, Nicolas Papernot, Nicolas Chapados, Denis Therien, Timothy P. Lillicrap, Ana Marasović, Sylvie Delacroix, Gillian K. Hadfield, Siva Reddy | |
AffinityFlow: Guided Flows for Antibody Affinity Maturation | Can Chen, Karla-Luise Herpoldt, Chenchao Zhao, Zichen Wang, Marcus Collins, Shang Shang, Ron Benson | https://arxiv.org/pdf/2503.00069 |
Temporal Difference Flows | Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni, Remi Munos, Alessandro Lazaric, Ahmed Touati | |
CrystalGym: A New Benchmark for Materials Discovery Using Reinforcement Learning | Prashant Govindarajan, Mathieu Reymond, Antoine Clavaud, Mariano Phielipp, Santiago Miret, Sarath Chandar | |
Curly Flow Matching for Learning Non-Gradient Field Dynamics | Katarina Petrovic, Lazar Atanackovic, Kacper Kapusniak, Michael Bronstein, Avishek Joey Bose, Alexander Tong | |
Scalable Equilibrium Sampling with Sequential Boltzmann Generators | Charlie Tan, Avishek Joey Bose, Chen Lin, Leon Klein, Michael Bronstein, Alexander Tong | |
Timing is important: Risk-aware Fund Allocation based on Time-Series Forecasting | Fuyuan Lyu, Linfeng Du, Yunpeng Weng, Qiufang Ying, Zhiyan Xu, wenzou, Haolun Wu, Xiuqiang He, Xing Tang | |
Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts | Samin Yeasar Arnob, Zhan Su, Minseon Kim, Oleksiy Ostapenko, Doina Precup, Lucas Caccia, Alessandro Sordoni | |
A Joint Space-Time Encoder for Geographic Time-Series Data | David Mickisch, Konstantin Klemmer, Mélisande Teng, David Rolnick | |
Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery | Pratinav Seth, Michelle Lin, Brefo Dwamena Yaw, Jade Boutot, Mary Kang, David Rolnick | |
Assessing SAM for tree crown instance segmentation from drone imagery | Mélisande Teng, Arthur Ouaknine, Etienne Laliberté, Yoshua Bengio, David Rolnick, Hugo Larochelle | |
Physics-based data-driven model for CO2 gas diffusion electrodes to drive automated laboratories | Ivan Grega, Félix Therrien, Abhishek Soni, Karry Ocean, Kevan Dettelbach, Ribwar Ahmadi, Mehrdad Mokhtari, Curtis P. Berlinguette, Yoshua Bengio | https://openreview.net/pdf?id=B73xYizsLV |
On the Role of Prompt Multiplicity in LLM Hallucination Evaluation | | |
Hyper-Align: Efficient Modality Alignment via Hypernetworks | Jaisidh Singh, Diganta Misra, Boris Knyazev, Antonio Orvieto | |
Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts | Marta Skreta, Tara Akhound-Sadegh, Viktor Ohanesian, Roberto Bondesan, Alan Aspuru-Guzik, Arnaud Doucet, Rob Brekelmans, Alexander Tong, Kirill Neklyudov | |
Path Planning for Masked Diffusion Models with Applications to Biological Sequence Generation | Fred Zhangzhi Peng, Zachary Bezemek, Sawan Patel, Jarrid Rector-Brooks, Sherwood Yao, Alexander Tong, Pranam Chatterjee | |
SOAPI: Siamese-guided generation of Off-Target-Avoiding Protein Interactions | Sophia Vincoff, Oscar Davis, Alexander Tong, Joey Bose, Pranam Chatterjee | |
Gumbel-Softmax Score and Flow Matching for Discrete Biological Sequence Generation | Sophia Tang, Yinuo Zhang, Alexander Tong, Pranam Chatterjee | |
Simulation-Free Structure Learning For Stochastic Dynamics | Adam Stecklov, Noah El Rimawi-Fine, Lucas Nelson, Stephen Y. Zhang, Lazar Atanackovic, Alexander Tong, Mathieu Blanchette | |
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding | Ahmed Masry, Juan A. Rodriguez, Tianyu Zhang, Suyuchen Wang, Chao Wang, Aarash Feizi, Akshay Kalkunte Suresh, Abhay Puri, Xiangru Jian, Pierre-André Noël, Sathwik Tejaswi Madhusudhan, Marco Pedersoli, Bang Liu, Nicolas Chapados, Yoshua Bengio, Enamul Hoque, Christopher Pal, Issam H. Laradji, David Vazquez, Perouz Taslakian, Spandana Gella, Sai Rajeswar | |
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation | Rabiul Awal, Mahsa Massoud, Zichao Li, Aarash Feizi, Suyuchen Wang, Christopher Pal, Aishwarya Agrawal, David Vazquez, Siva Reddy, Juan A. Rodriguez, Perouz Taslakian, Spandana Gella, Sai Rajeswar | |
ASYNC-TB: Scaling Off-Policy Exploration for LLM Reinforcement Learning | Brian R. Bartoldson, Siddarth Venkatraman, James Diffenderfer, Moksh Jain, Tal Ben-Nun, Seanie Lee, Minsu Kim, Johan Obando-Ceron, Yoshua Bengio, Bhavya Kailkhura | https://openreview.net/pdf?id=iSyxl2dKyz |
Unlearning Geo-Cultural Stereotypes in Multilingual LLMs | Alireza Dehghanpour Farashah, Aditi Khandelwal, Negar Rostamzadeh, Golnoosh Farnadi | |
Generative Verifiers: Reward Modeling as Next Token Prediction | | |
DASFormer: Self-supervised Pretraining for Earthquake Monitoring | Qianggang Ding, Zhichao Shen, Weiqiang Zhu, Bang Liu | https://openreview.net/forum?id=LnWM7aVaFE |
TradExpert: Revolutionizing Trading with Mixture of Expert LLMs | Qianggang Ding, Haochen Shi, Jiadong Guo, Bang Liu | |
ICLR 2025 Workshop on Tackling Climate Change with Machine Learning: Data-Centric Approaches in ML for Climate Action | Konstantin Klemmer, Melissa Chapman, Lily Xu, Poon Kin Ho, Mélisande Teng, Patrick Emami, Yoshua Bengio | |
Integrating Generative and Experimental Platforms for Biomolecular Design | Chenghao Liu, Jarrid Rector-Brooks, Soojung Yang, Sidney Lisanza, Francesca-Zhoufan Li, Hannes Stärk, Jacob Gershon, Lauren Hong, Pranam Chatterjee, Tommi Jaakkola, Regina Barzilay, David Baker, Frances Arnold, Yoshua Bengio | |