Publications

TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories

Honghua Dong

Jiacheng Yang

Xun Deng

Yuhe Jiang

Gennady Pekhimenko

Fan Long

Xujie Si

2025-04-30

ICML.cc/2025/Conference (poster)

proceedings.mlr.press

Caffeine induces age-dependent increases in brain complexity and criticality during sleep

Philipp Thölke

Maxine Arcand-Lavigne

Tarek Lajnef

Sonia Frenette

Julie Carrier

Karim Jerbi

Caffeine is the most widely consumed psychoactive stimulant worldwide. Yet important gaps persist in understanding its effects on the brain, especially during sleep. We analyzed sleep electroencephalography (EEG) in 40 subjects, contrasting 200 mg of caffeine against a placebo condition, utilizing inferential statistics and machine learning. We found that caffeine ingestion led to an increase in brain complexity, a widespread flattening of the power spectrum’s 1/f-like slope, and a reduction in long-range temporal correlations. Being most prominent during non-rapid eye movement (NREM) sleep, these results suggest that caffeine shifts the brain towards a critical regime and more diverse neural dynamics. Interestingly, this was more pronounced in younger adults (20–27 years) compared to middle-aged participants (41–58 years) during rapid eye movement (REM) sleep, while no significant age effects were observed during NREM. Interpreting these data in the light of modeling and empirical work on EEG-derived measures of excitation-inhibition balance suggests that caffeine promotes a shift in brain dynamics towards increased neural excitation and closer proximity to a critical regime, particularly during NREM sleep.
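As a rough illustration of what a "flattening of the 1/f-like slope" means, the sketch below fits a straight line to a power spectrum in log-log coordinates and compares two synthetic spectra. It is not the study's analysis pipeline: the function name `spectral_slope`, the frequency range, and the exponents are assumptions chosen only for the example.

```python
import math


def spectral_slope(freqs, power):
    """Least-squares slope of log10(power) against log10(frequency)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic spectra (not data from the study): power proportional to f**slope.
freqs = [float(f) for f in range(1, 46)]
steeper = [f ** -2.0 for f in freqs]
flatter = [f ** -1.5 for f in freqs]

slope_steeper = spectral_slope(freqs, steeper)
slope_flatter = spectral_slope(freqs, flatter)

# A flatter spectrum has a slope closer to zero.
print(slope_steeper, slope_flatter)
```

Real EEG spectra also contain oscillatory peaks, so a plain line fit like this one would usually be only a first approximation.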

2025-04-29

Communications Biology (published)

doi.org

JPerfEvo: A Tool for Tracking Method-Level Performance Changes in Java Projects

Kaveh Shahedi

Maxime Lamothe

Foutse Khomh

Heng Li

Performance regressions and improvements are common phenomena in software development, occurring periodically as software evolves and matures. When developers introduce new changes to a program’s codebase, unforeseen performance variations may arise. Identifying these changes at the method level, however, can be challenging due to the complexity and scale of modern codebases. In this work, we present JPerfEvo, a tool designed to automate the evaluation of the method-level performance impact of each code commit (i.e., the performance variations between the two versions before and after a commit). Leveraging the Java Microbenchmark Harness (JMH) module for benchmarking the modified methods, JPerfEvo instruments their execution and applies robust statistical evaluations to detect performance changes. The tool can classify these changes as performance improvements, regressions, or neutral (i.e., no change), with the change magnitude. We evaluated JPerfEvo on three popular and mature open-source Java projects, demonstrating its effectiveness in identifying performance changes throughout their development histories.
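The three-way classification described above can be pictured with a small sketch. The abstract does not say which statistical tests JPerfEvo applies, so this example substitutes Cliff's delta, a non-parametric effect size, with an assumed cutoff of 0.474; the function names and the sample timings are hypothetical and do not reflect the tool's implementation.

```python
def cliffs_delta(before, after):
    """Effect size in [-1, 1]: P(after > before) - P(after < before)."""
    greater = sum(1 for a in after for b in before if a > b)
    less = sum(1 for a in after for b in before if a < b)
    return (greater - less) / (len(after) * len(before))


def classify_change(before, after, threshold=0.474):
    """Label a change from two samples of execution times (lower is faster).

    The threshold is an assumption for this sketch, not a JPerfEvo setting.
    """
    delta = cliffs_delta(before, after)
    if delta >= threshold:
        return "regression", delta
    if delta <= -threshold:
        return "improvement", delta
    return "neutral", delta


# Hypothetical execution times (ms) of one method before and after a commit.
before = [10.1, 10.3, 9.9, 10.0, 10.2]
slower = [12.0, 11.8, 12.3, 12.1, 11.9]
similar = [10.0, 10.2, 10.1, 9.9, 10.3]

print(classify_change(before, slower))
print(classify_change(before, similar))
print(classify_change(slower, before))
```

The returned delta doubles as a simple change magnitude: values near 1 or -1 mean the two samples barely overlap.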

2025-04-27

IEEE Working Conference on Mining Software Repositories (published)

doi.org

Performance Smells in ML and Non-ML Python Projects: A Comparative Study

François Belias

Leuson Da Silva

Foutse Khomh

Cyrine Zid

2025-04-27

ArXiv (preprint)

doi.org

arxiv.org

Proceedings of 1st Workshop on Advancing Artificial Intelligence through Theory of Mind

Mouad Abrini

Omri Abend

Dina M. Acklin

Henny Admoni

Gregor Aichinger

Nitay Alon

Zahra Ashktorab

Ashish Atreja

Moises Auron

Alexander Aufreiter

Raghav Awasthi

Soumya Banerjee

Joseph Barnby

Rhea Basappa

Severin Bergsmann

Djallel Bouneffouf

Patrick Callaghan

Marc Cavazza

Thierry Chaminade

Sonia Chernova

Mohamed Chetouan

Moumita Choudhury

Axel Cleeremans

J. Cywinski

Fabio Cuzzolin

Hokin Deng

N'yoma Diamond

C. D. Pasquasio

Guillaume Dumas

Max J. van Duijn

Mahapatra Dwarikanath

Qingying Gao

Ashok Goel

Rebecca R. Goldstein

Matthew C. Gombolay

Gabriel Enrique Gonzalez

Amar Halilovic

Tobias Halmdienst

Mahimul Islam

Julian Jara-Ettinger

Natalie Kastel

Renana Keydar

Ashish K. Khanna

Mahdi Khoramshahi

Jihyun Kim

Mihyeon Kim

Youngbin Kim

Senka Krivic

Nikita Krasnytskyi

Arun Kumar

Junehyoung Kwon

EunJu Lee

Shane Lee

Peter R. Lewis

Xue Li

Yijiang Li

Michal Lewandowski

Nathan Lloyd

Matthew B. Luebbers

Dezhi Luo

Haiyun Lyu

Dwarikanath Mahapatra

Kamal Maheshwari

Mallika Mainali

P. Mathur

Patrick Mederitsch

Shuwa Miura

Manuel Preston de Miranda

Reuth Mirsky

Shreya Mishra

Nina M. Moorman

Katelyn Morrison

John Muchovej

Bernhard Nessler

Felix Nessler

Hieu Minh Jord Nguyen

Abby Ortego

F. Papay

Antoine Pasquali

Hamed Rahimi

C. Raghu

Amanda L. Royka

Stefan Sarkadi

Jaelle Scheuerman

Simon Schmid

Paul Schrater

Anik Sen

Zahra Sheikhbahaee

Ke Shi

Reid G. Simmons

Nishant Singh

Mason O. Smith

Ramira van der Meulen

Anthia Solaki

Haoran Sun

Viktor Szolga

Matthew E. Taylor

Travis Taylor

Sanne van Waveren

Juan David Vargas

R. Verbrugge

Eitan Wagner

Justin D. Weisz

Ximing Wen

William Yeoh

Wenlong Zhang

Michelle Zhao

Shlomo Zilberstein

2025-04-27

ArXiv (preprint)

doi.org

arxiv.org

Solving Combinatorial Pricing Problems using Embedded Dynamic Programming Models

Quang Minh Bui

Margarida Carvalho

José Neto

The combinatorial pricing problem (CPP) is a bilevel problem in which the leader maximizes their revenue by imposing tolls on certain items that they can control. Based on the tolls set by the leader, the follower selects a subset of items corresponding to an optimal solution of a combinatorial optimization problem. To accomplish the leader's goal, the tolls need to be sufficiently low to discourage the follower from choosing the items offered by the competitors. In this paper, we derive a single-level reformulation for the CPP by rewriting the follower's problem as a longest path problem using a dynamic programming model, and then taking its dual and applying strong duality. We proceed to solve the reformulation in a dynamic fashion with a cutting plane method. We apply this methodology to 2 distinct dynamic programming models, namely, a novel formulation designated as selection diagram and the well-known decision diagram. We also produce numerical results to evaluate their performances across 3 different specializations of the CPP and a closely related problem that is the knapsack interdiction problem. Our results showcase the potential of the 2 proposed reformulations over the natural value function approach, expanding the set of tools to solve combinatorial bilevel programs.
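To make the idea of a follower's problem written as a longest path over dynamic programming states more concrete, here is a minimal sketch. It assumes a knapsack-type follower who maximizes value minus toll; the function name, the item data, and the tie-breaking rule (ties go to skipping an item) are assumptions for illustration. It covers only the follower's response for fixed tolls, not the dualization or the cutting plane method from the paper.

```python
def follower_best_response(values, weights, tolls, capacity):
    """Longest path over DP states (item index, remaining capacity).

    From state (i, c) there is a skip arc of length 0 and, if item i fits,
    a take arc of length values[i] - tolls[i].
    """
    n = len(values)
    # best[i][c]: longest path length from state (i, c) to the terminal layer.
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for c in range(capacity + 1):
            best[i][c] = best[i + 1][c]  # skip arc
            if weights[i] <= c:
                take = values[i] - tolls[i] + best[i + 1][c - weights[i]]
                best[i][c] = max(best[i][c], take)
    # Walk the path forward to recover which items were taken.
    chosen, c = [], capacity
    for i in range(n):
        if best[i][c] != best[i + 1][c]:  # the take arc was strictly better
            chosen.append(i)
            c -= weights[i]
    return best[0][capacity], chosen


# Hypothetical instance: item 0 belongs to the leader, items 1 and 2 to competitors.
values = [10, 5, 4]
weights = [3, 2, 2]
capacity = 4

low_toll = follower_best_response(values, weights, [0, 0, 0], capacity)
high_toll = follower_best_response(values, weights, [2, 0, 0], capacity)
print(low_toll, high_toll)
```

In this toy instance, raising the toll on item 0 from 0 to 2 pushes the follower to the competitors' items, which is the trade-off the leader has to balance when pricing.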

2025-04-27

INFORMS Journal on Computing (published)

doi.org

arxiv.org

How Programmers Interact with Multimodal Software Documentation

Deeksha M. Arya

Jin L.C. Guo

Martin P. Robillard

There is a wide variety of online documentation to learn about a given software technology, and prior research has reported that programmers must invest time and effort to identify one that best suits their need. We evaluated five modalities to present information that enable a software document to cater to the different presentation needs of programmers. We developed a prototype tutorial with these modalities on three topics in Java, namely, regular expressions, inheritance, and exception handling. We investigated how people interact with the modalities in the tutorial given a programming topic and a type of task. We conducted a survey study with 56 respondents and confirm that although text content is most useful for solving conceptual tasks, code examples support deeper comprehension of the underlying concepts. Furthermore, we report that respondents' contradicting preferences for the modalities suggest the need to have multiple alternatives in a software tutorial.

2025-04-26

2025 IEEE/ACM 18th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE) (published)

doi.org

AIFM-ed Curriculum Framework for Postgraduate Family Medicine Education on Artificial Intelligence: Mixed Methods Study

Raymond Tolentino

Fanny Hersson-Edery

Mark Yaffe

Samira Abbasgholizadeh-Rahimi

As health care moves to a more digital environment, there is a growing need to train future family doctors on the clinical uses of artificial intelligence (AI). However, family medicine training in AI has often been inconsistent or lacking. The aim of the study is to develop a curriculum framework for family medicine postgraduate education on AI called “Artificial Intelligence Training in Postgraduate Family Medicine Education” (AIFM-ed). First, we conducted a comprehensive scoping review on existing AI education frameworks guided by the methodological framework developed by Arksey and O’Malley and Joanna Briggs Institute methodological framework for scoping reviews. We adhered to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist for reporting the results. Next, 2 national expert panels were conducted. Panelists included family medicine educators and residents knowledgeable in AI from family medicine residency programs across Canada. Participants were purposively sampled, and panels were held via Zoom, recorded, and transcribed. Data were analyzed using content analysis. We followed the Standards for Reporting Qualitative Research for panels. An integration of the scoping review results and 2 panel discussions of 14 participants led to the development of the AIFM-ed curriculum framework for AI training in postgraduate family medicine education with five key elements: (1) need and purpose of the curriculum, (2) learning objectives, (3) curriculum content, (4) organization of curriculum content, and (5) implementation aspects of the curriculum. Using the results of this study, we developed the AIFM-ed curriculum framework for AI training in postgraduate family medicine education.
This framework serves as a structured guide for integrating AI competencies into medical education, ensuring that future family physicians are equipped with the necessary skills to use AI effectively in their clinical practice. Future research should focus on the validation and implementation of the AIFM-ed framework within family medicine education. Institutions also are encouraged to consider adapting the AIFM-ed framework within their own programs, tailoring it to meet the specific needs of their trainees and health care environments.

2025-04-24

JMIR Medical Education (published)

doi.org

RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning

Mingqi Yuan

Roger Creus Castanyer

Bin Li

Xin Jin

Glen Berseth

Wenjun Zeng

2025-04-23

TMLR (accepted)

doi.org

openreview.net

3DMolFormer: A Dual-Channel Framework for Structure-Based Drug Discovery

Xiuyuan Hu

Guoqing Liu

Can (Sam) Chen

Yang Zhao

Hao Zhang

Xue Liu

Structure-based drug discovery, encompassing the tasks of protein-ligand docking and pocket-aware 3D drug design, represents a core challenge in drug discovery. However, no existing work can deal with both tasks to effectively leverage the duality between them, and current methods for each task are hindered by challenges in modeling 3D information and the limitations of available data. To address these issues, we propose 3DMolFormer, a unified dual-channel transformer-based framework applicable to both docking and 3D drug design tasks, which exploits their duality by utilizing docking functionalities within the drug design process. Specifically, we represent 3D pocket-ligand complexes using parallel sequences of discrete tokens and continuous numbers, and we design a corresponding dual-channel transformer model to handle this format, thereby overcoming the challenges of 3D information modeling. Additionally, we alleviate data limitations through large-scale pre-training on a mixed dataset, followed by supervised and reinforcement learning fine-tuning techniques respectively tailored for the two tasks. Experimental results demonstrate that 3DMolFormer outperforms previous approaches in both protein-ligand docking and pocket-aware 3D drug design, highlighting its promising application in structure-based drug discovery. The code is available at: https://github.com/HXYfighter/3DMolFormer .

2025-04-22

International Conference on Learning Representations (Accept (Poster))

doi.org

openreview.net

Adaptive Teachers for Amortized Samplers

Sungsoo Ahn

Jinkyoo Park

Nikolay Malkin

Yoshua Bengio

Amortized inference is the task of training a parametric model, such as a neural network, to approximate a distribution with a given unnormalized density where exact sampling is intractable. When sampling is implemented as a sequential decision-making process, reinforcement learning (RL) methods, such as generative flow networks, can be used to train the sampling policy. Off-policy RL training facilitates the discovery of diverse, high-reward candidates, but existing methods still face challenges in efficient exploration. We propose to use an adaptive training distribution (the Teacher) to guide the training of the primary amortized sampler (the Student). The Teacher, an auxiliary behavior model, is trained to sample high-loss regions of the Student and can generalize across unexplored modes, thereby enhancing mode coverage by providing an efficient training curriculum. We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge, two diffusion-based sampling tasks, and four biochemical discovery tasks, demonstrating its ability to improve sample efficiency and mode coverage. Source code is available at https://github.com/alstn12088/adaptive-teacher.

2025-04-22

International Conference on Learning Representations (Accept (Poster))

doi.org

openreview.net

Algorithmic Fairness Through the Lens of Metrics and Evaluation (AFME) 2024

Miriam Rateike

Awa Dieng

Jamelle Watson-Daniels

Ferdinando Fioretto

Golnoosh Farnadi

2025-04-22

Proceedings of the Algorithmic Fairness Through the Lens of Metrics and Evaluation (published)

proceedings.mlr.press
