Publications

Performance Control in Early Exiting to Deploy Large Models at the Same Cost of Smaller Ones

Joao Monteiro

Valentina Zantedeschi

2024-06-19

ICML.cc/2024/Workshop/ES-FoMo-II (poster)

openreview.net

APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts

Honghua Dong

Qidong Su

Yubo Gao

Zhaoyu Li

Yangjun Ruan

Gennady G. Pekhimenko

Chris J. Maddison

Xujie Si

Large Language Models (LLMs) have become increasingly capable of handling diverse tasks with the aid of well-crafted prompts and integration of external tools, but as task complexity rises, the workflow involving LLMs can be complicated and thus challenging to implement and maintain. To address this challenge, we propose APPL, A Prompt Programming Language that acts as a bridge between computer programs and LLMs, allowing seamless embedding of prompts into Python functions, and vice versa. APPL provides an intuitive and Python-native syntax, an efficient parallelized runtime with asynchronous semantics, and a tracing module supporting effective failure diagnosis and replaying without extra costs. We demonstrate that APPL programs are intuitive, concise, and efficient through three representative scenarios: Chain-of-Thought with self-consistency (CoT-SC), ReAct tool use agent, and multi-agent chat. Experiments on three parallelizable workflows further show that APPL can effectively parallelize independent LLM calls, with a significant speedup ratio that almost matches the estimation.

2024-06-18

ArXiv (preprint)

doi.org

arxiv.org

Functional Acceleration for Policy Mirror Descent

Veronica Chelu

Doina Precup

2024-06-18

ICML.cc/2024/Workshop/ARLET (poster)

doi.org

openreview.net

GAPS phase II: development and pilot results of the global assessment in pediatric surgery, an evidence-based pediatric surgical capacity assessment tool for low-resource settings.

Yasmine Yousef

Sarah Cairo

Etienne St-Louis

Laura F. Goodman

Doulia M. Hamad

Robert Baird

Emily R. Smith

Sherif Emil

Jean Martin Laberge

Mohamed Abdelmalak

Zipporah Gathuy

Faye Evans

Maryam Ghavami Adel

Ki K. Bertille

Milind Chitnis

Leecarlo Millano

Peter Nthumba

Sergio d’Agostino

Bruno Cigliano

Luis Enrique Zea-Salazar … (4 more authors)

Emmanuel Ameh

Doruk Ozgediz

Elena Guadagno

Dan Poenaru

2024-06-18

Pediatric surgery international (Print) (published)

doi.org

Handling Delay in Reinforcement Learning Caused by Parallel Computations of Neurons

Ivan Anokhin

Rishav

Stephen Chung

Irina Rish

S Ebrahimi Kahou

Biological neural networks operate in parallel, a feature that sets them apart from artificial neural networks and can significantly enhance inference speed. However, this parallelism introduces challenges: when each neuron operates asynchronously with a fixed execution time, an

2024-06-18

ICML.cc/2024/Workshop/ARLET (poster)

openreview.net

Realtime Reinforcement Learning: Towards Rapid Asynchronous Deployment of Large Models

Matthew D Riemer

Gopeshh Subbaraj

Glen Berseth

Irina Rish

Realtime environments change even as agents perform action inference and learning, thus requiring high interaction frequencies to effectively minimize long-term regret. However, recent advances in machine learning involve larger neural networks with longer inference times, raising questions about their applicability in realtime systems where reaction time is crucial. We present an analysis of lower bounds on regret in realtime environments to show that minimizing long-term regret is generally impossible within the typical sequential interaction and learning paradigm, but often becomes possible when sufficient asynchronous compute is available. We propose novel algorithms for staggering asynchronous inference processes to ensure that actions are taken at consistent time intervals, and demonstrate that use of models with high action inference times is only constrained by the environment's effective stochasticity over the inference horizon, and not by action frequency. Our analysis shows that the number of inference processes needed scales linearly with increasing inference times while enabling use of models that are multiple orders of magnitude larger than existing approaches when learning from a realtime simulation of Game Boy games such as Pokemon and Tetris.

2024-06-18

ICML.cc/2024/Workshop/ARLET (poster)

openreview.net

A deeper look at depth pruning of LLMs

Shoaib Ahmed Siddiqui

Xin Dong

Greg Heinrich

Thomas Breuel

Jan Kautz

David M. Krueger

Pavlo Molchanov

Large Language Models (LLMs) are not only resource-intensive to train but even more costly to deploy in production. Therefore, recent work has attempted to prune blocks of LLMs based on cheap proxies for estimating block importance, effectively removing 10% of blocks in well-trained LLaMa-2 and Mistral 7b models without any significant degradation of downstream metrics. In this paper, we explore different block importance metrics by considering adaptive metrics such as Shapley value in addition to static ones explored in prior work. We show that *adaptive metrics exhibit a trade-off in performance between tasks, i.e., improvement on one task may degrade performance on the other due to differences in the computed block influences*. Furthermore, we extend this analysis from a complete block to individual self-attention and feed-forward layers, highlighting the propensity of the self-attention layers to be more amenable to pruning, even allowing ***removal of up to 33% of the self-attention layers without incurring any performance degradation on MMLU for Mistral 7b*** (significant reduction in costly maintenance of KV-cache). Finally, we look at simple performance recovery techniques to emulate the pruned layers by training lightweight additive bias or low-rank linear adapters. *Performance recovery using emulated updates avoids performance degradation for the initial blocks (up to 5% absolute improvement on MMLU)*, which is either competitive or superior to the learning-based technique.

2024-06-17

ICML.cc/2024/Workshop/TF2M (poster)

openreview.net

A machine learning pipeline for automated insect monitoring

Aditya Jain

F. Cunha

M. J. Bunsen

L. Pasi

Anna Viklund

Maxim Larrivée

David Rolnick

Climate change and other anthropogenic factors have led to a catastrophic decline in insects, endangering both biodiversity and the ecosystem services on which human society depends. Data on insect abundance, however, remains woefully inadequate. Camera traps, conventionally used for monitoring terrestrial vertebrates, are now being modified for insects, especially moths. We describe a complete, open-source machine learning-based software pipeline for automated monitoring of moths via camera traps, including object detection, moth/non-moth classification, fine-grained identification of moth species, and tracking individuals. We believe that our tools, which are already in use across three continents, represent the future of massively scalable data collection in entomology.

2024-06-17

ArXiv (preprint)

doi.org

arxiv.org

Path-based reasoning for biomedical knowledge graphs with BioPathNet

Yue Hu

Svitlana Oleshko

Samuele Firmani

Zhaocheng Zhu

Hui Cheng

Maria Ulmer

Matthias Arnold

Maria Colomé-Tatché

Jian Tang

Sophie Xhonneux

Annalisa Marsico

Understanding complex interactions in biomedical networks is crucial for advancements in biomedicine, but traditional link prediction (LP) methods are limited in capturing this complexity. Representation-based learning techniques improve prediction accuracy by mapping nodes to low-dimensional embeddings, yet they often struggle with interpretability and scalability. We present BioPathNet, a novel graph neural network framework based on the Neural Bellman-Ford Network (NBFNet), addressing these limitations through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability. This allows visualization of influential paths and facilitates biological validation. BioPathNet leverages a background regulatory graph (BRG) for enhanced message passing and uses stringent negative sampling to improve precision. In evaluations across various LP tasks, such as gene function annotation, drug-disease indication, synthetic lethality, and lncRNA-mRNA interaction prediction, BioPathNet consistently outperformed shallow node embedding methods, relational graph neural networks and task-specific state-of-the-art methods, demonstrating robust performance and versatility. Our study predicts novel drug indications for diseases like acute lymphoblastic leukemia (ALL) and Alzheimer’s, validated by medical experts and clinical trials. We also identified new synthetic lethality gene pairs and regulatory interactions involving lncRNAs and target genes, confirmed through literature reviews. BioPathNet’s interpretability will enable researchers to trace prediction paths and gain molecular insights, making it a valuable tool for drug discovery, personalized medicine and biology in general.

2024-06-17

bioRxiv (published)

doi.org

Scalable Approaches for a Theory of Many Minds

Maximilian Puelma Touzel

Amin Memarian

Matthew D Riemer

Andrei Mircea

Andrew Robert Williams

Elin Ahlstrand

A major challenge as we move towards building agents for real-world problems, which could involve a massive number of human and/or machine agents, is that we must learn to reason about the behavior of these many other agents. In this paper, we consider the problem of scaling a predictive Theory of Mind (ToM) model to a very large number of interacting agents with a fixed computational budget. Motivated by the limited diversity of agent types, existing approaches to scalable ToM learn versatile single-agent representations for quickly adapting to new agents encountered sequentially. We consider the more general setting that many agents are observed in parallel and formulate the corresponding Theory of Many Minds (ToMM) problem of estimating the joint policy. We frame the scaling behavior of solutions in terms of parameter sharing schemes and in particular propose two parameter-free architectural features that endow models with the ability to exploit action correlations: encoding a multi-agent context, and decoding through an abstracted joint action space. The increased predictive capabilities that have come with foundation models have made it easier to imagine the possibility of using these models to make simulations that imitate the behavior of many agents within complex real-world systems. Being able to perform these simulations in a general-purpose way would not only help make more capable agents, it also would be a very useful capability for applications in social science, political science, and economics.

2024-06-17

ICML.cc/2024/Workshop/Agentic_Markets (poster)

openreview.net

Assessing the Viability of Generative Modeling in Simulated Astronomical Observations

Patrick Janulewicz

Laurence Perreault-Levasseur

Tracy Webb

In this paper, we use methods for assessing the quality of generative models and apply them to a problem from the physical sciences. We turn our attention to astrophysics, where cosmological simulations are often used to create mock observations that mimic telescope images. These simulations and their mock observations are often slow and challenging to generate, inspiring some to use generative modeling to enhance the amount of data available to study. In this work, we add realism to simulated images of galaxy clusters and use probability mass estimation to assess their fidelity compared to reality. We find that the simulations are biased compared to real observations and suggest that researchers applying generative modeling to these systems should proceed with caution.

2024-06-16

ICML.cc/2024/Workshop/SPIGM (poster)

openreview.net

Augmenting Evolutionary Models with Structure-based Retrieval

Yining Huang

Zuobai Zhang

Jian Tang

Debora Susan Marks

Pascal Notin

2024-06-16

ICML.cc/2024/Workshop/ML4LMS (poster)

openreview.net
