Publications

Trajectory Flow Matching with Applications to Clinical Time Series Modeling

Xi Zhang

Yuan Pu

Yuki Kawamura

Andrew Loza

Yoshua Bengio

Dennis L. Shung

Alexander Tong

2024-10-27

ArXiv (preprint)

doi.org

openreview.net

In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators

Dmytro Humeniuk

Houssem Ben Braiek

Thomas Reid

Foutse Khomh

2024-10-26

Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (published)

doi.org

arxiv.org

ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics

Siddharth Viswanath

Dhananjay Bhaskar

David R. Johnson

João Felipe Rocha

Egbert Castro

Jackson Grady

Alex T. Grigas

Michael Perlmutter

Corey S. O'Hern

Smita Krishnaswamy

Understanding the dynamic nature of protein structures is essential for comprehending their biological functions. While significant progress has been made in predicting static folded structures, modeling protein motions on microsecond to millisecond scales remains challenging. To address these challenges, we introduce a novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE), which leverages the geometric scattering transform alongside transformer-based attention mechanisms to capture protein dynamics from molecular dynamics (MD) simulations. ProtSCAPE utilizes the multi-scale nature of the geometric scattering transform to extract features from protein structures conceptualized as graphs and integrates these features with dual attention structures that focus on residues and amino acid signals, generating latent representations of protein trajectories. Furthermore, ProtSCAPE incorporates a regression head to enforce temporally coherent latent representations.

2024-10-26

ArXiv (preprint)

doi.org

arxiv.org

Brain-like learning with exponentiated gradients

Jonathan Cornford

Roman Pogodin

Arna Ghosh

Kaiwen Sheng

Brendan A. Bicknell

Olivier Codol

Beverley A. Clark

Guillaume Lajoie

Blake A. Richards

Computational neuroscience relies on gradient descent (GD) for training artificial neural network (ANN) models of the brain. The advantage of GD is that it is effective at learning difficult tasks. However, it produces ANNs that are a poor phenomenological fit to biology, making them less relevant as models of the brain. Specifically, it violates Dale’s law, by allowing synapses to change from excitatory to inhibitory, and leads to synaptic weights that are not log-normally distributed, contradicting experimental data. Here, starting from first principles of optimisation theory, we present an alternative learning algorithm, exponentiated gradient (EG), that respects Dale’s law and produces log-normal weights, without losing the power of learning with gradients. We also show that in biologically relevant settings EG outperforms GD, including learning from sparsely relevant signals and dealing with synaptic pruning. Altogether, our results show that EG is a superior learning algorithm for modelling the brain with ANNs.

2024-10-25

bioRxiv (preprint)

doi.org

Efficient Biological Data Acquisition through Inference Set Design

Ihor Neporozhnii

Julien Roy

Emmanuel Bengio

Jason Hartford

In drug discovery, highly automated high-throughput laboratories are used to screen a large number of compounds in search of effective drugs. These experiments are expensive, so one might hope to reduce their cost by only experimenting on a subset of the compounds, and predicting the outcomes of the remaining experiments. In this work, we model this scenario as a sequential subset selection problem: we aim to select the smallest set of candidates in order to achieve some desired level of accuracy for the system as a whole. Our key observation is that, if there is heterogeneity in the difficulty of the prediction problem across the input space, selectively obtaining the labels for the hardest examples in the acquisition pool will leave only the relatively easy examples to remain in the inference set, leading to better overall system performance. We call this mechanism inference set design, and propose the use of a confidence-based active learning solution to prune out these challenging examples. Our algorithm includes an explicit stopping criterion that interrupts the acquisition loop when it is sufficiently confident that the system has reached the target performance. Our empirical studies on image and molecular datasets, as well as a real-world large-scale biological assay, show that active learning for inference set design leads to significant reduction in experimental cost while retaining high system performance.

2024-10-24

ArXiv (preprint)

doi.org

arxiv.org

Investigating Alpha-DaRT Source Daughter Diffusion in Intra-Rectal Animal Models

Mélodie Cyr

Behnaz Behmand

Naim Chabaytah

Joud Babik

Mirta Dumancic

Joanna Li

Guillaume St-Jean

Shirin A. Enger

2024-10-24

Brachytherapy (published)

doi.org

Prediction of Final Phosphorus Content of Steel in a Scrap-Based Electric Arc Furnace Using Artificial Neural Networks

Riadh Azzaz

Valentin Hurel

Patrice Ménard

M. Jahazi

S Ebrahimi Kahou

Elmira Moosavi-Khoonsari

2024-10-24

ArXiv (preprint)

doi.org

arxiv.org

ConvNTC: Convolutional neural tensor completion for predicting the disease-related miRNA pairs and cell-related drug pairs

Pei Liu

Xiao Liang

Yuemei Li

Jiawei Luo

2024-10-23

bioRxiv (preprint)

doi.org

The roles of neural networks in language acquisition

Eva Portelance

Masoud Jasbi

How can modern neural networks like large language models be useful to the field of language acquisition, and more broadly cognitive science, if they are not a priori designed to be cognitive models? As developments towards natural language understanding and generation have improved by leaps and bounds, with models like GPT-4, the question of how they can inform our understanding of human language acquisition has re-emerged. As such, it is critical to examine how in practice linking hypotheses between models and human learners can be safely established. To address these questions, we propose a model taxonomy, including four modeling approaches, each having differing goals, from exploratory hypothesis generation to hypothesis differentiation and testing. We show how the goals of these approaches align with the overarching goals of science and linguistics by connecting our taxonomy to the realist vs. instrumentalist approaches in philosophy of science. We survey recent work that has adopted each of our modelling approaches and address the importance of computational modelling in language acquisition studies.

2024-10-23

Language and Linguistics Compass (published)

doi.org

Minimally Invasive Morphology Adaptation via Parameter Efficient Fine-Tuning

Michael Przystupa

Hongyao Tang

Mariano Phielipp

Santiago Miret

Martin Jagersand

Glen Berseth

Learning reinforcement learning policies to control individual robots is often computationally non-economical because minor variations in robot morphology (e.g. dynamics or number of limbs) can negatively impact policy performance. This limitation has motivated morphology agnostic policy learning, in which a monolithic deep learning policy learns to generalize between robotic morphologies. Unfortunately, these policies still have sub-optimal zero-shot performance compared to end-to-end finetuning on target morphologies. This limitation has ramifications in practical robotic applications, as online finetuning of large neural networks can require immense computation. In this work, we investigate parameter efficient finetuning (PEFT) techniques to specialize morphology-agnostic policies to a target robot while minimizing the number of learnable parameters adapted during online learning. We compare direct finetuning, which updates subsets of the base model parameters, and input-learnable approaches, which add additional parameters to manipulate inputs passed to the base model. Our analysis concludes that tuning relatively few parameters (0.01% of the base model) can measurably improve policy performance over zero shot. These results offer prescriptive guidance for future research on the scenarios in which certain PEFT approaches are best suited for adapting policies to new robotic morphologies.

2024-10-22

corl.org/2024/Workshop/MAPoDeL (published)

openreview.net

Multilingual Hallucination Gaps in Large Language Models

Cléa Chataigner

Afaf Taïk

Golnoosh Farnadi

Large language models (LLMs) are increasingly used as alternatives to traditional search engines given their capacity to generate text that resembles human language. However, this shift is concerning, as LLMs often generate hallucinations, misleading or false information that appears highly credible. In this study, we explore the phenomenon of hallucinations across multiple languages in freeform text generation, focusing on what we call multilingual hallucination gaps. These gaps reflect differences in the frequency of hallucinated answers depending on the prompt and language used. To quantify such hallucinations, we used the FactScore metric and extended its framework to a multilingual setting. We conducted experiments using LLMs from the LLaMA, Qwen, and Aya families, generating biographies in 19 languages and comparing the results to Wikipedia pages. Our results reveal variations in hallucination rates, especially between high and low resource languages, raising important questions about LLM multilingual performance and the challenges in evaluating hallucinations in multilingual freeform text generation.

2024-10-22

ArXiv (preprint)

doi.org

arxiv.org

Overcoming State and Action Space Disparities in Multi-Domain, Multi-Task Reinforcement Learning

Reginald McLean

Kai Yuan

Isaac Woungang

Nariman Farsad

Pablo Samuel Castro

Current multi-task reinforcement learning (MTRL) methods have the ability to perform a large number of tasks with a single policy. However, when attempting to interact with a new domain, the MTRL agent would need to be re-trained due to differences in domain dynamics and structure. Because of these limitations, we are forced to train multiple policies even though tasks may have shared dynamics, which requires more samples and is thus sample inefficient. In this work, we explore the ability of MTRL agents to learn in various domains with various dynamics by simultaneously learning in multiple domains, without the need to fine-tune extra policies. In doing so we find that an MTRL agent trained in multiple domains induces an increase in sample efficiency of up to 70% while maintaining the overall success rate of the MTRL agent.

2024-10-22

corl.org/2024/Workshop/MAPoDeL (published)

openreview.net

