Publications

MiRGraph: A hybrid deep learning approach to identify microRNA-target interactions by integrating heterogeneous regulatory network and genomic sequences

Pei Liu

Yang Liu

Jiawei Luo

Yuemei Li

2024-10-01

bioRxiv (preprint)

doi.org

VinePPO: Refining Credit Assignment in RL Training of LLMs

Amirhossein Kazemnejad

Large language models (LLMs) are increasingly applied to complex reasoning tasks that require executing several complex steps before receiving any reward. Properly assigning credit to these steps is essential for enhancing model performance. Proximal Policy Optimization (PPO), a common reinforcement learning (RL) algorithm used for LLM finetuning, employs value networks to tackle credit assignment. However, recent approaches achieve strong results without them, raising questions about the efficacy of value networks in practice. In this work, we systematically evaluate the efficacy of value networks and reveal their significant shortcomings in reasoning-heavy LLM tasks, showing that they often produce poor estimates of expected return and barely outperform a random baseline when comparing alternative steps. This motivates our key question: Can improved credit assignment enhance RL training for LLMs? To address this, we propose VinePPO, a straightforward approach that leverages the flexibility of language environments to compute unbiased Monte Carlo-based estimates. Our method consistently outperforms PPO and other baselines across MATH and GSM8K datasets in less wall-clock time (up to 3.0x). Crucially, it achieves higher test accuracy for a given training accuracy, capturing more generalization signal per sample. These results emphasize the importance of accurate credit assignment in RL training of LLMs.

2024-10-01

arXiv (preprint)

doi.org

proceedings.mlr.press

Were RNNs All We Needed?

Leo Feng

Frederick Tung

Mohamed Osama Ahmed

Yoshua Bengio

Hossein Hajimirsadeghi

The introduction of Transformers in 2017 reshaped the landscape of deep learning. Originally proposed for sequence modelling, Transformers have since achieved widespread success across various domains. However, the scalability limitations of Transformers - particularly with respect to sequence length - have sparked renewed interest in novel recurrent models that are parallelizable during training, offer comparable performance, and scale more effectively. In this work, we revisit sequence modelling from a historical perspective, focusing on Recurrent Neural Networks (RNNs), which dominated the field for two decades before the rise of Transformers. Specifically, we examine LSTMs (1997) and GRUs (2014). We demonstrate that by simplifying these models, we can derive minimal versions (minLSTMs and minGRUs) that (1) use fewer parameters than their traditional counterparts, (2) are fully parallelizable during training, and (3) achieve surprisingly competitive performance on a range of tasks, rivalling recent models including Transformers.

2024-10-01

arXiv (preprint)

doi.org

arxiv.org

Challenges for impact evaluation of WHO’s normative output

Catherine Régis

Gaëlle Foucault

Jean-Louis Denis

Pierre Larouche

Miriam Cohen

2024-09-30

Bulletin of the World Health Organization (published)

doi.org

Hybrid Simulator-Based Mechanism and Data-Driven for Multidemand Dioxin Emissions Intelligent Prediction in the MSWI Process

Heng Xia

Jian Tang

Wen Yu

JunFei Qiao

2024-09-30

IEEE Transactions on Industrial Electronics (published)

doi.org

Inferring electric vehicle charging patterns from smart meter data for impact studies

Feng Li

Élodie Campeau

Ilhan Kocar

Antoine Lesage-Landry

2024-09-30

Electric Power Systems Research (published)

doi.org

Long-term outcomes of critically ill patients with hematological malignancies: what is the impact of the coronavirus disease 2019 pandemic? Author's reply

Laveena Munshi

Guillaume Dumas

Sangeeta Mehta

2024-09-30

Intensive Care Medicine (published)

doi.org

MAP: Model Merging with Amortized Pareto Front Using Limited Computation

Li Li

Tianyu Zhang

Zhiqi Bu

Suyuchen Wang

Huan He

Jie Fu

Yonghui Wu

Jiang Bian

Yong Chen

Yoshua Bengio

2024-09-30

NeurIPS.cc/2024/Workshop/Federated_Learning (oral)

openreview.net

Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks

Alexandre St-Aubin

Amin Abyaneh

Hsiu-Chin Lin

Mastering complex sequential tasks continues to pose a significant challenge in robotics. While there has been progress in learning long-horizon manipulation tasks, most existing approaches lack rigorous mathematical guarantees for ensuring reliable and successful execution. In this paper, we extend previous work on learning long-horizon tasks and stable policies, focusing on improving task success rates while reducing the amount of training data needed. Our approach introduces a novel method that (1) segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals, and (2) learns globally stable dynamical system policies to guide the robot to each subgoal, even in the face of sensory noise and random disturbances. We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms. Code is available at https://github.com/Alestaubin/stable-imitation-policy-with-waypoints

2024-09-30

arXiv (preprint)

doi.org

arxiv.org

Spatial Action Unit Cues for Interpretable Deep Facial Expression Recognition

Soufiane Belharbi

Marco Pedersoli

Alessandro Lameiras Koerich

Simon Bacon

Eric Granger

Although state-of-the-art classifiers for facial expression recognition (FER) can achieve a high level of accuracy, they lack interpretability, an important feature for end-users. Experts typically associate spatial action units (AUs) from a codebook to facial regions for the visual interpretation of expressions. In this paper, the same expert steps are followed. A new learning strategy is proposed to explicitly incorporate AU cues into classifier training, allowing deep interpretable models to be trained. During training, this AU codebook is used, along with the input image expression label and facial landmarks, to construct an AU heatmap that indicates the most discriminative image regions of interest w.r.t. the facial expression. This valuable spatial cue is leveraged to train a deep interpretable classifier for FER. This is achieved by constraining the spatial layer features of a classifier to be correlated with AU heatmaps. Using a composite loss, the classifier is trained to correctly classify an image while yielding interpretable visual layer-wise attention correlated with AU maps, simulating the expert decision process. Our strategy only relies on image class expression for supervision, without additional manual annotations. Our new strategy is generic, and can be applied to any deep CNN- or transformer-based classifier without requiring any architectural change or significant additional training time. Our extensive evaluation on two public benchmarks, RAF-DB and AffectNet, shows that our proposed strategy can improve layer-wise interpretability without degrading classification performance. In addition, we explore a common type of interpretable classifiers that rely on class activation mapping (CAM) methods, and show that our approach can also improve CAM interpretability.

2024-09-30

arXiv (preprint)

doi.org

arxiv.org

A Survey of Diversification Techniques in Search and Recommendation

Haolun Wu

Yansen Zhang

Chen Ma

Fuyuan Lyu

Bowei He

Fernando Diaz

Bhaskar Mitra

Xue Liu

Diversifying search results is an important research topic in retrieval systems in order to satisfy both the various interests of customers and the equal market exposure of providers. There has been growing attention to diversity-aware research in recent years, accompanied by a proliferation of literature on methods to promote diversity in search and recommendation. However, the diversity-aware studies in retrieval systems lack a systematic organization and are rather fragmented. In this survey, we are the first to propose a unified taxonomy for classifying the metrics and approaches of diversification in both search and recommendation, which are two of the most extensively researched fields of retrieval systems. We begin the survey with a brief discussion of why diversity is important in retrieval systems, followed by a summary of the various diversity concerns in search and recommendation, highlighting their relationship and differences. For the survey’s main body, we present a unified taxonomy of diversification metrics and approaches in retrieval systems, from both the search and recommendation perspectives. In the later part of the survey, we discuss the open research questions of diversity-aware research in search and recommendation in an effort to inspire future innovations and encourage the implementation of diversity in real-world systems.

2024-09-30

IEEE Transactions on Knowledge and Data Engineering (published)

doi.org

arxiv.org
