Publications

Hadamard product in deep learning: Introduction, Advances and Challenges.

Grigorios G Chrysos

Yongtao Wu

Razvan Pascanu

Philip Torr

Volkan Cevher

2024-12-31

IEEE Transactions on Pattern Analysis and Machine Intelligence (publié)

doi.org

arxiv.org

Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training

Shahrad Mohammadzadeh

Juan David Guerra

Marco Bonizzato

Reihaneh Rabbany

Golnoosh Farnadi

As large language models (LLMs) become increasingly prevalent, concerns about their reliability, particularly due to hallucinations - factua… (voir plus)lly inaccurate or irrelevant outputs - have grown. Our research investigates the relationship between the uncertainty in training dynamics and the emergence of hallucinations. Using models from the Pythia suite and several hallucination detection metrics, we analyze hallucination trends and identify significant variance during training. To address this, we propose Sensitivity Dropout (SenD), a novel training protocol designed to reduce hallucination variance during training by deterministically dropping embedding indices with significant variability. In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore in 2x speed. This metric is integrated into our training protocol, allowing SenD to be both computationally scalable and effective at reducing hallucination variance. SenD improves test-time reliability of Pythia and Meta's Llama models by up to 17% and enhances factual accuracy in Wikipedia, Medical, Legal, and Coding domains without affecting downstream task performance.

2024-12-31

ACL (1) (publié)

doi.org

openreview.net

https://www.neuromodec.org/journal/4/2/NzBlvmDpUYspQQbvI4B Online Transcranial Random Noise Stimulation of the Right Temporoparietal Junction Acutely Modulates Human-Machine Social Interactions

Vincent Chamberland

Quentin Moreau

Lisane Moses

Gabriela Milanova

Guillaume Dumas

2024-12-31

Neuromodec Journal (publié)

doi.org

ICLR 2025 Workshop on Tackling Climate Change with Machine Learning: Data-Centric Approaches in ML for Climate Action

Konstantin Klemmer

Melissa Chapman

Lily Xu

Poon Kin Ho

Mélisande Teng

Patrick Emami

Yoshua Bengio

Climate change is one of the greatest problems society has ever faced, with increasingly severe consequences for humanity as natural disaste… (voir plus)rs multiply, sea levels rise, and ecosystems falter. While no silver bullet, machine learning can be an invaluable tool in fighting climate change via a wide array of applications and techniques, from designing smart electric grids to tracking greenhouse gas emissions through satellite imagery. These applications require algorithmic innovations in machine learning and close collaboration with diverse fields and practitioners. This workshop is intended as a forum for those in the global machine learning community who wish to help tackle climate change, and is further aimed to help foster cross-pollination between researchers in machine learning and experts in complementary climate-relevant fields. Building on our past workshops on this topic, this workshop particularly aims to explore data-centric ML approaches for climate action. Data-centric ML is not only a timely topic within the ICLR community, as analyzing and engineering (pre)training datasets becomes increasingly important, but holds specific challenges and opportunities in climate-related areas. We also want to take the opportunity of ICLR being hosted in Singapore to engage with local communities and shine a light on work that deploys, analyzes or critiques ML methods and their use for climate change adaptation and mitigation on the Asian continent.

2024-12-31

ICLR.cc/2025/Workshop_Proposals (publié)

openreview.net

An identification of models to help in the design of national strategies and policies to reduce greenhouse gas emissions.

Danielle Maia de Souza

Radhwane Boukelouha

Emma Frejinger

Catherine Morency

Normand Mousseau

Martin Trépanier

2024-12-31

Transportation Research Procedia (publié)

doi.org

Implicit Diffusion: Efficient Optimization through Stochastic Sampling

Pierre Marion

Anna Korba

Peter Bartlett

Mathieu Blondel

Valentin De Bortoli

Arnaud Doucet

Felipe Llinares-López

Courtney Paquette

Quentin Berthet

2024-12-31

AISTATS (publié)

doi.org

proceedings.mlr.press

Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles

Xing Shen

Hengguan Huang

Brennan Nichyporuk

Tal Arbel

Once deployed, medical image analysis methods are often faced with unexpected image corruptions and noise perturbations. These unknown covar… (voir plus)iate shifts present significant challenges to deep learning based methods trained on "clean" images. This often results in unreliable predictions and poorly calibrated confidence, hence hindering clinical applicability. While recent methods have been developed to address specific issues such as confidence calibration or adversarial robustness, no single framework effectively tackles all these challenges simultaneously. To bridge this gap, we propose LaDiNE, a novel ensemble learning method combining the robustness of Vision Transformers with diffusion-based generative models for improved reliability in medical image classification. Specifically, transformer encoder blocks are used as hierarchical feature extractors that learn invariant features from images for each ensemble member, resulting in features that are robust to input perturbations. In addition, diffusion models are used as flexible density estimators to estimate member densities conditioned on the invariant features, leading to improved modeling of complex data distributions while retaining properly calibrated confidence. Extensive experiments on tuberculosis chest X-rays and melanoma skin cancer datasets demonstrate that LaDiNE achieves superior performance compared to a wide range of state-of-the-art methods by simultaneously improving prediction accuracy and confidence calibration under unseen noise, adversarial perturbations, and resolution degradation.

2024-12-31

IEEE Transactions on Medical Imaging (inconnu)

doi.org

arxiv.org

Incorporating Spatial Information into Goal-Conditioned Hierarchical Reinforcement Learning via Graph Representations

Shuyuan Zhang

Zihan Wang

Xiao-Wen Chang

Doina Precup

The integration of graphs with Goal-conditioned Hierarchical Reinforcement Learning (GCHRL) has recently gained attention, as intermediate g… (voir plus)oals (subgoals) can be effectively sampled from graphs that naturally represent the overall task structure in most RL tasks. However, existing approaches typically rely on domain-specific knowledge to construct these graphs, limiting their applicability to new tasks. Other graph-based approaches create graphs dynamically during exploration but struggle to fully utilize them, because they have problems passing the information in the graphs to newly visited states. Additionally, current GCHRL methods face challenges such as sample inefficiency and poor subgoal representation. This paper proposes a solution to these issues by developing a graph encoder-decoder to evaluate unseen states. Our proposed method, Graph-Guided sub-Goal representation Generation RL (G4RL), can be incorporated into any existing GCHRL method when operating in environments with primarily symmetric and reversible transitions to enhance performance across this class of problems. We show that the graph encoder-decoder can be effectively implemented using a network trained on the state graph generated during exploration. Empirical results indicate that leveraging high and low-level intrinsic rewards from the graph encoder-decoder significantly enhances the performance of state-of-the-art GCHRL approaches with an extra small computational cost in dense and sparse reward environments.

2024-12-31

Trans. Mach. Learn. Res. (publié)

doi.org

openreview.net

Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models

Yinlam Chow

Guy Tennenholtz

Izzeddin Gur

Vincent Zhuang

Bo Dai

Sridhar Thiagarajan

Craig Boutilier

Rishabh Agarwal

Aviral Kumar

Aleksandra Faust

Recent studies have indicated that effectively utilizing inference-time compute is crucial for attaining better performance from large langu… (voir plus)age models (LLMs). In this work, we propose a novel inference-aware fine-tuning paradigm, in which the model is fine-tuned in a manner that directly optimizes the performance of the inference-time strategy. We study this paradigm using the simple yet effective Best-of-N (BoN) inference strategy, in which a verifier selects the best out of a set of LLM-generated responses. We devise the first imitation learning and reinforcement learning~(RL) methods for BoN-aware fine-tuning, overcoming the challenging, non-differentiable argmax operator within BoN. We empirically demonstrate that our BoN-aware models implicitly learn a meta-strategy that interleaves best responses with more diverse responses that might be better suited to a test-time input -- a process reminiscent of the exploration-exploitation trade-off in RL. Our experiments demonstrate the effectiveness of BoN-aware fine-tuning in terms of improved performance and inference-time compute. In particular, we show that our methods improve the Bo32 performance of Gemma 2B on Hendrycks MATH from 26.8% to 30.8%, and pass@32 from 60.0% to 67.0%, as well as the pass@16 on HumanEval from 61.6% to 67.1%.

2024-12-31

ICLR (publié)

doi.org

arxiv.org

Insights into heart failure metabolite markers through explainable machine learning

Cantin Baron

Pamela Mehanna

Caroline Daneault

Leslie Hausermann

David Busseuil

Jean-Claude Tardif

Jocelyn Dupuis

Christine Des Rosiers

Matthieu Ruiz

Julie G. Hussin

Understanding molecular traits through metabolomics offers an avenue to tailor cardiovascular prevention, diagnosis and treatment strategies… (voir plus) more effectively. This study focuses on the application of machine learning (ML) and explainable artificial intelligence (XAI) algorithms to detect discriminant molecular signatures in heart failure (HF). We aim to uncover metabolites with significant predictive value by analyzing targeted metabolomics data through ML and XAI algorithms. After quality control, we analyzed 55 metabolites from 124 plasma samples, including 53 HF patients and 71 controls, comparing Ridge Logistic Regression, Support Vector Machine and eXtreme Gradient Boosting models. All achieved high accuracy in predicting group labels: 84.0% [95% CI: 75.3 — 92.7], 85.73 [95% CI: 78.6 — 92.9], and 84.8% [95% CI: 76.1 – 93.5], respectively. Permutation-based variable importance and Local Interpretable Model-agnostic Explanations (LIME) were used for group-level and individual-level explainability, respectively, complemented by H-Friedman statistics for variable interactions, yielding reliable, explainable insights of the ML models. Metabolites well-known for their association with HF, such as glucose and cholesterol, and more recently described, the C18:1 carnitine, were reaffirmed in our analysis. The novel discovery of lignoceric acid (C24:0 fatty acid) as a critical discriminator, was confirmed in a replication cohort, underscoring its potential as a metabolite marker. Furthermore, our study highlights the utility of 2-way variable interaction analysis in unveiling a network of metabolite interactions essential for accurate disease prediction. The results demonstrate our approach's efficacy in identifying key metabolites and their interactions, illustrating the power of ML and XAI in advancing personalized healthcare solutions.

2024-12-31

Computational and Structural Biotechnology Journal (publié)

doi.org

Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects

Amir Barda

Matheus Gadelha

Vladimir Kim

Noam Aigerman

Amit H. Bermano

Thibault Groueix

We propose a generative technique to edit 3D shapes, represented as meshes, NeRFs, or Gaussian Splats, in approximately 3 seconds, without t… (voir plus)he need for running an SDS type of optimization. Our key insight is to cast 3D editing as a multiview image inpainting problem, as this representation is generic and can be mapped back to any 3D representation using the bank of available Large Reconstruction Models. We explore different fine-tuning strategies to obtain both multiview generation and inpainting capabilities within the same diffusion model. In particular, the design of the inpainting mask is an important factor of training an inpainting model, and we propose several masking strategies to mimic the types of edits a user would perform on a 3D shape. Our approach takes 3D generative editing from hours to seconds and produces higher-quality results compared to previous works.

2024-12-31

CVPR (publié)

doi.org

arxiv.org

Integer Programming Games: A Gentle Computational Overview

Margarida Carvalho

Gabriele Dragotto

Andrea Lodi

Sriram Sankaranarayan

2024-12-31

Foundations and Trends® in Optimization (publié)

doi.org

arxiv.org

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Publications

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Mots-clés populaires:

Publications