Publications

Monte Carlo Tree Diffusion for System 2 Planning

Jaesik Yoon

Hyeonseo Cho

Doojin Baek

Yoshua Bengio

Sungjin Ahn

Diffusion models have recently emerged as a powerful tool for planning. However, unlike Monte Carlo Tree Search (MCTS), whose performance naturally improves with additional test-time computation (TTC), standard diffusion-based planners offer only limited avenues for TTC scalability. In this paper, we introduce Monte Carlo Tree Diffusion (MCTD), a novel framework that integrates the generative strength of diffusion models with the adaptive search capabilities of MCTS. Our method reconceptualizes denoising as a tree-structured process, allowing partially denoised plans to be iteratively evaluated, pruned, and refined. By selectively expanding promising trajectories while retaining the flexibility to revisit and improve suboptimal branches, MCTD achieves the benefits of MCTS, such as controlling exploration-exploitation trade-offs, within the diffusion framework. Empirical results on challenging long-horizon tasks show that MCTD outperforms diffusion baselines, yielding higher-quality solutions as TTC increases.

2025-02-11

arXiv (preprint)

doi.org

arxiv.org

Personalized Negative Reservoir for Incremental Learning in Recommender Systems

Antonios Valkanas

Yuening Wang

Yingxue Zhang

Mark Coates

2025-02-11

TMLR (accepted)

doi.org

openreview.net

ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval

Shubham Gupta

Zichao Li

Tianyi Chen

Cem Subakan

Siva Reddy

Perouz Taslakian

Valentina Zantedeschi

Document retrieval is a core component of question-answering systems, as it enables conditioning answer generation on new and large-scale corpora. While effective, the standard practice of encoding documents into high-dimensional embeddings for similarity search entails large memory and compute footprints, and also makes it hard to inspect the inner workings of the system. In this paper, we propose a tree-based method for organizing and representing reference documents at various granular levels, which offers the flexibility to balance cost and utility, and eases the inspection of the corpus content and retrieval operations. Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches, hence directly optimizing for retrieval performance. Our evaluations show that ReTreever generally preserves full representation accuracy. Its hierarchical structure further provides strong coarse representations and enhances transparency by indirectly learning meaningful semantic groupings. Among hierarchical retrieval methods, ReTreever achieves the best retrieval accuracy at the lowest latency, proving that this family of techniques can be viable in practical applications.

2025-02-11

arXiv (preprint)

arxiv.org

Amortized In-Context Bayesian Posterior Estimation

Sarthak Mittal

N. L. Bracher

Guillaume Lajoie

Priyank Jaini

Marcus Brubaker

Bayesian inference provides a natural way of incorporating prior beliefs and assigning a probability measure to the space of hypotheses. Current solutions rely on iterative routines like Markov Chain Monte Carlo (MCMC) sampling and Variational Inference (VI), which need to be re-run whenever new observations are available. Amortization, through conditional estimation, is a viable strategy to alleviate such difficulties and has been the guiding principle behind simulation-based inference, neural processes and in-context methods using pre-trained models. In this work, we conduct a thorough comparative analysis of amortized in-context Bayesian posterior estimation methods from the lens of different optimization objectives and architectural choices. Such methods train an amortized estimator to perform posterior parameter inference by conditioning on a set of data examples passed as context to a sequence model such as a transformer. In contrast to language models, we leverage permutation invariant architectures as the true posterior is invariant to the ordering of context examples. Our empirical study includes generalization to out-of-distribution tasks, cases where the assumed underlying model is misspecified, and transfer from simulated to real problems. Subsequently, it highlights the superiority of the reverse KL estimator for predictive problems, especially when combined with the transformer architecture and normalizing flows.

2025-02-10

arXiv (preprint)

arxiv.org

FairDropout: Using Example-Tied Dropout to Enhance Generalization of Minority Groups

Géraldin Nanfack

Eugene Belilovsky

Deep learning models frequently exploit spurious features in training data to achieve low training error, often resulting in poor generalization when faced with shifted testing distributions. To address this issue, various methods from imbalanced learning, representation learning, and classifier recalibration have been proposed to enhance the robustness of deep neural networks against spurious correlations. In this paper, we observe that models trained with empirical risk minimization tend to generalize well for examples from the majority groups while memorizing instances from minority groups. Building on recent findings that show memorization can be localized to a limited number of neurons, we apply example-tied dropout as a method we term FairDropout, aimed at redirecting this memorization to specific neurons that we subsequently drop out during inference. We empirically evaluate FairDropout using the subpopulation benchmark suite encompassing vision, language, and healthcare tasks, demonstrating that it significantly reduces reliance on spurious correlations, and outperforms state-of-the-art methods.

2025-02-10

arXiv (preprint)

arxiv.org

Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study

Eric Aubinais

Philippe Formont

Pablo Piantanida

Elisabeth Gassiat

Quantizing machine learning models has demonstrated its effectiveness in lowering memory and inference costs while maintaining performance levels comparable to the original models. In this work, we investigate the impact of quantization procedures on the privacy of data-driven models, specifically focusing on their vulnerability to membership inference attacks. We derive an asymptotic theoretical analysis of Membership Inference Security (MIS), characterizing the privacy implications of quantized algorithm weights against the most powerful (and possibly unknown) attacks. Building on these theoretical insights, we propose a novel methodology to empirically assess and rank the privacy levels of various quantization procedures. Using synthetic datasets, we demonstrate the effectiveness of our approach in assessing the MIS of different quantizers. Furthermore, we explore the trade-off between privacy and performance using real-world data and models in the context of molecular modeling.

2025-02-10

arXiv (preprint)

doi.org

arxiv.org

Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

Siddarth Venkatraman

Mohsin Hasan

Minsu Kim

Luca Scimeca

Marcin Sendera

Yoshua Bengio

Glen Berseth

Nikolay Malkin

Any well-behaved generative model over a variable …

2025-02-10

arXiv (preprint)

arxiv.org
