Publications

Accelerated Benders Decomposition and Local Branching for Dynamic Maximum Covering Location Problems
Steven Lamontagne
Ribal Atallah
The maximum covering location problem (MCLP) is a key problem in facility location, with many applications and variants. One such variant is… (see more) the dynamic (or multi-period) MCLP, which considers the installation of facilities across multiple time periods. To the best of our knowledge, no exact solution method has been proposed to tackle large-scale instances of this problem. To that end, in this work, we expand upon the current state-of-the-art branch-and-Benders-cut solution method in the static case, by exploring several acceleration techniques. Additionally, we propose a specialised local branching scheme, that uses a novel distance metric in its definition of subproblems and features a new method for efficient and exact solving of the subproblems. These methods are then compared through extensive computational experiments, highlighting the strengths of the proposed methodologies.
A logistics provider’s profit maximization facility location problem with random utility maximizing followers
David Pinzon Ulloa
Bernard Gendron
Many-Shot In-Context Learning
Rishabh Agarwal
Avi Singh
Lei M Zhang
Bernd Bohnet
Luis Rosias
Ankesh Anand
Stephanie C.Y. Chan
Zaheer Abbas
Biao Zhang
Azade Nova
Aleksandra Faust
John D. Co-Reyes
Eric Chu
Feryal M. P. Behbahani
Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, w… (see more)ithout any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative and discriminative tasks. While promising, many-shot ICL can be bottlenecked by the available amount of human-generated examples. To mitigate this limitation, we explore two new settings: Reinforced and Unsupervised ICL. Reinforced ICL uses model-generated chain-of-thought rationales in place of human examples. Unsupervised ICL removes rationales from the prompt altogether, and prompts the model only with domain-specific questions. We find that both Reinforced and Unsupervised ICL can be quite effective in the many-shot regime, particularly on complex reasoning tasks. Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases and can learn high-dimensional functions with numerical inputs. Our analysis also reveals the limitations of next-token prediction loss as an indicator of downstream ICL performance.
Using neural biomarkers to personalize dosing of vagus nerve stimulation
Antonin Berthon
Lorenz Wernisch
Myrta Stoukidi
Michael Thornton
Olivier Tessier-Lariviere
Pascal Fortier-Poisson
Jorin Mamen
Max Pinkney
Susannah Lee
Elvijs Sarkans
Luca Annecchino
Ben Appleton
Philip Garsed
Bret Patterson
Samuel Gonshaw
Matjaž Jakopec
Sudhakaran Shunmugam
Tristan Edwards
Aleksi Tukiainen
Joel Jennings … (see 3 more)
Emil Hewage
Oliver Armitage
GIST: Generated Inputs Sets Transferability in Deep Learning
Florian Tambon
Giuliano Antoniol
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Pranshu Malviya
Goncalo Mordido
Aristide Baratin
Reza Babanezhad Harikandeh
Jerry Huang
Razvan Pascanu
Adaptive gradient-based optimizers, particularly Adam, have left their mark in training large-scale deep learning models. The strength of su… (see more)ch optimizers is that they exhibit fast convergence while being more robust to hyperparameter choice. However, they often generalize worse than non-adaptive methods. Recent studies have tied this performance gap to flat minima selection: adaptive methods tend to find solutions in sharper basins of the loss landscape, which in turn hurts generalization. To overcome this issue, we propose a new memory-augmented version of Adam that promotes exploration towards flatter minima by using a buffer of critical momentum terms during training. Intuitively, the use of the buffer makes the optimizer overshoot outside the basin of attraction if it is not wide enough. We empirically show that our method improves the performance of several variants of Adam on standard supervised language modelling and image classification tasks.
DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations
Nima Fathi
Amar Kumar
Brennan Nichyporuk
Mohammad Havaei
Deep learning classifiers are prone to latching onto dominant confounders present in a dataset rather than on the causal markers associated … (see more)with the target class, leading to poor generalization and biased predictions. Although explainability via counterfactual image generation has been successful at exposing the problem, bias mitigation strategies that permit accurate explainability in the presence of dominant and diverse artifacts remain unsolved. In this work, we propose the DeCoDEx framework and show how an external, pre-trained binary artifact detector can be leveraged during inference to guide a diffusion-based counterfactual image generator towards accurate explainability. Experiments on the CheXpert dataset, using both synthetic artifacts and real visual artifacts (support devices), show that the proposed method successfully synthesizes the counterfactual images that change the causal pathology markers associated with Pleural Effusion while preserving or ignoring the visual artifacts. Augmentation of ERM and Group-DRO classifiers with the DeCoDEx generated images substantially improves the results across underrepresented groups that are out of distribution for each class. The code is made publicly available at https://github.com/NimaFathi/DeCoDEx.
Characterizing and Classifying Developer Forum Posts with their Intentions
Xingfang Wu
Eric Laufer
Heng Li
Santhosh Srinivasan
Jayden Luo
With the rapid growth of the developer community, the amount of posts on online technical forums has been growing rapidly, which poses diffi… (see more)culties for users to filter useful posts and find important information. Tags provide a concise feature dimension for users to locate their interested posts and for search engines to index the most relevant posts according to the queries. However, most tags are only focused on the technical perspective (e.g., program language, platform, tool). In most cases, forum posts in online developer communities reveal the author's intentions to solve a problem, ask for advice, share information, etc. The modeling of the intentions of posts can provide an extra dimension to the current tag taxonomy. By referencing previous studies and learning from industrial perspectives, we create a refined taxonomy for the intentions of technical forum posts. Through manual labeling and analysis on a sampled post dataset extracted from online forums, we understand the relevance between the constitution of posts (code, error messages) and their intentions. Furthermore, inspired by our manual study, we design a pre-trained transformer-based model to automatically predict post intentions. The best variant of our intention prediction framework, which achieves a Micro F1-score of 0.589, Top 1-3 accuracy of 62.6% to 87.8%, and an average AUC of 0.787, outperforms the state-of-the-art baseline approach. Our characterization and automated classification of forum posts regarding their intentions may help forum maintainers or third-party tool developers improve the organization and retrieval of posts on technical forums. We have released our annotated dataset and codes in our supplementary material package.
Implementation of a Global Pediatric Trauma Course in an Upper Middle–Income Country: A Pilot Study
Abbie Naus
Madeleine Carroll
Ayla Gerk
David P. Mooney
Natalie L. Yanchar
Julia Ferreira
Karen E. Gripp
Caroline Ouellet
Fabio Botelho
On the Costs and Benefits of Adopting Lifelong Learning for Software Analytics -- Empirical Study on Brown Build and Risk Prediction
Doriane Olewicki
Sarra Habchi
Mathieu Nayrolles
Mojtaba Faramarzi
Bram Adams
Nowadays, software analytics tools using machine learning (ML) models to, for example, predict the risk of a code change are well establishe… (see more)d. However, as the goals of a project shift over time, and developers and their habits change, the performance of said models tends to degrade (drift) over time. Current retraining practices typically require retraining a new model from scratch on a large updated dataset when performance decay is observed, thus incurring a computational cost; also there is no continuity between the models as the past model is discarded and ignored during the new model training. Even though the literature has taken interest in online learning approaches, those have rarely been integrated and evaluated in industrial environments. This paper evaluates the use of lifelong learning (LL) for industrial use cases at Ubisoft, evaluating both the performance and the required computational effort in comparison to the retraining-from-scratch approaches commonly used by the industry. LL is used to continuously build and maintain ML-based software analytics tools using an incremental learner that progressively updates the old model using new data. To avoid so-called"catastrophic forgetting"of important older data points, we adopt a replay buffer of older data, which still allows us to drastically reduce the size of the overall training dataset, and hence model training time.
Structured Learning in Time-dependent Cox Models
Guanbo Wang
Yi Lian
Robert W. Platt
Rui Wang
Sylvie Perreault
Marc Dorais
Mireille E. Schnitzer
Stimulus information guides the emergence of behavior-related signals in primary somatosensory cortex during learning.
Mariangela Panniello
Colleen J Gillon
Roberto Maffulli
Marco Celotto
Stefano Panzeri
Michael M Kohl