Publications

Maximum entropy GFlowNets with soft Q-learning
Generative Flow Networks (GFNs) have emerged as a powerful tool for sampling discrete objects from unnormalized distributions, offering a sc… (see more)alable alternative to Markov Chain Monte Carlo (MCMC) methods. While GFNs draw inspiration from maximum entropy reinforcement learning (RL), the connection between the two has largely been unclear and seemingly applicable only in specific cases. This paper addresses the connection by constructing an appropriate reward function, thereby establishing an exact relationship between GFNs and maximum entropy RL. This construction allows us to introduce maximum entropy GFNs, which, in contrast to GFNs with uniform backward policy, achieve the maximum entropy attainable by GFNs without constraints on the state space.
Maximum flow-based formulation for the optimal location of electric vehicle charging stations
Pierre‐Luc Parent
Miguel F. Anjos
Ribal Atallah
With the increasing effects of climate change, the urgency to step away from fossil fuels is greater than ever before. Electric vehicles (EV… (see more)s) are one way to diminish these effects, but their widespread adoption is often limited by the insufficient availability of charging stations. In this work, our goal is to expand the infrastructure of EV charging stations, in order to provide a better quality of service in terms of user satisfaction (and availability of charging stations). Specifically, our focus is directed towards urban areas. We first propose a model for the assignment of EV charging demand to stations, framing it as a maximum flow problem. This model is the basis for the evaluation of user satisfaction with a given charging infrastructure. Secondly, we incorporate the maximum flow model into a mixed‐integer linear program, where decisions on the opening of new stations and on the expansion of their capacity through additional outlets is accounted for. We showcase our methodology for the city of Montreal, demonstrating the scalability of our approach to handle real‐world scenarios. We conclude that considering both spacial and temporal variations in charging demand is meaningful when solving realistic instances.
McGill NLP Group Submission to the MRL 2024 Shared Task: Ensembling Enhances Effectiveness of Multilingual Small LMs
We present our systems for the three tasks and five languages included in the MRL 2024 Shared Task on Multilingual Multi-task Information Re… (see more)trieval: (1) Named Entity Recognition, (2) Free-form Question Answering, and (3) Multiple-choice Question Answering. For each task, we explored the impact of selecting different multilingual language models for fine-tuning across various target languages, and implemented an ensemble system that generates final outputs based on predictions from multiple fine-tuned models. All models are large language models fine-tuned on task-specific data. Our experimental results show that a more balanced dataset would yield better results. However, when training data for certain languages are scarce, fine-tuning on a large amount of English data supplemented by a small amount of “triggering data” in the target language can produce decent results.
Metric Flow Matching for Smooth Interpolations on the Data Manifold
Kacper Kapuśniak
Peter Potaptchik
Teodora Reu
Leo Zhang
Michael Bronstein
Avishek Joey Bose
Francesco Di Giovanni
Matching objectives underpin the success of modern generative models and rely on constructing conditional paths that transform a source dist… (see more)ribution into a target distribution. Despite being a fundamental building block, conditional paths have been designed principally under the assumption of Euclidean geometry, resulting in straight interpolations. However, this can be particularly restrictive for tasks such as trajectory inference, where straight paths might lie outside the data manifold, thus failing to capture the underlying dynamics giving rise to the observed marginals. In this paper, we propose Metric Flow Matching (MFM), a novel simulation-free framework for conditional flow matching where interpolants are approximate geodesics learned by minimizing the kinetic energy of a data-induced Riemannian metric. This way, the generative model matches vector fields on the data manifold, which corresponds to lower uncertainty and more meaningful interpolations. We prescribe general metrics to instantiate MFM, independent of the task, and test it on a suite of challenging problems including LiDAR navigation, unpaired image translation, and modeling cellular dynamics. We observe that MFM outperforms the Euclidean baselines, particularly achieving SOTA on single-cell trajectory prediction.
Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play
Recent advances in Competitive Self-Play (CSP) have achieved, or even surpassed, human level performance in complex game environments such a… (see more)s Dota 2 and StarCraft II using Distributed Multi-Agent Reinforcement Learning (MARL). One core component of these methods relies on creating a pool of learning agents -- consisting of the Main Agent, past versions of this agent, and Exploiter Agents -- where Exploiter Agents learn counter-strategies to the Main Agents. A key drawback of these approaches is the large computational cost and physical time that is required to train the system, making them impractical to deploy in highly iterative real-life settings such as video game productions. In this paper, we propose the Minimax Exploiter, a game theoretic approach to exploiting Main Agents that leverages knowledge of its opponents, leading to significant increases in data efficiency. We validate our approach in a diversity of settings, including simple turn based games, the arcade learning environment, and For Honor, a modern video game. The Minimax Exploiter consistently outperforms strong baselines, demonstrating improved stability and data efficiency, leading to a robust CSP-MARL method that is both flexible and easy to deploy.
Mirror Descent Algorithms with Nearly Dimension-Independent Rates for Differentially-Private Stochastic Saddle-Point Problems
Tom'as Gonz'alez
Crist'obal Guzm'an
Mitigating Translationese in Low-resource Languages: The Storyboard Approach
Garry Kuwanto
Eno-Abasi Urua
Priscilla A. Amuok
Shamsuddeen Hassan Muhammad
Aremu Anuoluwapo
Verrah Akinyi Otiende
Loice Emma Nanyanga
T. Nyoike
A. D. Akpan
Nsima Ab Udouboh
Idongesit Udeme Archibong
Idara Effiong Moses
Ifeoluwatayo A. Ige
Benjamin A. Ajibade
Olumide Benjamin Awokoya
Idris Abdulmumin
Saminu Mohammad Aliyu
Ruqayya Nasir Iro
Ibrahim Ahmad
Deontae Smith … (see 4 more)
Praise-EL Michaels
Derry Tanti Wijaya
Anietie U Andy
Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which… (see more) can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach for data collection by leveraging storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method demonstrates worse accuracy but better fluency in the language focused.
Mixture of Experts in a Mixture of RL settings
Model-based graph reinforcement learning for inductive traffic signal control
François-Xavier Devailly
Denis Larocque
Most reinforcement learning methods for adaptive-traffic-signal-control require training from scratch to be applied on any new intersection … (see more)or after any modification to the road network, traffic distribution, or behavioral constraints experienced during training. Considering 1) the massive amount of experience required to train such methods, and 2) that experience must be gathered by interacting in an exploratory fashion with real road-network-users, such a lack of transferability limits experimentation and applicability. Recent approaches enable learning policies that generalize for unseen road-network topologies and traffic distributions, partially tackling this challenge. However, the literature remains divided between the learning of cyclic (the evolution of connectivity at an intersection must respect a cycle) and acyclic (less constrained) policies, and these transferable methods 1) are only compatible with cyclic constraints and 2) do not enable coordination. We introduce a new model-based method, MuJAM, which, on top of enabling explicit coordination at scale for the first time, pushes generalization further by allowing a generalization to the controllers' constraints. In a zero-shot transfer setting involving both road networks and traffic settings never experienced during training, and in a larger transfer experiment involving the control of 3,971 traffic signal controllers in Manhattan, we show that MuJAM, using both cyclic and acyclic constraints, outperforms domain-specific baselines as well as another transferable approach.
A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem
Training multiple agents to coordinate is an essential problem with applications in robotics, game theory, economics, and social sciences. H… (see more)owever, most existing Multi-Agent Reinforcement Learning (MARL) methods are online and thus impractical for real-world applications in which collecting new interactions is costly or dangerous. While these algorithms should leverage offline data when available, doing so gives rise to what we call the offline coordination problem. Specifically, we identify and formalize the strategy agreement (SA) and the strategy fine-tuning (SFT) coordination challenges, two issues at which current offline MARL algorithms fail. Concretely, we reveal that the prevalent model-free methods are severely deficient and cannot handle coordination-intensive offline multi-agent tasks in either toy or MuJoCo domains. To address this setback, we emphasize the importance of inter-agent interactions and propose the very first model-based offline MARL method. Our resulting algorithm, Model-based Offline Multi-Agent Proximal Policy Optimization (MOMA-PPO) generates synthetic interaction data and enables agents to converge on a strategy while fine-tuning their policies accordingly. This simple model-based solution solves the coordination-intensive offline tasks, significantly outperforming the prevalent model-free methods even under severe partial observability and with learned world models.
Motivating Users to Attend to Privacy: A Theory-Driven Design Study
Varun Shiri
Maggie Xiong
Jinghui Cheng
Jin L.C. Guo
In modern technology environments, raising users’ privacy awareness is crucial. Existing efforts largely focused on privacy policy present… (see more)ation and failed to systematically address a radical challenge of user motivation for initiating privacy awareness. Leveraging the Protection Motivation Theory (PMT), we proposed design ideas and categories dedicated to motivating users to engage with privacy-related information. Using these design ideas, we created a conceptual prototype, enhancing the current App Store product page. Results from an online experiment and follow-up interviews showed that our design effectively motivated participants to attend to privacy issues, raising both the threat appraisal and coping appraisal, two main factors in PMT. Our work indicated that effective design should consider combining PMT components, calibrating information content, and integrating other design elements, such as visual cues and user familiarity. Overall, our study contributes valuable design considerations driven by the PMT to amplify the motivational aspect of privacy communication.
Multidomain Object Detection Framework Using Feature Domain Knowledge Distillation.
Da-Wei Jaw
Shih-Chia Huang
Zhihui Lu
Benjamin C. M. Fung
Sy-Yen Kuo
Object detection techniques have been widely studied, utilized in various works, and have exhibited robust performance on images with suffic… (see more)ient luminance. However, these approaches typically struggle to extract valuable features from low-luminance images, which often exhibit blurriness and dim appearence, leading to detection failures. To overcome this issue, we introduce an innovative unsupervised feature domain knowledge distillation (KD) framework. The proposed framework enhances the generalization capability of neural networks across both low-and high-luminance domains without incurring additional computational costs during testing. This improvement is made possible through the integration of generative adversarial networks and our proposed unsupervised KD process. Furthermore, we introduce a region-based multiscale discriminator designed to discern feature domain discrepancies at the object level rather than from the global context. This bolsters the joint learning process of object detection and feature domain distillation tasks. Both qualitative and quantitative assessments shown that the proposed method, empowered by the region-based multiscale discriminator and the unsupervised feature domain distillation process, can effectively extract beneficial features from low-luminance images, outperforming other state-of-the-art approaches in both low-and sufficient-luminance domains.