Publications

Investigating the Effect of Providing Required Training to Mothers of Children with Surgery and Its Effect on Mothers' Anxiety

Julia Ferreira

Nadia Safa

Fabio Botelho

Robin Petroze

Hussein Wissanji

Dan Poenaru

Pramod Puligandla

Kenneth Shaw

Maeve Trudeau

Elena Guadagno

Jean Martin Laberge

Sherif Emil

2024-12-31

Journal of Integrative Nursing and Palliative Care (published)

doi.org

Learning active tactile perception through belief-space control

Jean-François Tremblay

Johanna Hansen

David Meger

Francois Hogan

Gregory Dudek

Robots operating in an open world can encounter novel objects with unknown physical properties, such as mass, friction, or size. It is desirable to be able to sense those properties through contact-rich interaction, before performing downstream tasks with the objects. We propose a method for autonomously learning active tactile perception policies, by learning a generative world model leveraging a differentiable Bayesian filtering algorithm, and designing an information-gathering model predictive controller. We test the method on three simulated tasks: mass estimation, height estimation and toppling height estimation. Our method is able to discover policies which gather information about the desired property in an intuitive manner.

2024-12-31

ICRA (published)

doi.org

openreview.net

A Learning-Based Framework for Fair and Scalable Solution Generation in Kidney Exchange Problems

William St-Arnaud

Margarida Carvalho

Golnoosh Farnadi

2024-12-31

Trans. Mach. Learn. Res. (published)

openreview.net

Learning-to-Optimize for Consolidation and Transshipment in Multi-store Order Delivery

Xin Wang

Okan Arslan

Jean-François Cordeau

Erick Delage

2024-12-31

SSRN Electronic Journal (accepted)

doi.org

Leveraging Vision-Language Foundation Models to Reveal Hidden Image-Attribute Relationships in Medical Imaging

Amar Kumar

Anita Kriz

Barak Pertzov

Tal Arbel

2024-12-31

CVPR Workshops (published)

doi.org

arxiv.org

Linking Facial Recognition of Emotions and Socially Shared Regulation in Medical Simulation

Xiaoshan Huang

Tianlong Zhong

Haolun Wu

Yeyu Wang

Ethan Churchill

Xue Liu

David Williamson Shaffer

Computer-supported simulation enables a practical alternative for medical training purposes. This study investigates the co-occurrence of facial-recognition-derived emotions and socially shared regulation of learning (SSRL) interactions in a medical simulation training context. Using transmodal analysis (TMA), we compare novice and expert learners’ affective and cognitive engagement patterns during collaborative virtual diagnosis tasks. Results reveal that expert learners exhibit strong associations between socio-cognitive interactions and high-arousal emotions (surprise, anger), suggesting focused, effortful engagement. In contrast, novice learners demonstrate stronger links between socio-cognitive processes and happiness or sadness, with less coherent SSRL patterns, potentially indicating distraction or cognitive overload. Transmodal analysis of multimodal data (facial expressions and discourse) highlights distinct regulatory strategies between groups, offering methodological and practical insights for computer-supported cooperative work (CSCW) in medical education. Our findings underscore the role of emotion-regulation dynamics in collaborative expertise development and suggest the need for tailored scaffolding to support novice learners’ socio-cognitive and affective engagement.

2024-12-31

arXiv (preprint)

doi.org

arxiv.org

Low-Dimensional solutions for optimal control of network-coupled subsystems over a directed network

Mohamed-Amine Azzouz

Shuang Gao

Aditya Mahajan

In this paper, we investigate optimal control of network-coupled subsystems, where the coupling between the dynamics of the subsystems is represented by the adjacency or Laplacian matrix of a directed graph. Under the assumption that the coupling matrix is normal and the cost coupling is compatible with the dynamics coupling, we use the spectral decomposition of the coupling matrix to decompose the overall system into at most n systems with noise coupled dynamics and decoupled cost, where n is the size of the network. Furthermore, the optimal control input at each subsystem can be computed by solving n1 decoupled Riccati equations where n1 (n1 ≤ n) denotes the number of distinct eigenvalues of the coupling matrix, where complex conjugate pairs are not double-counted. A salient feature of the result is that the solution complexity depends on the number of distinct eigenvalues of the coupling matrix rather than the size of the network. Therefore, the proposed solution framework provides a scalable method for synthesizing and implementing optimal control laws for large-scale network-coupled subsystems.

2024-12-31

CDC (published)

doi.org

Machine-learning-assisted Preoperative Prediction of Pediatric Appendicitis Severity

Aylin Erman

Julia Ferreira

Waseem Abu Ashour

Elena Guadagno

Etienne St-Louis

Sherif Emil

Jackie Cheung

Dan Poenaru

2024-12-31

Journal of Pediatric Surgery (published)

doi.org

MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL

Claas Voelcker

Marcel Hussing

Eric R. Eaton

Amir-massoud Farahmand

Igor Gilitschenski

Building deep reinforcement learning (RL) agents that find a good policy with few samples has proven notoriously challenging. To achieve sample efficiency, recent work has explored updating neural networks with large numbers of gradient steps for every new sample. While such high update-to-data (UTD) ratios have shown strong empirical performance, they also introduce instability to the training process. Previous approaches need to rely on periodic neural network parameter resets to address this instability, but restarting the training process is infeasible in many real-world applications and requires tuning the resetting interval. In this paper, we focus on one of the core difficulties of stable training with limited samples: the inability of learned value functions to generalize to unobserved on-policy actions. We mitigate this issue directly by augmenting the off-policy RL training process with a small amount of data generated from a learned world model. Our method, Model-Augmented Data for Temporal Difference learning (MAD-TD), uses small amounts of generated data to stabilize high UTD training and achieve competitive performance on the most challenging tasks in the DeepMind control suite. Our experiments further highlight the importance of employing a good model to generate data, MAD-TD's ability to combat value overestimation, and its practical stability gains for continued learning.

2024-12-31

ICLR (published)

doi.org

arxiv.org

Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain

Léo Boisvert

Abhay Puri

Chandra Kiran Reddy Evuru

Nicolas Chapados

Quentin Cappart

Alexandre Lacoste

Krishnamurthy (DJ) Dvijotham

Alexandre Drouin

The practice of fine-tuning AI agents on data from their own interactions, such as web browsing or tool use, while being a strong general recipe for improving agentic capabilities, also introduces a critical security vulnerability within the AI supply chain. In this work, we show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors that are triggered by specific target phrases, such that when the agent encounters these triggers, it performs an unsafe or malicious action. We formalize and validate three realistic threat models targeting different layers of the supply chain: 1) direct poisoning of fine-tuning data, where an attacker controls a fraction of the training traces; 2) environmental poisoning, where malicious instructions are injected into webpages scraped or tools called while creating training data; and 3) supply chain poisoning, where a pre-backdoored base model is fine-tuned on clean data to improve its agentic capabilities. Our results are stark: by poisoning as few as 2% of the collected traces, an attacker can embed a backdoor causing an agent to leak confidential user information with over 80% success when a specific trigger is present. This vulnerability holds across all three threat models. Furthermore, we demonstrate that prominent safeguards, including two guardrail models and one weight-based defense, fail to detect or prevent the malicious behavior. These findings highlight an urgent threat to agentic AI development and underscore the critical need for rigorous security vetting of data collection processes and end-to-end model supply chains.

2024-12-31

arXiv (preprint)

doi.org

arxiv.org

Maximizing Data and Hardware Reuse for HLS with Early-Stage Symbolic Partitioning

Tzung-Han Juang

Christophe Dubach

While traditional High-Level Synthesis (HLS) converts “high-level” C-like programs into hardware automatically, producing high-performance designs still requires hardware expertise. Optimizations such as data partitioning can have a large impact on performance since they directly affect data reuse patterns and the ability to reuse hardware. However, optimizing partitioning is a difficult process since minor changes in the parameter choices can lead to totally unpredictable performance. Functional array-based languages have been proposed instead of C-based approaches, as they offer stronger performance guarantees. This article proposes to follow a similar approach and exposes a divide-and-conquer primitive at the algorithmic level to let users partition any arbitrary computation. The compiler is then free to explore different partition shapes to maximize both data and hardware reuse automatically. The main challenge remains that the impact of partitioning is only known much later in the compilation flow. This is due to the hard-to-predict effects of the many optimizations applied during compilation. To solve this problem, the partitioning is expressed using a set of symbolic tunable parameters, introduced early in the compilation pipeline. A symbolic performance model is then used in the last compilation stage to predict performance based on the possible values of the tunable parameters. Using this approach, a design space exploration is conducted on an Intel Arria 10 Field Programmable Gate Array (FPGA), and competitive performance is achieved on the classical VGG and TinyYolo neural networks.

2024-12-31

ACM Trans. Archit. Code Optim. (published)

doi.org

Meta-learning how to Share Credit among Macro-Actions

Ionel-Alexandru Hosu

Traian Rebedea

Razvan Pascanu

One proposed mechanism to improve exploration in reinforcement learning is through the use of macro-actions. Paradoxically though, in many scenarios the naive addition of macro-actions does not lead to better exploration, but rather the opposite. It has been argued that this was caused by adding non-useful macros, and multiple works have focused on mechanisms to effectively discover useful environment-specific macros. In this work, we take a slightly different perspective. We argue that the difficulty stems from the trade-offs between reducing the average number of decisions per episode versus increasing the size of the action space. Namely, one typically treats each potential macro-action as independent and atomic, hence strictly increasing the search space and making typical exploration strategies inefficient. To address this problem we propose a novel regularization term that exploits the relationship between actions and macro-actions to improve the credit assignment mechanism by reducing the effective dimension of the action space and, therefore, improving exploration. The term relies on a similarity matrix that is meta-learned jointly with learning the desired policy. We empirically validate our strategy looking at macro-actions in Atari games and the StreetFighter II environment. Our results show significant improvements over the Rainbow-DQN baseline in all environments. Additionally, we show that the macro-action similarity is transferable to related environments. We believe this work is a small but important step towards understanding how the similarity-imposed geometry on the action space can be exploited to improve credit assignment and exploration, therefore making learning more effective.

2024-12-31

arXiv (preprint)

doi.org

openreview.net
