Publications

Learning active tactile perception through belief-space control

Johanna Hansen

Francois Hogan

Robots operating in an open world can encounter novel objects with unknown physical properties, such as mass, friction, or size. It is desirable to be able to sense those properties through contact-rich interaction before performing downstream tasks with the objects. We propose a method for autonomously learning active tactile perception policies by learning a generative world model that leverages a differentiable Bayesian filtering algorithm and by designing an information-gathering model predictive controller. We test the method on three simulated tasks: mass estimation, height estimation, and toppling height estimation. Our method is able to discover policies which gather information about the desired property in an intuitive manner.

2024-12-31

ICRA (published)

doi.org

openreview.net

A Learning-Based Framework for Fair and Scalable Solution Generation in Kidney Exchange Problems

William St-Arnaud

Margarida Carvalho

Golnoosh Farnadi

2024-12-31

Trans. Mach. Learn. Res. (published)

openreview.net

Learning-to-Optimize for Consolidation and Transshipment in Multi-store Order Delivery

Xin Wang

Okan Arslan

Jean-François Cordeau

Erick Delage

2024-12-31

SSRN Electronic Journal (accepted)

doi.org

Leveraging Vision-Language Foundation Models to Reveal Hidden Image-Attribute Relationships in Medical Imaging

Amar Kumar

Anita Kriz

Barak Pertzov

Tal Arbel

2024-12-31

CVPR Workshops (published)

doi.org

arxiv.org

Linking Facial Recognition of Emotions and Socially Shared Regulation in Medical Simulation

Xiaoshan Huang

Tianlong Zhong

Haolun Wu

Yeyu Wang

Ethan Churchill

Xue Liu

David Williamson Shaffer

Computer-supported simulation enables a practical alternative for medical training purposes. This study investigates the co-occurrence of facial-recognition-derived emotions and socially shared regulation of learning (SSRL) interactions in a medical simulation training context. Using transmodal analysis (TMA), we compare novice and expert learners’ affective and cognitive engagement patterns during collaborative virtual diagnosis tasks. Results reveal that expert learners exhibit strong associations between socio-cognitive interactions and high-arousal emotions (surprise, anger), suggesting focused, effortful engagement. In contrast, novice learners demonstrate stronger links between socio-cognitive processes and happiness or sadness, with less coherent SSRL patterns, potentially indicating distraction or cognitive overload. Transmodal analysis of multimodal data (facial expressions and discourse) highlights distinct regulatory strategies between groups, offering methodological and practical insights for computer-supported cooperative work (CSCW) in medical education. Our findings underscore the role of emotion-regulation dynamics in collaborative expertise development and suggest the need for tailored scaffolding to support novice learners’ socio-cognitive and affective engagement.

2024-12-31

arXiv (preprint)

doi.org

arxiv.org

Low-Dimensional solutions for optimal control of network-coupled subsystems over a directed network

Mohamed-Amine Azzouz

Shuang Gao

Aditya Mahajan

In this paper, we investigate optimal control of network-coupled subsystems, where the coupling between the dynamics of the subsystems is represented by the adjacency or Laplacian matrix of a directed graph. Under the assumption that the coupling matrix is normal and the cost coupling is compatible with the dynamics coupling, we use the spectral decomposition of the coupling matrix to decompose the overall system into at most n systems with noise coupled dynamics and decoupled cost, where n is the size of the network. Furthermore, the optimal control input at each subsystem can be computed by solving n1 decoupled Riccati equations where n1 (n1 ≤ n) denotes the number of distinct eigenvalues of the coupling matrix, where complex conjugate pairs are not double-counted. A salient feature of the result is that the solution complexity depends on the number of distinct eigenvalues of the coupling matrix rather than the size of the network. Therefore, the proposed solution framework provides a scalable method for synthesizing and implementing optimal control laws for large-scale network-coupled subsystems.

2024-12-31

CDC (published)

doi.org

Machine-learning-assisted Preoperative Prediction of Pediatric Appendicitis Severity

Aylin Erman

Julia Ferreira

Waseem Abu Ashour

Elena Guadagno

Etienne St-Louis

Sherif Emil

Jackie Cheung

Dan Poenaru

2024-12-31

Journal of Pediatric Surgery (published)

doi.org

MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL

Claas Voelcker

Marcel Hussing

Eric R. Eaton

Amir-massoud Farahmand

Igor Gilitschenski

Building deep reinforcement learning (RL) agents that find a good policy with few samples has proven notoriously challenging. To achieve sample efficiency, recent work has explored updating neural networks with large numbers of gradient steps for every new sample. While such high update-to-data (UTD) ratios have shown strong empirical performance, they also introduce instability to the training process. Previous approaches need to rely on periodic neural network parameter resets to address this instability, but restarting the training process is infeasible in many real-world applications and requires tuning the resetting interval. In this paper, we focus on one of the core difficulties of stable training with limited samples: the inability of learned value functions to generalize to unobserved on-policy actions. We mitigate this issue directly by augmenting the off-policy RL training process with a small amount of data generated from a learned world model. Our method, Model-Augmented Data for Temporal Difference learning (MAD-TD), uses small amounts of generated data to stabilize high UTD training and achieve competitive performance on the most challenging tasks in the DeepMind control suite. Our experiments further highlight the importance of employing a good model to generate data, MAD-TD's ability to combat value overestimation, and its practical stability gains for continued learning.

2024-12-31

ICLR (published)

doi.org

arxiv.org

Maximizing Data and Hardware Reuse for HLS with Early-Stage Symbolic Partitioning

Tzung-Han Juang

Christophe Dubach

While traditional High-Level Synthesis (HLS) converts “high-level” C-like programs into hardware automatically, producing high-performance designs still requires hardware expertise. Optimizations such as data partitioning can have a large impact on performance since they directly affect data reuse patterns and the ability to reuse hardware. However, optimizing partitioning is a difficult process since minor changes in the parameter choices can lead to totally unpredictable performance. Functional array-based languages have been proposed instead of C-based approaches, as they offer stronger performance guarantees. This article proposes to follow a similar approach and exposes a divide-and-conquer primitive at the algorithmic level to let users partition any arbitrary computation. The compiler is then free to explore different partition shapes to maximize both data and hardware reuse automatically. The main challenge remains that the impact of partitioning is only known much later in the compilation flow. This is due to the hard-to-predict effects of the many optimizations applied during compilation. To solve this problem, the partitioning is expressed using a set of symbolic tunable parameters, introduced early in the compilation pipeline. A symbolic performance model is then used in the last compilation stage to predict performance based on the possible values of the tunable parameters. Using this approach, a design space exploration is conducted on an Intel Arria 10 field-programmable gate array (FPGA), and competitive performance is achieved on the classical VGG and TinyYolo neural networks.

2024-12-31

ACM Trans. Archit. Code Optim. (published)

doi.org

Meta-learning how to Share Credit among Macro-Actions

Ionel-Alexandru Hosu

Traian Rebedea

Razvan Pascanu

One proposed mechanism to improve exploration in reinforcement learning is through the use of macro-actions. Paradoxically though, in many scenarios the naive addition of macro-actions does not lead to better exploration, but rather the opposite. It has been argued that this was caused by adding non-useful macros and multiple works have focused on mechanisms to discover effectively environment-specific useful macros. In this work, we take a slightly different perspective. We argue that the difficulty stems from the trade-offs between reducing the average number of decisions per episode versus increasing the size of the action space. Namely, one typically treats each potential macro-action as independent and atomic, hence strictly increasing the search space and making typical exploration strategies inefficient. To address this problem we propose a novel regularization term that exploits the relationship between actions and macro-actions to improve the credit assignment mechanism by reducing the effective dimension of the action space and, therefore, improving exploration. The term relies on a similarity matrix that is meta-learned jointly with learning the desired policy. We empirically validate our strategy looking at macro-actions in Atari games, and the StreetFighter II environment. Our results show significant improvements over the Rainbow-DQN baseline in all environments. Additionally, we show that the macro-action similarity is transferable to related environments. We believe this work is a small but important step towards understanding how the similarity-imposed geometry on the action space can be exploited to improve credit assignment and exploration, therefore making learning more effective.

2024-12-31

arXiv (preprint)

doi.org

openreview.net

Min-Max Optimisation for Nonconvex-Nonconcave Functions Using a Random Zeroth-Order Extragradient Algorithm

Amir Ali Farzin

Yuen-Man Pun

Philipp Braun

Antoine Lesage-Landry

Youssef Diouane

Iman Shames

2024-12-31

Trans. Mach. Learn. Res. (published)

doi.org

arxiv.org

Mixed-Integer Second-Order Cone Programming for Multi-period Scheduling of Flexible AC Transmission System Devices

Mohamad Charara

Martin De Montigny

Nivine Abou Daher

Hanane Dagdougui

Antoine Lesage-Landry

With the increasing energy demand and the growing integration of renewable sources of energy, power systems face operational challenges such as overloads, losses, and stability concerns, particularly as networks operate near their capacity limits. Flexible alternating current transmission system (FACTS) devices are essential to ensure reliable grid operations and enable the efficient integration of renewable energy. This work introduces a mixed-integer second-order cone programming (MISOCP) model for the multi-period scheduling of key FACTS devices in electric transmission systems. The proposed model integrates four key control mechanisms: (i) on-load tap changers (OLTCs) for voltage regulation via discrete taps; (ii) static synchronous compensators (STATCOMs) and (iii) shunt reactors for reactive power compensation; and (iv) thyristor-controlled series capacitors (TCSCs) for adjustable impedance and flow control. The objective is to minimize active power losses using a limited number of control actions while meeting physical and operational constraints at all times throughout the defined time horizon. To ensure tractability, the model employs a second-order cone relaxation of the power flow. Device-specific constraints are handled via binary expansion and linearization: OLTCs and shunt reactors are modelled with discrete variables, STATCOMs through reactive power bounds, and TCSCs using a reformulation-linearization technique (RLT). A multi-period formulation captures the sequential nature of decision making, ensuring consistency across time steps. The model is evaluated on the IEEE 9-bus, 30-bus, and RTS96 test systems, demonstrating its ability to reduce losses, with potential applicability to larger-scale grids.

2024-12-31

arXiv (preprint)

doi.org

arxiv.org
