Publications

Strong Consistency and Rate of Convergence of Switched Least Squares System Identification for Autonomous Markov Jump Linear Systems

Borna Sayedana

Mohammad Afshari

Peter E. Caines

Aditya Mahajan

In this paper, we investigate the problem of system identification for autonomous Markov jump linear systems (MJS) with complete state observations. We propose a switched least squares method for identification of MJS, show that this method is strongly consistent, and derive data-dependent and data-independent rates of convergence. In particular, our data-independent rate of convergence shows that, almost surely, the system identification error is

2024-01-01

IEEE Transactions on Automatic Control (published)

doi.org

arxiv.org

Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC

Wu Lin

Felix Dangel

Runa Eschenhagen

Kirill Neklyudov

Agustinus Kristiadi

Richard E. Turner

Alireza Makhzani

Second-order methods such as KFAC can be useful for neural net training. However, they are often memory-inefficient since their preconditioning Kronecker factors are dense, and numerically unstable in low precision as they require matrix inversion or decomposition. These limitations render such methods unpopular for modern mixed-precision training. We address them by (i) formulating an inverse-free KFAC update and (ii) imposing structures in the Kronecker factors, resulting in structured inverse-free natural gradient descent (SINGD). On modern neural networks, we show that SINGD is memory-efficient and numerically robust, in contrast to KFAC, and often outperforms AdamW even in half precision. Our work closes a gap between first- and second-order methods in modern low-precision training.

2024-01-01

ICML (published)

proceedings.mlr.press

arxiv.org

No such thing as one-size-fits-all in AI ethics frameworks: a comparative case study

Vivian Qiang

Jimin Rhim

AJung Moon

2024-01-01

AI Soc. (published)

doi.org

Sufficient conditions for offline reactivation in recurrent neural networks

Nanda H Krishna

Colin Bredenberg

Daniel Levenstein

Blake Richards

Guillaume Lajoie

During periods of quiescence, such as sleep, neural activity in many brain circuits resembles that observed during periods of task engagement. However, the precise conditions under which task-optimized networks can autonomously reactivate the same network states responsible for online behavior are poorly understood. In this study, we develop a mathematical framework that outlines sufficient conditions for the emergence of neural reactivation in circuits that encode features of smoothly varying stimuli. We demonstrate mathematically that noisy recurrent networks optimized to track environmental state variables using change-based sensory information naturally develop denoising dynamics, which, in the absence of input, cause the network to revisit state configurations observed during periods of online activity. We validate our findings using numerical experiments on two canonical neuroscience tasks: spatial position estimation based on self-motion cues, and head direction estimation based on angular velocity cues. Overall, our work provides theoretical support for modeling offline reactivation as an emergent consequence of task optimization in noisy neural circuits.

2024-01-01

International Conference on Learning Representations (published)

doi.org

openreview.net

Sum and Tensor of Quantitative Effects

Giorgio Bacci

Radu Mardare

Prakash Panangaden

Gordon Plotkin

2024-01-01

Log. Methods Comput. Sci. (published)

doi.org

arxiv.org

Survey on Explainable AI: Techniques, challenges and open issues

Adel Abusitta

Miles Q. Li

Benjamin Fung

2024-01-01

Expert Syst. Appl. (published)

doi.org

SynFlowNet: Towards Molecule Design with Guaranteed Synthesis Pathways

M. Cretu

Charles Harris

Julien Roy

Emmanuel Bengio

Pietro Lio

2024-01-01

arXiv.org (preprint)

doi.org

TARIC-SLU: A Tunisian Benchmark Dataset for Spoken Language Understanding

Salima Mdhaffar

Fethi Bougares

Renato De Mori

Salah Zaiem

Mirco Ravanelli

Yannick Estève

In recent years, there has been a significant increase in interest in developing Spoken Language Understanding (SLU) systems. SLU involves extracting a list of semantic information from the speech signal. A major issue for SLU systems is the lack of a sufficient amount of bi-modal (audio and textual semantic annotation) training data. Existing SLU resources are mainly available in high-resource languages such as English, Mandarin and French. However, one of the current challenges concerning low-resourced languages is data collection and annotation. In this work, we present a new freely available corpus, named TARIC-SLU, composed of railway transport conversations in Tunisian dialect that is continuously annotated in dialogue acts and slots. We describe the semantic model of the dataset, the data and experiments conducted to build ASR-based and SLU-based baseline models. To facilitate its use, a complete recipe, including data preparation, training and evaluation scripts, has been built and will be integrated into SpeechBrain, a popular open-source conversational AI toolkit based on PyTorch.

2024-01-01

International Conference on Language Resources and Evaluation (published)

dblp.uni-trier.de

Temporal Graph Analysis with TGX

Razieh Shirzadkhani

Shenyang Huang

Elahe Kooshafar

Reihaneh Rabbany

Farimah Poursafaei

Real-world networks, with their evolving relations, are best captured as temporal graphs. However, existing software libraries are largely designed for static graphs where the dynamic nature of temporal graphs is ignored. Bridging this gap, we introduce TGX, a Python package specially designed for analysis of temporal networks that encompasses an automated pipeline for data loading, data processing, and analysis of evolving graphs. TGX provides access to eleven built-in datasets and eight external Temporal Graph Benchmark (TGB) datasets as well as any novel datasets in the .csv format. Beyond data loading, TGX facilitates data processing functionalities such as discretization of temporal graphs and node subsampling to accelerate working with larger datasets. For comprehensive investigation, TGX offers network analysis by providing a diverse set of measures, including average node degree and the evolving number of nodes and edges per timestamp. Additionally, the package consolidates meaningful visualization plots indicating the evolution of temporal patterns, such as Temporal Edge Appearance (TEA) and Temporal Edge Traffic (TET) plots. The TGX package is a robust tool for examining the features of temporal graphs and can be used in various areas like studying social networks, citation networks, and tracking user interactions. We plan to continuously support and update TGX based on community feedback. TGX is publicly available on: https://github.com/ComplexData-MILA/TGX.

2024-01-01

WSDM (published)

doi.org

arxiv.org

On the consistency of hyper-parameter selection in value-based deep reinforcement learning

Johan Samir Obando Ceron

João Guilherme Madeira Araújo

Aaron Courville

Pablo Samuel Castro

Deep reinforcement learning (deep RL) has achieved tremendous success on various domains through a combination of algorithmic design and careful selection of hyper-parameters. Algorithmic improvements are often the result of iterative enhancements built upon prior approaches, while hyper-parameter choices are typically inherited from previous methods or fine-tuned specifically for the proposed technique. Despite their crucial impact on performance, hyper-parameter choices are frequently overshadowed by algorithmic advancements. This paper conducts an extensive empirical study focusing on the reliability of hyper-parameter selection for value-based deep reinforcement learning agents, including the introduction of a new score to quantify the consistency and reliability of various hyper-parameters. Our findings not only help establish which hyper-parameters are most critical to tune, but also help clarify which tunings remain consistent across different training regimes.

2024-01-01

RLJ (published)

doi.org

openreview.net

The Cost of Scaling Down Large Language Models: Reducing Model Size Affects Memory before In-context Learning.

Tian Jin

Nolan Clement

Xin Dong

Vaishnavh Nagarajan

Michael Carbin

Jonathan Ragan-Kelley

Gintare Karolina Dziugaite

2024-01-01

International Conference on Learning Representations (published)

openreview.net
