Irina Rish

Biography

Irina Rish is a full professor at the Université de Montréal (UdeM), where she leads the Autonomous AI Lab. A faculty member of Mila, the Quebec Artificial Intelligence Institute, she holds a Canada Excellence Research Chair (CERC) and a Canada CIFAR AI Chair. Irina leads the U.S. Department of Energy's INCITE project on scalable foundation models on the Summit and Frontier supercomputers at the Oak Ridge Leadership Computing Facility (OLCF). She is a co-founder and the chief scientific officer of Nolano.ai.

Her current research focuses on neural scaling laws and emergent behaviors (capabilities and alignment) in foundation models, as well as continual learning, out-of-distribution generalization, and robustness. Before joining UdeM in 2019, Irina was a research scientist at the IBM Thomas J. Watson Research Center, where she worked on various projects at the intersection of neuroscience and AI and led the NeuroAI challenge. She has received several IBM awards: the Excellence and Outstanding Innovation awards (2018), the Outstanding Technical Achievement Award (2017), and the Research Accomplishment Award (2009). She holds 64 patents and has written more than 120 research papers, several book chapters, three published books, and a monograph on sparse modeling.

Current Students

George Adamopoulos

Research Intern

Ivan Anokhin

PhD - UdeM

Co-supervisor:

Samira Ebrahimi Kahou

PhD - UdeM

Arjun Ashok

PhD - UdeM

Co-supervisor:

Research Master's - UdeM

PhD - McGill

Principal supervisor:

Blake Richards

Mohammad Javad Darvishi Bayazi

Amin Darabi

PhD - UdeM

Research Collaborator - UdeM

Wagner Drew

Research Master's - Concordia

Principal supervisor:

PhD - UdeM

Independent Visiting Researcher

Nadhir Hassen

Alumni Collaborator - UdeM

Research Master's

Alumni Collaborator - UdeM

Principal supervisor:

Ioannis Mitliagkas

Nizar Islah

PhD - UdeM

Principal supervisor:

PhD - UdeM

PhD - UdeM

Research Master's - Concordia

Principal supervisor:

Research Master's - UdeM

Neeraj Kumar

Alumni Collaborator - UdeM

Gwen Legate

PhD - Concordia

Principal supervisor:

Eugene Belilovsky

David Lemay

Research Master's - UdeM

Jonathan Lim

Research Collaborator

Research Master's - UdeM

Research Collaborator

PhD - UdeM

Research Collaborator - UdeM

Gabriela Moisescu-Pareja

Research Collaborator - McGill

Principal supervisor:

Doina Precup

Timothy Nest

PhD - UdeM

Co-supervisor:

Eilif B. Muller

Mohammad Pezeshki

Research Collaborator

Co-supervisor:

PhD - McGill

Principal supervisor:

Pouya Bashivan

Mahta Ramezanian

Research Master's - UdeM

Co-supervisor:

Guillaume Dumas

Matthew Riemer

PhD - UdeM

Alexis Roger

PhD - McGill

Principal supervisor:

Blake Richards

Munish Sathish Kumar

Research Collaborator

Vaibhav Singh

PhD - Concordia

Principal supervisor:

PhD - UdeM

PhD - UdeM

Co-supervisor:

Alumni Collaborator - UdeM

PhD - UdeM

Co-supervisor:

Research Master's - UdeM

He Zhu

PhD - McGill

Publications

Context is Key: A Benchmark for Forecasting with Essential Textual Information

Andrew Robert Williams

Arjun Ashok

Étienne Marcotte

Valentina Zantedeschi

Jithendaraa Subramanian

Alexandre Lacoste

Forecasting is a critical task in decision making across various domains. While numerical data provides a foundation, it often lacks crucial context necessary for accurate predictions. Human forecasters frequently rely on additional information, such as background knowledge or constraints, which can be efficiently communicated through natural language. However, the ability of existing forecasting models to effectively integrate this textual information remains an open question. To address this, we introduce "Context is Key" (CiK), a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context, requiring models to integrate both modalities. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters, and propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark. Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings. By presenting this benchmark, we aim to advance multimodal forecasting, promoting models that are both accurate and accessible to decision-makers with varied technical expertise. The benchmark can be visualized at https://servicenow.github.io/context-is-key-forecasting/v0/ .

2024-10-24

ArXiv (preprint)

$\mu$LO: Compute-Efficient Meta-Generalization of Learned Optimizers

Benjamin Therien

Charles-Etienne Joseph

2024-10-10

NeurIPS.cc/2024/Workshop/OPT (published)

Introducing Brain Foundation Models

Mohammad Javad Darvishi Bayazi

Hena Ghonia

Roland Riachi

Bruno Aristimunha

Arian Khorasani

Md Rifat Arefin

Sylvain Chevallier

Amin Darabi

Guillaume Dumas

Brain function represents one of the most complex systems driving our world. Decoding its signals poses significant challenges, particularly due to the limited availability of data and the high cost of recordings. The existence of large hospital datasets and laboratory collections partially mitigates this issue. However, the lack of standardized recording protocols, varying numbers of channels, diverse setups, scenarios, and recording devices further complicate the task. This work addresses these challenges by introducing the Brain Foundation Model (BFM), a suite of open-source models trained on brain signals. These models serve as foundational tools for various types of time-series neuroimaging tasks. This work presents the first model of the BFM series, which is trained on electroencephalogram signal data. Our results demonstrate that BFM-EEG can generate signals more accurately than other models. Upon acceptance, we will release the model weights and pipeline.

2024-10-10

NeurIPS.cc/2024/Workshop/TSALM (published)

Language model scaling laws and zero-sum learning

Andrei Mircea

Ekaterina Lobacheva

Supriyo Chakraborty

Nima Chitsazan

This work aims to understand how, in terms of training dynamics, scaling up language model size yields predictable loss improvements. We find that these improvements can be tied back to loss deceleration, an abrupt transition in the rate of loss improvement, characterized by piece-wise linear behavior in log-log space. Notably, improvements from increased model size appear to be a result of (1) improving the loss at which this transition occurs; and (2) improving the rate of loss improvement after this transition. As an explanation for the mechanism underlying this transition (and the effect of model size on loss it mediates), we propose the zero-sum learning (ZSL) hypothesis. In ZSL, per-token gradients become systematically opposed, leading to degenerate training dynamics where the model can't improve loss on one token without harming it on another, bottlenecking the overall rate at which loss can improve. We find compelling evidence of ZSL, as well as unexpected results which shed light on other factors contributing to ZSL.

2024-10-10

NeurIPS.cc/2024/Workshop/SciForDL (poster)

LLMs and Personalities: Inconsistencies Across Scales

Tosato Tommaso

Mahmood Hegazy

This study investigates the application of human psychometric assessments to large language models (LLMs) to examine their consistency and malleability in exhibiting personality traits. We administered the Big Five Inventory (BFI) and the Eysenck Personality Questionnaire-Revised (EPQ-R) to various LLMs across different model sizes and persona prompts. Our results reveal substantial variability in responses due to question order shuffling, challenging the notion of a stable LLM "personality." Larger models demonstrated more consistent responses, while persona prompts significantly influenced trait scores. Notably, the assistant persona led to more predictable scaling, with larger models exhibiting more socially desirable and less variable traits. In contrast, non-conventional personas displayed unpredictable behaviors, sometimes extending personality trait scores beyond the typical human range. These findings have important implications for understanding LLM behavior under different conditions and reflect on the consequences of scaling.

2024-10-09

NeurIPS.cc/2024/Workshop/Behavioral_ML (oral presentation)

RedPajama: an Open Dataset for Training Large Language Models

Maurice Weber

Daniel Y Fu

Quentin Gregory Anthony

Yonatan Oren

Shane Adams

Anton Alexandrov

Xiaozhong Lyu

Huu Nguyen

Xiaozhe Yao

Virginia Adams

Ben Athiwaratkun

Rahul Chalamala

Kezhen Chen

Max Ryabinin

Tri Dao

Percy Liang

Christopher Re

Ce Zhang

2024-09-26

NeurIPS.cc/2024/Datasets_and_Benchmarks_Track (spotlight)

Using Unity to Help Solve Reinforcement Learning

Connor Brennan

Andrew Robert Williams

Omar G. Younis

Vedant Vyas

Daria Yasafova

Leveraging the depth and flexibility of XLand as well as the rapid prototyping features of the Unity engine, we present the United Unity Universe, an open-source toolkit designed to accelerate the creation of innovative reinforcement learning environments. This toolkit includes a robust implementation of XLand 2.0 complemented by a user-friendly interface which allows users to modify the details of procedurally generated terrains and task rules with ease. Additionally, we provide a curated selection of terrains and rule sets, accompanied by implementations of reinforcement learning baselines to facilitate quick experimentation with novel architectural designs for adaptive agents. Furthermore, we illustrate how the United Unity Universe serves as a high-level language that enables researchers to develop diverse and endlessly variable 3D environments within a unified framework. This functionality establishes the United Unity Universe (U3) as an essential tool for advancing the field of reinforcement learning, especially in the development of adaptive and generalizable learning systems.

2024-09-26

NeurIPS.cc/2024/Datasets_and_Benchmarks_Track (poster)