Publications

What Mechanisms Does Knowledge Distillation Distill?

Cindy Wu

Ekdeep Singh Lubana

Bruno Mlodozeniec

Robert Kirk

Knowledge distillation is a commonly-used compression method in ML due to the popularity of increasingly large-scale models, but it is unclear if all the information a teacher model contains is distilled into the smaller student model. We aim to formalize the concept of ‘knowledge’ to investigate how knowledge is transferred during distillation, focusing on shared invariant outputs to counterfactual changes of dataset latent variables (we call these latents mechanisms). We define a student model to be a good stand-in model for a teacher if it shares the teacher’s learned mechanisms, and find that Jacobian matching and contrastive representation learning are viable methods by which to train such models. While these methods do not result in perfect transfer of mechanisms, we show they often improve student fidelity or mitigate simplicity bias (as measured by the teacher-to-student KL divergence and accuracy on various out-of-distribution test datasets), especially on datasets with spurious statistical correlations.

2024-05-14

Proceedings of UniReps: the First Workshop on Unifying Representations in Neural Models (published)

proceedings.mlr.press

openreview.net

CARTIER: Cartographic lAnguage Reasoning Targeted at Instruction Execution for Robots

Nikhil Kakodkar

Dmitriy Rivkin

Bobak H. Baghi

Francois Hogan

Gregory Dudek

2024-05-13

2024 IEEE International Conference on Robotics and Automation (ICRA) (published)

doi.org

arxiv.org

ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

Qiao Gu

Alihusein Kuwajerwala

Sacha Morin

Krishna Murthy

Bipasha Sen

Aditya Agarwal

Corban Rivera

William Paul

Kirsty Ellis

Rama Chellappa

Chuang Gan

Celso M de Melo

Joshua B. Tenenbaum

Antonio Torralba

Florian Shkurti

Liam Paull

For robots to perform a wide variety of tasks, they require a 3D representation of the world that is semantically rich, yet compact and efficient for task-driven perception and planning. Recent approaches have attempted to leverage features from large vision-language models to encode semantics in 3D representations. However, these approaches tend to produce maps with per-point feature vectors, which do not scale well in larger environments, nor do they contain semantic spatial relationships between entities in the environment, which are useful for downstream planning. In this work, we propose ConceptGraphs, an open-vocabulary graph-structured representation for 3D scenes. ConceptGraphs is built by leveraging 2D foundation models and fusing their output to 3D by multi-view association. The resulting representations generalize to novel semantic classes, without the need to collect large 3D datasets or finetune models. We demonstrate the utility of this representation through a number of downstream planning tasks that are specified through abstract (language) prompts and require complex reasoning over spatial and semantic concepts. (Project page: https://concept-graphs.github.io/ Explainer video: https://youtu.be/mRhNkQwRYnc )

2024-05-13

2024 IEEE International Conference on Robotics and Automation (ICRA) (published)

doi.org

openreview.net

Divergent Creativity in Humans and Large Language Models

Antoine Bellemare-Pepin

François Lespinasse

Philipp Thölke

Yann Harel

Kory Wallace Mathewson

Jay A. Olson

Yoshua Bengio

Karim Jerbi

The recent surge in the capabilities of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLM creativity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. We found evidence suggesting that LLMs can indeed surpass human capabilities in specific creative tasks such as divergent association and creative writing. Our quantitative benchmarking framework opens up new paths for the development of more creative LLMs, but it also encourages more granular inquiries into the distinctive elements that constitute human inventive thought processes, compared to those that can be artificially generated.

2024-05-13

arXiv (preprint)

doi.org

arxiv.org

Divergent Creativity in Humans and Large Language Models

Antoine Bellemare‐Pepin

François Lespinasse

Philipp Thölke

Yann Harel

Kory Wallace Mathewson

Jay A. Olson

Yoshua Bengio

Karim Jerbi

The recent surge of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLMs' semantic diversity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in computational creativity to analyze semantic divergence in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. We found evidence that LLMs can surpass average human performance on the Divergent Association Task, and approach human creative writing abilities, though they fall short of the typical performance of highly creative humans. Notably, even the top performing LLMs are still largely surpassed by highly creative individuals, underscoring a ceiling that current LLMs still fail to surpass. Our human-machine benchmarking framework addresses the polemic surrounding the imminent replacement of human creative labour by AI, disentangling the quality of the respective creative linguistic outputs using established objective measures. While prompting deeper exploration of the distinctive elements of human inventive thought compared to those of AI systems, we lay out a series of techniques to improve their outputs with respect to semantic diversity, such as prompt design and hyper-parameter tuning.

2024-05-13

arXiv (preprint)

doi.org

arxiv.org

GAGE: Genetic Algorithm-Based Graph Explainer for Malware Analysis

Mohd Saqib

Benjamin Fung

Philippe Charland

Andrew Walenstein

Malware analysts often prefer reverse engineering using Call Graphs, Control Flow Graphs (CFGs), and Data Flow Graphs (DFGs), which involves the utilization of black-box Deep Learning (DL) models. The proposed research introduces a structured pipeline for reverse engineering-based analysis, offering promising results compared to state-of-the-art methods and providing high-level interpretability for malicious code blocks in subgraphs. We propose the Canonical Executable Graph (CEG) as a new representation of Portable Executable (PE) files, uniquely incorporating syntactical and semantic information into its node embeddings. At the same time, edge features capture structural aspects of PE files. This is the first work to present a PE file representation encompassing syntactical, semantic, and structural characteristics, whereas previous efforts typically focused solely on syntactic or structural properties. Furthermore, recognizing the limitations of existing graph explanation methods within Explainable Artificial Intelligence (XAI) for malware analysis, primarily due to the specificity of malicious files, we introduce Genetic Algorithm-based Graph Explainer (GAGE). GAGE operates on the CEG, striving to identify a precise subgraph relevant to predicted malware families. Through experiments and comparisons, our proposed pipeline exhibits substantial improvements in model robustness scores and discriminative power compared to the previous benchmarks. Furthermore, we have successfully used GAGE in practical applications on real-world data, producing meaningful insights and interpretability. This research offers a robust solution to enhance cybersecurity by delivering a transparent and accurate understanding of malware behaviour. Moreover, the proposed algorithm is specialized in handling graph-based data, effectively dissecting complex content and isolating influential nodes.

2024-05-13

IEEE International Conference on Data Engineering (published)

doi.org

Globally Stable Neural Imitation Policies

Amin Abyaneh

Mariana Sosa Guzmán

Hsiu-Chin Lin

2024-05-13

2024 IEEE International Conference on Robotics and Automation (ICRA) (published)

doi.org

arxiv.org

A Neural-Evolutionary Algorithm for Autonomous Transit Network Design

Andrew Holliday

Gregory Dudek

2024-05-13

2024 IEEE International Conference on Robotics and Automation (ICRA) (published)

doi.org

arxiv.org

Open Source in Lab Management

Julien Cohen-Adad

This document explores the advantages of integrating open source software and practices in managing a scientific lab, emphasizing reproducibility and the avoidance of pitfalls. It details practical applications, from website management using GitHub Pages to organizing datasets in compliance with BIDS standards, and highlights the importance of continuous testing for data integrity, IT management through Ansible for efficient system configuration, and open source software development. The broader goal is to promote transparent, reproducible science by adopting open source tools. This approach not only saves time but also exposes students to best practices, enhancing the transparency and reproducibility of scientific research.

2024-05-13

arXiv (preprint)

doi.org

arxiv.org
