Publications

Canadarm, Canadarm2, and Canadarm3: The Evolution of Canada's Iconic Robotic System and Its Impacts from Space Down to Earth

Yianni Hudon-Castillo

Jean-Christophe Lamanque

Marion Thénault

Katherine Zamudio-Turcotte

Sri Venkata Vathsala Musunuri

Auriane Thilloy

Olivier Leclair

Mohamed Amine Elforaici

Rafael Daigneault

Rachad Chazbek

Giovanni Beltrame

2024-10-13

58th IAA History of Astronautics Symposium (published)

doi.org

Local Linearity is All You Need (in Data-Driven Teleoperation)

Michael Przystupa

Gauthier Gidel

Matthew E. Taylor

Martin Jagersand

Justus Piater

Samuele Tosatto

One of the critical aspects of assistive robotics is to provide a control system of a high-dimensional robot from a low-dimensional user input (i.e. a 2D joystick). Data-driven teleoperation seeks to provide an intuitive user interface called an action map to map the low-dimensional input to robot velocities from human demonstrations. Action maps are machine learning models trained on robotic demonstration data to map user input directly to desired movements as opposed to aspects of robot pose ("move to cup or pour content" vs. "move along x- or y-axis"). Many works have investigated nonlinear action maps with multi-layer perceptrons, but recent work suggests that local-linear neural approximations provide better control of the system. However, local linear models assume actions exist on a linear subspace and may not capture nuanced motions in training data. In this work, we hypothesize that local-linear neural networks are effective because they make the action map odd w.r.t. the user input, enhancing the intuitiveness of the controller. Based on this assumption, we propose two nonlinear means of encoding odd behavior that do not constrain the action map to a local linear function. However, our analysis reveals that these models effectively behave like local linear models for relevant mappings between user joysticks and robot movements. We support this claim in simulation, and show on a real-world use case that there is no statistical benefit of using nonlinear maps, according to the users' experience. These negative results suggest that further investigation into model architectures beyond local linear models may offer diminishing returns for improving user experience in data-driven teleoperation systems.

2024-10-13

2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

doi.org
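
The abstract of the entry above turns on the action map being odd in the user input, meaning f(s, -u) = -f(s, u): pushing the joystick the opposite way produces the opposite motion. The sketch below illustrates that property only. The abstract does not say which two nonlinear encodings the authors propose, so the odd-part construction used here, (f(s, u) - f(s, -u)) / 2, is a standard stand-in and an assumption, as are all function names and the toy maps.

```python
import math

def local_linear_map(state, u):
    """Hypothetical local-linear action map: a = A(state) @ u (3 outputs, 2D input)."""
    A = [[math.cos(state), -math.sin(state)],
         [math.sin(state),  math.cos(state)],
         [state,            1.0]]
    return [row[0] * u[0] + row[1] * u[1] for row in A]

def nonlinear_map(state, u):
    """Arbitrary nonlinear map standing in for an MLP; it is not odd in u."""
    return [math.tanh(u[0] + state) + 0.5,
            u[0] * u[1] + u[1] ** 3,
            math.exp(u[1]) - state]

def oddify(f):
    """Wrap f so that the result is odd in u: g(s, u) = (f(s, u) - f(s, -u)) / 2."""
    def g(state, u):
        neg_u = [-x for x in u]
        return [(p - n) / 2 for p, n in zip(f(state, u), f(state, neg_u))]
    return g

def is_odd(f, state, u, tol=1e-9):
    """Check f(s, -u) == -f(s, u) at a single (state, u) sample."""
    neg_u = [-x for x in u]
    return all(abs(a + b) <= tol for a, b in zip(f(state, u), f(state, neg_u)))

if __name__ == "__main__":
    s, u = 0.3, [0.5, -0.2]
    print(is_odd(local_linear_map, s, u))       # linear in u, hence odd
    print(is_odd(nonlinear_map, s, u))          # generic nonlinear map
    print(is_odd(oddify(nonlinear_map), s, u))  # odd by construction
```

Any map that is linear in u is odd automatically, whereas a generic nonlinear map has to be given the property explicitly, for example by a wrapper such as the hypothetical `oddify`.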

PhotoBot: Reference-Guided Interactive Photography via Natural Language

Oliver Limoyo

Jimmy Li

Dmitriy Rivkin

Jonathan Kelly

Gregory Dudek

We introduce PhotoBot, a framework for fully automated photo acquisition based on an interplay between high-level human language guidance and a robot photographer. We propose to communicate photography suggestions to the user via reference images that are selected from a curated gallery. We leverage a visual language model (VLM) and an object detector to characterize the reference images via textual descriptions and then use a large language model (LLM) to retrieve relevant reference images based on a user's language query through text-based reasoning. To correspond the reference image and the observed scene, we exploit pre-trained features from a vision transformer capable of capturing semantic similarity across marked appearance variations. Using these features, we compute pose adjustments for an RGB-D camera by solving a perspective-n-point (PnP) problem. We demonstrate our approach using a manipulator equipped with a wrist camera. Our user studies show that photos taken by PhotoBot are often more aesthetically pleasing than those taken by users themselves, as measured by human feedback. We also show that PhotoBot can generalize to other reference sources such as paintings.

2024-10-13

2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

doi.org

arxiv.org

The Canadian VirusSeq Data Portal and Duotang: open resources for SARS-CoV-2 viral sequences and genomic epidemiology

Erin E. Gill

Baofeng Jia

Carmen Lia Murall

Raphaël Poujol

Muhammad Zohaib Anwar

Nithu Sara John

Justin Richardsson

Ashley E. Hobb

Abayomi S. Olabode

Alexandru Lepsa

Ana T. Duggan

Andrea D. Tyler

Arnaud N’Guessan

Atul Kachru

Brandon Chan

Catherine Yoshida

Christina K. Yung

David Bujold

Dusan Andric

Edmund Su

Emma Griffiths

Gary Van Domselaar

Gordon Jolly

Heather Ward

Henrich Feher

Jared Baker

Jared T. Simpson

Jaser Uddin

Jiannis Ragoussis

Jon Eubank

Jörg H. Fritz

José Héctor Gálvez

Karen Fang

Kim Cullion

Leonardo Landa Rivera

Qian Xiang

Matthew A. Croxen

Mitchell Shiell

Natalie Prystajecky

Pierre-Olivier Quirion

Rosita Bajari

Samantha Rich

Samira Mubareka

Sandrine Moreira

Scott Cain

Steven G. Sutcliffe

Susanne A. Kraemer

Yelizar Alturmessov

Yann Joly

VirusSeq Data Portal Academic and Health Network

Marc Fiume

Terrance P. Snutch

Cindy Bell

Catalina López-Correa

Julie Hussin

Jeffrey B. Joy

Caroline Colijn

Paul M. K. Gordon

William Hsiao

Art F. Y. Poon

Natalie Knox

Mélanie Courtot

Lincoln Stein

Sarah P. Otto

Guillaume Bourque

B. Jesse Shapiro

Fiona S. L. Brinkman

The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform the public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN – VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and enhanced, standardized contextual data that were unavailable in other repositories and that meet FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the portal data submission pipeline contains data quality checking procedures and appropriate acknowledgement of data generators that encourages collaboration. From inception to execution, the portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts ensured a commitment to Canadian privacy laws, data security standards, and organizational processes. This portal has been coupled with other resources, such as Viral AI, and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight this portal (https://virusseq-dataportal.ca/), including its contextual data not available elsewhere, and the Duotang (https://covarr-net.github.io/duotang/duotang.html), a web platform that presents key genomic epidemiology and modelling analyses on circulating and emerging SARS-CoV-2 variants in Canada.
Duotang presents dynamic changes in variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations, with a text overview of the current situation. The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the portal (COVID-MVP, CoVizu), are all open source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.

2024-10-13

Microbial Genomics (published)

doi.org

Working Backwards: Learning to Place by Picking

Oliver Limoyo

Abhisek Konar

Trevor Ablett

Jonathan Kelly

Francois Hogan

Gregory Dudek

2024-10-13

2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

doi.org

arxiv.org

Dynamic Abstractions: Building the Next Generation of Cognitive Tools and Interfaces

Sangho Suh

Hai Dang

Ryan Yen

Josh M. Pollock

Ian Arawjo

Rubaiat Habib Kazi

Hariharan Subramonyam

Jingyi Li

Nazmus Saquib

Arvind Satyanarayan

2024-10-12

The 37th Annual ACM Symposium on User Interface Software and Technology (published)

doi.org

Effective Protein-Protein Interaction Exploration with PPIretrieval

Chenqing Hua

Connor W. Coley

Guy Wolf

Doina Precup

Shuangjia Zheng

2024-10-12

NeurIPS.cc/2024/Workshop/AIDrugX (poster)

doi.org

openreview.net

EnzymeFlow: Generating Reaction-specific Enzyme Catalytic Pockets through Flow Matching and Co-Evolutionary Dynamics

Yang Liu

Odin Zhang

Kevin K Yang

Shuangjia Zheng

2024-10-12

NeurIPS.cc/2024/Workshop/AIDrugX (poster)

doi.org

openreview.net

MolPhenix: A Multimodal Foundation Model for PhenoMolecular Retrieval

Philip Fradkin

Puria Azadi Moghadam

Karush Suri

Frederik Wenkel

Maciej Sypetkowski

Dominique Beaini

Predicting molecular impact on cellular function is a core challenge in therapeutic design. Phenomic experiments, designed to capture cellular morphology, utilize microscopy-based techniques and demonstrate a high-throughput solution for uncovering molecular impact on the cell. In this work, we learn a joint latent space between molecular structures and microscopy phenomic experiments, aligning paired samples with contrastive learning. Specifically, we study the problem of Contrastive PhenoMolecular Retrieval, which consists of zero-shot molecular structure identification conditioned on phenomic experiments. We assess challenges in multi-modal learning of phenomics and molecular modalities such as experimental batch effect, inactive molecule perturbations, and encoding perturbation concentration. We demonstrate improved multi-modal learner retrieval through (1) a uni-modal pre-trained phenomics model, (2) a novel inter-sample similarity-aware loss, and (3) models conditioned on a representation of molecular concentration. Following this recipe, we propose MolPhenix, a molecular phenomics model. MolPhenix leverages a pre-trained phenomics model to demonstrate significant performance gains across perturbation concentrations, molecular scaffolds, and activity thresholds. In particular, we demonstrate an 8.1

2024-10-12

NeurIPS.cc/2024/Workshop/AIDrugX (poster)

openreview.net

Neurospectrum: A Geometric and Topological Deep Learning Framework for Uncovering Spatiotemporal Signatures in Neural Activity

Dhananjay Bhaskar

Yanlei Zhang

Jessica Moore

Feng Gao

Bastian Rieck

Firas Khasawneh

Elizabeth Munch

Guy Wolf

Valentina Greco

Smita Krishnaswamy

J. Adam Noah

Helen Pushkarskaya

Christopher Pittenger

Neural signals are high-dimensional, noisy, and dynamic, making it challenging to extract interpretable features linked to behavior or disease. We introduce Neurospectrum, a framework that encodes neural activity as latent trajectories shaped by spatial and temporal structure. At each timepoint, signals are represented on a graph capturing spatial relationships, with a learnable attention mechanism highlighting important regions. These are embedded using graph wavelets and passed through a manifold-regularized autoencoder that preserves temporal geometry. The resulting latent trajectory is summarized using a principled set of descriptors (including curvature, path signatures, persistent homology, and recurrent networks) that capture multiscale geometric, topological, and dynamical features. These features drive downstream prediction in a modular, interpretable, and end-to-end trainable framework. We evaluate Neurospectrum on simulated and experimental datasets. It tracks phase synchronization in Kuramoto simulations, reconstructs visual stimuli from calcium imaging, and identifies biomarkers of obsessive-compulsive disorder in fMRI. Across tasks, Neurospectrum uncovers meaningful neural dynamics and outperforms traditional analysis methods.

2024-10-12

bioRxiv (preprint)

doi.org

Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity

David Williams-King

Linh Le

Adam Oberman

Yoshua Bengio

As LLMs develop increasingly advanced capabilities, there is an increased need to minimize the harm that could be caused to society by certain model outputs; hence, most LLMs have safety guardrails added, for example via fine-tuning. In this paper, we argue the position that current safety fine-tuning is very similar to a traditional cat-and-mouse game (or arms race) between attackers and defenders in cybersecurity. Model jailbreaks and attacks are patched with bandaids to target the specific attack mechanism, but many similar attack vectors might remain. When defenders are not proactively coming up with principled mechanisms, it becomes very easy for attackers to sidestep any new defenses. We show how current defenses are insufficient to prevent new adversarial jailbreak attacks, reward hacking, and loss of control problems. In order to learn from past mistakes in cybersecurity, we draw analogies with historical examples and develop lessons learned that can be applied to LLM safety. These arguments support the need for new and more principled approaches to designing safe models, which are architected for security from the beginning. We describe several such approaches from the AI literature.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

doi.org

openreview.net

Epistemic Integrity in Large Language Models

Bijean Ghafouri

Shahrad Mohammadzadeh

James Zhou

Pratheeksha Nair

Jacob-Junqi Tian

Mayank Goel

Reihaneh Rabbany

Jean-François Godbout

Kellin Pelrine

Large language models are increasingly relied upon as sources of information, but their propensity for generating false or misleading statements with high confidence poses risks for users and society. In this paper, we confront the critical problem of epistemic miscalibration, where a model's linguistic assertiveness fails to reflect its true internal certainty. We introduce a new human-labeled dataset and a novel method for measuring the linguistic assertiveness of Large Language Models which cuts error rates by over 50% relative to previous benchmarks. Validated across multiple datasets, our method reveals a stark misalignment between how confidently models linguistically present information and their actual accuracy. Further human evaluations confirm the severity of this miscalibration. This evidence underscores the urgent risk of the overstated certainty Large Language Models hold which may mislead users on a massive scale. Our framework provides a crucial step forward in diagnosing and correcting this miscalibration, offering a path to safer and more trustworthy AI across domains.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

doi.org

openreview.net
