Publications

Zero-Shot Object-Centric Representation Learning

Aniket Rajiv Didolkar

Andrii Zadaianchuk

Michael Curtis Mozer

Georg Martius

Maximilian Seitzer

The goal of object-centric representation learning is to decompose visual scenes into a structured representation that isolates the entities. Recent successes have shown that object-centric representation learning can be scaled to real-world scenes by utilizing pre-trained self-supervised features. However, so far, object-centric methods have mostly been applied in-distribution, with models trained and evaluated on the same dataset. This is in contrast to the wider trend in machine learning towards general-purpose models directly applicable to unseen data and tasks. Thus, in this work, we study current object-centric methods through the lens of zero-shot generalization by introducing a benchmark comprising eight different synthetic and real-world datasets. We analyze the factors influencing zero-shot performance and find that training on diverse real-world images improves transferability to unseen scenarios. Furthermore, inspired by the success of task-specific fine-tuning in foundation models, we introduce a novel fine-tuning strategy to adapt pre-trained vision encoders for the task of object discovery. We find that the proposed approach results in state-of-the-art performance for unsupervised object discovery, exhibiting strong zero-shot transfer to unseen datasets.

2024-08-16

ArXiv (preprint)

doi.org

arxiv.org

Multi Teacher Privileged Knowledge Distillation for Multimodal Expression Recognition

Muhammad Haseeb Aslam

Marco Pedersoli

Alessandro Lameiras Koerich

Eric Granger

Human emotion is a complex phenomenon conveyed and perceived through facial expressions, vocal tones, body language, and physiological signals. Multimodal emotion recognition systems can perform well because they can learn complementary and redundant semantic information from diverse sensors. In real-world scenarios, only a subset of the modalities employed for training may be available at test time. Learning privileged information allows a model to exploit data from additional modalities that are only available during training. State-of-the-art (SOTA) methods for privileged knowledge distillation (PKD) have been proposed to distill information from a teacher model (with privileged modalities) to a student model (without privileged modalities). However, such PKD methods utilize point-to-point matching and do not explicitly capture the relational information. Recently, methods have been proposed to distill the structural information. However, PKD methods based on structural similarity are primarily confined to learning from a single joint teacher representation, which limits their robustness, accuracy, and ability to learn from diverse multimodal sources. In this paper, a multi-teacher PKD (MT-PKDOT) method with self-distillation is introduced to align diverse teacher representations before distilling them to the student. MT-PKDOT employs a structural similarity KD mechanism based on a regularized optimal transport (OT) for distillation. The proposed MT-PKDOT method was validated on the Affwild2 and Biovid datasets. Results indicate that our proposed method can outperform SOTA PKD methods. It improves the visual-only baseline on Biovid data by 5.5%. On the Affwild2 dataset, the proposed method improves by 3% and 5% over the visual-only baseline for valence and arousal, respectively. Allowing the student to learn from multiple diverse sources is shown to increase accuracy and to implicitly avoid negative transfer to the student model.

2024-08-15

ArXiv (preprint)

doi.org

arxiv.org

What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models

Ahmed Imtiaz Humayun

Candice Schumann

2024-08-14

ArXiv (preprint)

doi.org

openreview.net

RF shimming in the cervical spinal cord at 7 T

Daniel Papp

Kyle M. Gilbert

Gaspard Cereza

Alexandre D'Astous

Nibardo Lopez‐Rios

Mathieu Boudreau

Marcus J. Couch

Pedram Yazdanbakhsh

Robert L. Barry

Eva Alonso‐Ortiz

Julien Cohen‐Adad

The study's findings highlight the potential of RF shimming to advance 7 T MRI's clinical utility for central nervous system imaging by enabling more homogeneous and efficient spinal cord imaging. Additionally, the research incorporates a reproducible Jupyter Notebook, enhancing the study's transparency and facilitating peer verification.

2024-08-12

Magnetic Resonance in Medicine (published)

doi.org

Unveiling the Flaws: A Critical Analysis of Initialization Effect on Time Series Anomaly Detection

Alex Koran

Hadi Hojjati

Narges Armanfard

Deep learning for time-series anomaly detection (TSAD) has gained significant attention over the past decade. Despite the reported improvements in several papers, the practical application of these models remains limited. Recent studies have cast doubt on these models, attributing their results to flawed evaluation techniques. However, the impact of initialization has largely been overlooked. This paper provides a critical analysis of the initialization effects on TSAD model performance. Our extensive experiments reveal that TSAD models are highly sensitive to hyperparameters such as window size, seed number, and normalization. This sensitivity often leads to significant variability in performance, which can be exploited to artificially inflate the reported efficacy of these models. We demonstrate that even minor changes in initialization parameters can result in performance variations that overshadow the claimed improvements from novel model architectures. Our findings highlight the need for rigorous evaluation protocols and transparent reporting of preprocessing steps to ensure the reliability and fairness of anomaly detection methods. This paper calls for a more cautious interpretation of TSAD advancements and encourages the development of more robust and transparent evaluation practices to advance the field and its practical applications.

2024-08-12

ArXiv (preprint)

doi.org

arxiv.org

Feasibility and safety of endoscopic ultrasound-guided diffusing alpha emitter radiation therapy for advanced pancreatic cancer: Preliminary data

Corey S Miller

Magali Lecavalier-Barsoum

Kim Ma

Miriam Santos Dutra

Youri Kaitoukov

Boris Bahoric

Nada Tomic

Francine Dinelle

Shirin Enger

Gerald Batist

Stephen Yang

Donald Laporta

Petr Kavan

Anand Sahai

David Roberge

David Donath

Background and study aims: Pancreatic cancer is a devastating disease with limited locoregional treatment options. Diffusing alpha-emitter radiation therapy (Alpha DaRT), a novel cancer treatment using alpha-particle interstitial radiotherapy, may help address this challenge. The aim of this study was to evaluate the feasibility and safety of endoscopic ultrasound (EUS)-guided Alpha DaRT for advanced pancreatic cancer. Patients and methods: Patients with inoperable locally advanced or metastatic pancreatic adenocarcinoma were treated with EUS-guided Alpha DaRT insertion. The Alpha DaRT sources were delivered into pancreatic tumors using a standard EUS needle with a novel proprietary applicator. Adverse events (AEs) were assessed based on the Common Terminology Criteria for Adverse Events version 5.0. Tumor response was evaluated by imaging 4 to 6 weeks post treatment. Results: The first five patients were treated between March and September 2023. The procedure was technically successful in all cases, with Alpha DaRT sources inserted into the target tumor. Estimated gross tumor volume coverage ranged from 8% to 44%. Fourteen AEs were reported among three patients. Four were serious AEs, none of which was associated with the treatment, but rather, with disease progression or medical assistance in dying. Only two AEs (mild) were deemed possibly related to the study device. At the 35-day visit, two patients had progressive disease and three had stable disease, with one of the latter showing partial response 2 months post procedure. Conclusions: Preliminary results from this first-in-human trial indicate that EUS-guided Alpha DaRT treatment for unresectable pancreatic cancer is feasible and safe, with no device-associated serious AEs. Further investigation of this promising novel modality is underway.

2024-08-11

Endoscopy International Open (published)

doi.org

Oxygen thresholds in critically ill patients: need for personalized targets. Author's reply.

Guillaume Dumas

Laveena Munshi

2024-08-11

Intensive Care Medicine (published)

doi.org

Revisiting Feature Prediction for Learning Visual Representations from Video

Adrien Bardes

Quentin Garrido

Jean Ponce

Xinlei Chen

Michael G. Rabbat

Yann LeCun

Mahmoud Assran

Nicolas Ballas

2024-08-08

TMLR (accepted)

doi.org

openreview.net

Cardinality Minimization, Constraints, and Regularization: A Survey

Andreas M. Tillmann

Daniel Bienstock

Andrea Lodi

Alexandra Schwartz

We survey optimization problems that involve the cardinality of variable vectors in constraints or the objective function. We provide a unified viewpoint on the general problem classes and models, and give concrete examples from diverse application fields such as signal and image processing, portfolio selection, or machine learning. The paper discusses general-purpose modeling techniques and broadly applicable as well as problem-specific exact and heuristic solution approaches. While our perspective is that of mathematical optimization, a main goal of this work is to reach out to and build bridges between the different communities in which cardinality optimization problems are frequently encountered. In particular, we highlight that modern mixed-integer programming, which is often regarded as impractical due to commonly unsatisfactory behavior of black-box solvers applied to generic problem formulations, can in fact produce provably high-quality or even optimal solutions for cardinality optimization problems, even in large-scale real-world settings. Achieving such performance typically draws on the merits of problem-specific knowledge that may stem from different fields of application and, e.g., shed light on structural properties of a model or its solutions, or lead to the development of efficient heuristics; we also provide some illustrative examples.

2024-08-07

SIAM Review (published)

doi.org

arxiv.org

Large language models auto-profile conscious awareness changes under psychedelic drug effects

Danilo Bzdok

Robin Carhart-Harris

Chloé Savignac

Gregory Bell

Steven Laureys

Psychedelic experiences open a colorful view into drug-induced changes in conscious awareness. Small-sample studies on psychedelic drug action have gained traction in recent years. Yet, today’s means for measuring changes in subjective experience are mostly limited to legacy questionnaires of pre-assumed relevance, which could be complemented by bottom-up explorations of semantic facets that underlie experience reports. Here, we show how to harness large language models (LLMs) to i) design from scratch, ii) annotate at scale, and iii) evaluate with rigor a vast portfolio of experience dimensions during psychoactive drug influence, yielding more than 2 million automatic dimension ratings that would otherwise have been done by hand. Investigator-independent LLM scoring of these drug effects on the human mind alone allowed robust discrimination of the unique mental effects of 30 psychoactive substances. Successful knowledge integration of how psychedelics mediate shifts in subjective awareness will be an unavoidable milestone towards charting the full drug design space.

2024-08-07

Research Square (preprint)

doi.org

Nonlinear latent representations of high-dimensional task-fMRI data: Unveiling cognitive and behavioral insights in heterogeneous spatial maps

Mariam Zabihi

Seyed Mostafa Kia

Thomas Wolfers

Stijn de Boer

Charlotte Fraza

Richard Dinga

Alberto Llera Arenas

Danilo Bzdok

Christian F. Beckmann

Andre Marquand

Finding an interpretable and compact representation of complex neuroimaging data is extremely useful for understanding brain behavioral mapping and hence for explaining the biological underpinnings of mental disorders. However, hand-crafted representations, as well as linear transformations, may inadequately capture the considerable variability across individuals. Here, we implemented a data-driven approach using a three-dimensional autoencoder on two large-scale datasets. This approach provides a latent representation of high-dimensional task-fMRI data which can account for demographic characteristics whilst also being readily interpretable both in the latent space learned by the autoencoder and in the original voxel space. This was achieved by addressing a joint optimization problem that simultaneously reconstructs the data and predicts clinical or demographic variables. We then applied normative modeling to the latent variables to define summary statistics (‘latent indices’) and establish a multivariate mapping to non-imaging measures. Our model, trained with multi-task fMRI data from the Human Connectome Project (HCP) and UK biobank task-fMRI data, demonstrated high performance in age and sex predictions and successfully captured complex behavioral characteristics while preserving individual variability through a latent representation. Our model also performed competitively with respect to various baseline models including several variants of principal components analysis, independent components analysis and classical regions of interest, both in terms of reconstruction accuracy and strength of association with behavioral variables.

2024-08-07

Public Library of Science ONE (published)

doi.org

Stochastic Wiring of Cell Types Enhances Fitness by Generating Phenotypic Variability

Divyansha Lachi

Ann Huang

Augustine N. Mavor-Parker

Arna Ghosh

Blake Richards

Anthony Zador

The development of neural connectivity is a crucial biological process that gives rise to diverse brain circuits and behaviors. Neural development is a stochastic process, but this stochasticity is often treated as a nuisance to overcome rather than as a functional advantage. Here we use a computational model, in which connection probabilities between discrete cell types are genetically specified, to investigate the benefits of stochasticity in the development of neural wiring. We show that this model can be viewed as a generalization of a powerful class of artificial neural networks—Bayesian neural networks—where each network parameter is a sample from a distribution. Our results reveal that stochasticity confers a greater benefit in large networks and variable environments, which may explain its role in organisms with larger brains. Surprisingly, we find that the average fitness over a population of agents is higher than a single agent defined by the average connection probability. Our model reveals how developmental stochasticity, by inducing a form of non-heritable phenotypic variability, can increase the probability that at least some individuals will survive in rapidly changing, unpredictable environments. Our results suggest how stochasticity may be an important feature rather than a bug in neural development.

2024-08-07

bioRxiv (preprint)

doi.org
