
Adam M. Oberman

Associate Academic Member
Canada CIFAR AI Chair
Full Professor, McGill University, Department of Mathematics and Statistics
Research Topics
AI Safety
Deep Learning
Generative Models
Machine Learning Theory
Representation Learning

Biography

I am a professor at McGill University in the Department of Mathematics and Statistics. My research applies advanced mathematical techniques to deep learning. My primary areas of expertise include generative modelling, stochastic optimization methods, fairness and bias removal in computer vision, and generalization in reinforcement learning.

Before joining McGill in 2012, I held a tenured faculty position at Simon Fraser University and completed a postdoctoral fellowship at the University of Texas, Austin. I obtained my undergraduate education at the University of Toronto and pursued graduate studies at the University of Chicago. I have also held visiting positions at the University of California, Los Angeles (UCLA) and at the National Institute for Research in Digital Science and Technology (INRIA) in Paris.

My early research was in partial differential equations and scientific computing, where I made significant contributions to areas such as numerical optimal transportation, geometric PDEs and stochastic control problems.

I teach two comprehensive theory courses on machine learning, covering topics such as statistical learning theory and kernel theory.

For prospective graduate students interested in working with me, please apply to both Mila – Quebec Artificial Intelligence Institute and the Department of Mathematics and Statistics at McGill. Alternatively, applicants may consider co-supervision opportunities with advisors from the computer science program at McGill or Université de Montréal.

Current Students

Master's Research - McGill University
Postdoctorate - McGill University
Independent visiting researcher - University of Technology Sydney
PhD - McGill University
PhD - McGill University
PhD - Université de Montréal

Blog Posts

By Tiago Salvador, Stephanie Cairns and Vikram Voleti

Publications

Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity
David Williams-King
Linh Le
As LLMs develop increasingly advanced capabilities, there is an increased need to minimize the harm that could be caused to society by certain model outputs; hence, most LLMs have safety guardrails added, for example via fine-tuning. In this paper, we argue the position that current safety fine-tuning is very similar to a traditional cat-and-mouse game (or arms race) between attackers and defenders in cybersecurity. Model jailbreaks and attacks are patched with bandaids to target the specific attack mechanism, but many similar attack vectors might remain. When defenders are not proactively coming up with principled mechanisms, it becomes very easy for attackers to sidestep any new defenses. We show how current defenses are insufficient to prevent new adversarial jailbreak attacks, reward hacking, and loss of control problems. In order to learn from past mistakes in cybersecurity, we draw analogies with historical examples and develop lessons learned that can be applied to LLM safety. These arguments support the need for new and more principled approaches to designing safe models, which are architected for security from the beginning. We describe several such approaches from the AI literature.
Multi-Resolution Continuous Normalizing Flows
Vikram Voleti
Chris Finlay
Addressing Sample Inefficiency in Multi-View Representation Learning
Arna Ghosh
Kumar Krishna Agrawal
Shagun Sodhani
Harnessing small projectors and multiple views for efficient vision pretraining
Kumar Krishna Agrawal
Arna Ghosh
Shagun Sodhani
Recent progress in self-supervised (SSL) visual representation learning has led to the development of several different proposed frameworks that rely on augmentations of images but use different loss functions. However, there are few theoretically grounded principles to guide practice, so practical implementation of each SSL framework requires several heuristics to achieve competitive performance. In this work, we build on recent analytical results to design practical recommendations for competitive and efficient SSL that are grounded in theory. Specifically, recent theory tells us that existing SSL frameworks are minimizing the same idealized loss, which is to learn features that best match the data similarity kernel defined by the augmentations used. We show how this idealized loss can be reformulated to a functionally equivalent loss that is more efficient to compute. We study the implicit bias of using gradient descent to minimize our reformulated loss function and find that using a stronger orthogonalization constraint with a reduced projector dimensionality should yield good representations. Furthermore, the theory tells us that approximating the reformulated loss should be improved by increasing the number of augmentations, and as such using multiple augmentations should lead to improved convergence. We empirically verify our findings on CIFAR, STL and Imagenet datasets, wherein we demonstrate an improved linear readout performance when training a ResNet-backbone using our theoretically grounded recommendations. Remarkably, we also demonstrate that by leveraging these insights, we can reduce the pretraining dataset size by up to 2
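The recommendations summarized above (a small projector, a stronger orthogonalization constraint, and more than two augmented views) can be made concrete with a short sketch. The PyTorch-style snippet below is only an illustration of that style of objective under assumed interfaces for a generic backbone and projector; the squared-distance invariance term, the covariance-to-identity penalty and all constants are hypothetical stand-ins, not the paper's exact loss.

    import torch
    import torch.nn.functional as F

    def multiview_ssl_loss(backbone, projector, views, ortho_weight=1.0):
        # views: a list of >= 2 augmented batches of the same images, shape (B, C, H, W).
        # projector maps backbone features to a deliberately small dimension d (e.g. 64).
        zs = [F.normalize(projector(backbone(v)), dim=1) for v in views]

        # Invariance: projections of different views of the same image should agree.
        invariance, pairs = 0.0, 0
        for i in range(len(zs)):
            for j in range(i + 1, len(zs)):
                invariance = invariance + (zs[i] - zs[j]).pow(2).sum(dim=1).mean()
                pairs += 1
        invariance = invariance / pairs

        # Orthogonalization: push the covariance of the projections toward the
        # identity, discouraging dimensional collapse in the small projector space.
        z = torch.cat(zs, dim=0)
        z = z - z.mean(dim=0)
        cov = (z.T @ z) / (z.shape[0] - 1)
        ortho = (cov - torch.eye(cov.shape[0], device=cov.device)).pow(2).sum()

        return invariance + ortho_weight * ortho

Averaging the invariance term over all view pairs makes using more than two augmentations per image a drop-in change, which is the kind of modification the abstract argues should improve convergence.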
Deep PDE Solvers for Subgrid Modelling and Out-of-Distribution Generalization
Patrick Chatain
EuclidNets: An Alternative Operation for Efficient Inference of Deep Learning Models
Xinlin Li
Mariana Parazeres
Alireza Ghaffari
Masoud Asgharian
Vahid Nia
A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods
Tiago Salvador
Kilian FATRAS
Unsupervised Domain Adaptation (UDA) aims at classifying unlabeled target images leveraging source labeled ones. In the case of an extreme label shift scenario between the source and target domains, where we have extra source classes not present in the target domain, the UDA problem becomes a harder problem called Partial Domain Adaptation (PDA). While different methods have been developed to solve the PDA problem, most successful algorithms use model selection strategies that rely on target labels to find the best hyper-parameters and/or models along training. These strategies violate the main assumption in PDA: only unlabeled target domain samples are available. In addition, there are also experimental inconsistencies between developed methods - different architectures, hyper-parameter tuning, number of runs - yielding unfair comparisons. The main goal of this work is to provide a realistic evaluation of PDA methods under different model selection strategies and a consistent evaluation protocol. We evaluate 6 state-of-the-art PDA algorithms on 2 different real-world datasets using 7 different model selection strategies. Our two main findings are: (i) without target labels for model selection, the accuracy of the methods decreases up to 30 percentage points; (ii) only one method and model selection pair performs well on both datasets. Experiments were performed with our PyTorch framework, BenchmarkPDA, which we open source.
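The central experimental variable here is the model-selection strategy. As a hypothetical illustration (not code from BenchmarkPDA), the sketch below contrasts an oracle strategy that peeks at target labels with two label-free strategies of the kind the paper evaluates; the candidate interface (source_val_preds, target_preds, and so on) is assumed purely for the example.

    import numpy as np

    def select_model(candidates, strategy):
        # Each candidate is a trained model / hyper-parameter setting exposing
        # class-probability predictions on a held-out source split and on the
        # unlabeled target data.
        if strategy == "oracle":
            # Uses target labels: violates the PDA assumption, useful only as an upper bound.
            score = lambda c: (c.target_preds.argmax(1) == c.target_labels).mean()
        elif strategy == "source-val":
            # Label-free: accuracy on held-out labeled *source* data.
            score = lambda c: (c.source_val_preds.argmax(1) == c.source_val_labels).mean()
        elif strategy == "target-entropy":
            # Label-free: prefer confident (low-entropy) predictions on target data.
            def score(c):
                p = c.target_preds
                entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
                return -entropy.mean()
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        return max(candidates, key=score)

The accuracy gap the abstract reports (up to 30 percentage points) is the difference between the oracle branch and label-free branches such as the two shown here.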
A principled approach for generating adversarial images under non-smooth dissimilarity metrics
Aram-Alexandre Pooladian
Chris Finlay
Tim Hoheisel