Tegan Maharaj

Biography

I am an assistant professor at the Department of Decision Science at HEC Montréal.

The goal of my research is to contribute understanding and techniques to the growing science of responsible AI development, while usefully applying AI to high-impact ecological problems related to climate change, epidemiology, AI alignment and ecological impact assessments. My recent research has two themes: (1) using deep models for policy analysis and risk mitigation, and (2) designing data or unit test environments to empirically evaluate learning behaviour or simulate the deployment of AI systems. Please contact me if you are interested in collaborations in these areas.

I am generally interested in studying “what goes into” deep models—not only data, but also the broader learning environment (e.g., task design/specification, loss function and regularization) and the broader societal context of deployment (e.g., privacy considerations, trends and incentives, norms and human biases). I am concerned and passionate about AI ethics and safety, and the application of ML to environmental management, health and social welfare.

Current Students

Carol Altimas

Master's Research - Université de Montréal

Principal supervisor:

Étienne Laliberté

Yanis Bencheikh

Master's Research - HEC Montréal

Nilly Bozorgzad

Master's Research - HEC Montréal

Maël Simon

PhD - HEC Montréal

Heejun Yoon

PhD - HEC Montréal

Publications

The Singapore Consensus on Global AI Safety Research Priorities

Luke Ong

Stuart Russell

Dawn Song

Max Tegmark

Lan Xue

Ya-Qin Zhang

Stephen Casper

Wan Sie Lee

Sören Mindermann

Vanessa Wilfred

Vidhisha Balachandran

Fazl Barez

Michael Belinsky

Imane Bello

Malo Bourgon

Mark Brakel

Simeon Campos

Duncan Cass-Beggs

Jiahao Chen

Rumman Chowdhury

Kuan Chua Seah

Jeff Clune

Juntao Dai

Agnes Delaborde

Nouha Dziri

Francisco Eiras

Joshua Engels

Jinyu Fan

Adam Gleave

Noah Goodman

Fynn Heide

Johannes Heidecke

Dan Hendrycks

Cyrus Hodes

Bryan Low Kian Hsiang

Minlie Huang

Sami Jawhar

Wang Jingyu

Adam Tauman Kalai

Meindert Kamphuis

Mohan Kankanhalli

Subhash Kantamneni

Mathias Bonde Kirk

Thomas Kwa

Jeffrey Ladish

Kwok-Yan Lam

Wan Lee Sie

Taewhi Lee

Xiaojian Li

Jiajun Liu

Chaochao Lu

Yifan Mai

Richard Mallah

Julian Michael

Nick Moës

Simon Möller

Kihyuk Nam

Kwan Yee Ng

Mark Nitzberg

Besmira Nushi

Seán Ó hÉigeartaigh

Alejandro Ortega

Pierre Peigné

James Petrie

Benjamin Prud'homme

Reihaneh Rabbany

Nayat Sanchez-Pi

Sarah Schwettmann

Buck Shlegeris

Saad Siddiqui

Aradhana Sinha

Martín Soto

Cheston Tan

Dong Ting

William Tjhi

Robert Trager

Brian Tse

Anthony Tung K. H.

John Willes

Denise Wong

Wei Xu

Rongwu Xu

Yi Zeng

HongJiang Zhang

Djordje Žikelić

Rapidly improving AI capabilities and autonomy hold significant promise of transformation, but are also driving vigorous debate on how to ensure that AI is safe, i.e., trustworthy, reliable, and secure. Building a trusted ecosystem is therefore essential – it helps people embrace AI with confidence and gives maximal space for innovation while avoiding backlash. This requires policymakers, industry, researchers and the broader public to collectively work toward securing positive outcomes from AI’s development. AI safety research is a key dimension. Given that the state of science today for building trustworthy AI does not fully cover all risks, accelerated investment in research is required to keep pace with commercially driven growth in system capabilities. Goals: The 2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety aims to support research in this important space by bringing together AI scientists across geographies to identify and synthesise research priorities in AI safety. The result, The Singapore Consensus on Global AI Safety Research Priorities, builds on the International AI Safety Report (IAISR) chaired by Yoshua Bengio and backed by 33 governments. By adopting a defence-in-depth model, this document organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment), and challenges with monitoring and intervening after deployment (Control). Through the Singapore Consensus, we hope to globally facilitate meaningful conversations between AI scientists and AI policymakers for maximally beneficial outcomes. Our goal is to enable more impactful R&D efforts to rapidly develop safety and evaluation mechanisms and foster a trusted ecosystem where AI is harnessed for the public good.

2024-12-31

arXiv (preprint)

Quantifying Likeness: A Simple Machine Learning Approach to Identifying Copyright Infringement in (AI-Generated) Artwork

Michaela Drouillard

Ryan Spencer

Nikée Nantambu-Allen

This study proposes an approach aligned with the legal process to quantify copyright infringement, via stylistic similarity, in AI-generated artwork. In contrast to typical work in this field, and more in line with a realistic legal setting, our approach quantifies the similarity of a set of potentially-infringing “defendant” artworks to a set of copyrighted “plaintiff” artworks. We frame this as an image classification task, using a fine-tuned ResNet trained on small, customized datasets relevant to each use case. Softmax-normalized probabilities from the model serve as similarity scores for potentially infringing “defendant” artworks, and saliency maps and feature visualizations complement the score by highlighting key features and allowing for interpretability. This straightforward image classification approach can be accomplished in a quite simple, low-resource setting, making it accessible for real-world applications. We present a case study using Mickey Mouse as the plaintiff, performing thorough hyperparameter tuning and robustness analysis. Our experiments include optimizing batch size, weight decay, and learning rate, as well as exploring the impact of additional distractor classes. We employ data augmentation, cross-validation, and a linear decay learning rate scheduler to improve model performance, along with conducting scaling experiments with different types of distractor classes. The aims of this work are to illustrate the potential of the approach, and identify settings which generalize well, such that it is as "plug and play" as possible for users to apply with their own plaintiff sets of artworks.

2024-10-11

NeurIPS.cc/2024/Workshop/SafeGenAi (poster)

The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track

Eshta Bhardwaj

Harshit Gujral

Siyi Wu

Ciara Zogheib

Christoph Becker

Data curation is a field with origins in librarianship and archives, whose scholarship and thinking on data issues go back centuries, if not millennia. The field of machine learning is increasingly observing the importance of data curation to the advancement of both applications and fundamental understanding of machine learning models - evidenced not least by the creation of the Datasets and Benchmarks track itself. This work provides an analysis of dataset development practices at NeurIPS through the lens of data curation. We present an evaluation framework for dataset documentation, consisting of a rubric and toolkit developed through a literature review of data curation principles. We use the framework to assess the strengths and weaknesses in current dataset development practices of 60 datasets published in the NeurIPS Datasets and Benchmarks track from 2021-2023. We summarize key findings and trends. Results indicate greater need for documentation about environmental footprint, ethical considerations, and data management. We suggest targeted strategies and resources to improve documentation in these areas and provide recommendations for the NeurIPS peer-review process that prioritize rigorous data curation in ML. Finally, we provide results in the format of a dataset that showcases aspects of recommended data curation practices. Our rubric and results are of interest for improving data curation practices broadly in the field of ML as well as to data curation and science and technology studies scholars studying practices in ML. Our aim is to support continued improvement in interdisciplinary research on dataset practices, ultimately improving the reusability and reproducibility of new datasets and benchmarks, enabling standardized and informed human oversight, and strengthening the foundation of rigorous and responsible ML research.

2024-09-25

NeurIPS.cc/2024/Datasets_and_Benchmarks_Track (spotlight)

Meta- (out-of-context) learning in neural networks

Dmitrii Krasheninnikov

Egor Krasheninnikov

Bruno Mlodozeniec

David M. Krueger

Brown et al. (2020) famously introduced the phenomenon of in-context learning in large language models (LLMs). We establish the existence of a phenomenon we call meta-out-of-context learning (meta-OCL) via carefully designed synthetic experiments with LLMs. Our results suggest that meta-OCL leads LLMs to more readily "internalize" the semantic content of text that is, or appears to be, broadly useful (such as true statements, or text from authoritative sources) and use it in appropriate circumstances. We further demonstrate meta-OCL in a synthetic computer vision setting, and propose two hypotheses for the emergence of meta-OCL: one relying on the way models store knowledge in their parameters, and another suggesting that the implicit gradient alignment bias of gradient-descent-based optimizers may be responsible. Finally, we reflect on what our results might imply about capabilities of future AI systems, and discuss potential risks. Our code can be found at https://github.com/krasheninnikov/internalization.

2024-07-07

Proceedings of the 41st International Conference on Machine Learning (published)

proceedings.mlr.press

Machine Learning Data Practices through a Data Curation Lens: An Evaluation Framework

Eshta Bhardwaj

Harshit Gujral

Siyi Wu

Ciara Zogheib

Christoph Becker

Studies of dataset development in machine learning call for greater attention to the data practices that make model development possible and shape its outcomes. Many argue that the adoption of theory and practices from archives and data curation fields can support greater fairness, accountability, transparency, and more ethical machine learning. In response, this paper examines data practices in machine learning dataset development through the lens of data curation. We evaluate data practices in machine learning as data curation practices. To do so, we develop a framework for evaluating machine learning datasets using data curation concepts and principles through a rubric. Through a mixed-methods analysis of evaluation results for 25 ML datasets, we study the feasibility of data curation principles to be adopted for machine learning data work in practice and explore how data curation is currently performed. We find that researchers in machine learning, which often emphasizes model development, struggle to apply standard data curation principles. Our findings illustrate difficulties at the intersection of these fields, such as evaluating dimensions that have shared terms in both fields but non-shared meanings, a high degree of interpretative flexibility in adapting concepts without prescriptive restrictions, obstacles in limiting the depth of data curation expertise needed to apply the rubric, and challenges in scoping the extent of documentation dataset creators are responsible for. We propose ways to address these challenges and develop an overall framework for evaluation that outlines how data curation concepts and methods can inform machine learning data practices.

2024-06-04

The 2024 ACM Conference on Fairness, Accountability, and Transparency (published)

Methods, Applications, and Directions of Learning-to-Rank in NLP Research

Justin Lee

Gabriel Bernier-Colborne

Sowmya Vajjala

Learning-to-rank (LTR) algorithms aim to order a set of items according to some criteria. They are at the core of applications such as web search and social media recommendations, and are an area of rapidly increasing interest, with the rise of large language models (LLMs) and the widespread impact of these technologies on society. In this paper, we survey the diverse use cases of LTR methods in natural language processing (NLP) research, looking at previously under-studied aspects such as multilingualism in LTR applications and statistical significance testing for LTR problems. We also consider how large language models are changing the LTR landscape. This survey is aimed at NLP researchers and practitioners interested in understanding the formalisms and best practices regarding the application of LTR approaches in their research.

2024-05-31

Findings of the Association for Computational Linguistics: NAACL 2024 (published)

Beyond Predictive Algorithms in Child Welfare

Erina Seh-Young Moon

Erin Moon

Devansh Saxena

Shion Guha

2024-01-22

graphicsinterface.org/Graphics_Interface/2024/Conference (published)

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Usman Anwar

Abulhair Saparov

Javier Rando

Daniel Paleka

Miles Turpin

Peter Hase

Ekdeep Singh Lubana

Erik Jenner

Stephen Casper

Oliver Sourbut

Benjamin L. Edelman

Zhaowei Zhang

Mario Günther

Anton Korinek

Jose Hernandez-Orallo

Lewis Hammond

Eric Bigelow

Alexander Pan

Lauro Langosco

Tomasz Korbak

Heidi Zhang

Ruiqi Zhong

Seán Ó hÉigeartaigh

Gabriel Recchia

Giulio Corsi

Alan Chan

Markus Anderljung

Lilian Edwards

Aleksandar Petrov

Christian Schroeder de Witt

Sumeet Ramesh Motwani

Samuel Albanie

Danqi Chen

Philip H.S. Torr

Jakob Foerster

Florian Tramèr

He He

Atoosa Kasirzadeh

Yejin Choi

David Krueger

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose

2023-12-31

Trans. Mach. Learn. Res. (published)

Managing extreme AI risks amid rapid progress

Geoffrey Hinton

Andrew Yao

Dawn Song

Pieter Abbeel

Yuval Noah Harari

Trevor Darrell

Ya-Qin Zhang

Lan Xue

Shai Shalev-Shwartz

Gillian Hadfield

Jeff Clune

Frank Hutter

Atilim Güneş Baydin

Sheila McIlraith

Qiqi Gao

Ashwin Acharya

David Krueger

Anca Dragan

Philip Torr

Stuart Russell

Daniel Kahneman

Jan Brauner

Sören Mindermann

Artificial Intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI's impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although researchers have warned of extreme risks from AI, there is a lack of consensus about how exactly such risks arise, and how to manage them. Society's response, despite promising first steps, is incommensurate with the possibility of rapid, transformative progress that is expected by many experts. AI safety research is lagging. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems. In this short consensus paper, we describe extreme risks from upcoming, advanced AI systems. Drawing on lessons learned from other safety-critical technologies, we then outline a comprehensive plan combining technical research and development with proactive, adaptive governance mechanisms for a more commensurate preparation.

2023-10-25

Science (published)

Harms from Increasingly Agentic Algorithmic Systems

Alan Chan

Rebecca Salganik

Alva Markelius

Chris Pang

Nitarshan Rajkumar

Dmitrii Krasheninnikov

Lauro Langosco

Zhonghao He

Yawen Duan

Micah Carroll

Michelle Lin

Alex Mayhew

Katherine Collins

Maryam Molamohammadi

John Burden

Wanru Zhao

Shalaleh Rismani

Konstantinos Voudouris

Umang Bhatt

Adrian Weller

David Krueger

Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established many sources and forms of algorithmic harm, in domains as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed which threaten the perpetuation of the same harms and the creation of novel ones. In response, the FATE community has emphasized the importance of anticipating harms. Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify 4 key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms which arise from increasing agency -- notably, these include systemic and/or long-range impacts, often on marginalized stakeholders. We emphasize that recognizing agency of algorithmic systems does not absolve or shift the human responsibility for algorithmic harms. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on implications of our work for anticipating algorithmic harms from emerging systems.

2023-06-11

2023 ACM Conference on Fairness, Accountability, and Transparency (published)

Proactive Contact Tracing

Prateek Gupta

Martin Weiss

Nasim Rahaman

Hannah Alsdurf

Nanor Minoyan

Soren Harnois-Leblanc

Joanna Merckx

Andrew Williams

Victor Schmidt

Pierre-Luc St-Charles

Akshay Patel

Yang Zhang

David L. Buckeridge

Christopher Pal

Bernhard Schölkopf

The COVID-19 pandemic has spurred an unprecedented demand for interventions that can reduce disease spread without excessively restricting daily activity, given negative impacts on mental health and economic outcomes. Digital contact tracing (DCT) apps have emerged as a component of the epidemic management toolkit. Existing DCT apps typically recommend quarantine to all digitally-recorded contacts of test-confirmed cases. Over-reliance on testing may, however, impede the effectiveness of such apps, since by the time cases are confirmed through testing, onward transmissions are likely to have occurred. Furthermore, most cases are infectious over a short period; only a subset of their contacts are likely to become infected. These apps do not fully utilize data sources to base their predictions of transmission risk during an encounter, leading to recommendations of quarantine to many uninfected people and associated slowdowns in economic activity. This phenomenon, commonly termed a “pingdemic,” may additionally contribute to reduced compliance with public health measures. In this work, we propose a novel DCT framework, Proactive Contact Tracing (PCT), which uses multiple sources of information (e.g. self-reported symptoms, received messages from contacts) to estimate app users’ infectiousness histories and provide behavioral recommendations. PCT methods are by design proactive, predicting spread before it occurs. We present an interpretable instance of this framework, the Rule-based PCT algorithm, designed via a multi-disciplinary collaboration among epidemiologists, computer scientists, and behavior experts. Finally, we develop an agent-based model that allows us to compare different DCT methods and evaluate their performance in negotiating the trade-off between epidemic control and restricting population mobility.
Performing extensive sensitivity analysis across user behavior, public health policy, and virological parameters, we compare Rule-based PCT to i) binary contact tracing (BCT), which exclusively relies on test results and recommends a fixed-duration quarantine, and ii) household quarantine (HQ). Our results suggest that both BCT and Rule-based PCT improve upon HQ; however, Rule-based PCT is more efficient at controlling spread of disease than BCT across a range of scenarios. In terms of cost-effectiveness, we show that Rule-based PCT Pareto-dominates BCT, as demonstrated by a decrease in Disability Adjusted Life Years, as well as Temporary Productivity Loss. Overall, we find that Rule-based PCT outperforms existing approaches across a varying range of parameters. By leveraging anonymized infectiousness estimates received from digitally-recorded contacts, PCT is able to notify potentially infected users earlier than BCT methods and prevent onward transmissions. Our results suggest that PCT-based applications could be a useful tool in managing future epidemics.

2023-03-12

PLOS Digital Health (published)

Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics

Shoaib Ahmed Siddiqui

Nitarshan Rajkumar

David M. Krueger

Sara Hooker

Modern machine learning research relies on relatively few carefully curated datasets. Even in these datasets, and typically in 'untidy' or raw data, practitioners are faced with significant issues of data quality and diversity which can be prohibitively labor intensive to address. Existing methods for dealing with these challenges tend to make strong assumptions about the particular issues at play, and often require a priori knowledge or metadata such as domain labels. Our work is orthogonal to these methods: we instead focus on providing a unified and efficient framework for Metadata Archaeology -- uncovering and inferring metadata of examples in a dataset. We curate different subsets of data that might exist in a dataset (e.g. mislabeled, atypical, or out-of-distribution examples) using simple transformations, and leverage differences in learning dynamics between these probe suites to infer metadata of interest. Our method is on par with far more sophisticated mitigation methods across different tasks: identifying and correcting mislabeled examples, classifying minority-group samples, prioritizing points relevant for training and enabling scalable human auditing of relevant examples.

2023-01-31

ICLR.cc/2023/Conference (notable)