Publications

SandboxSocial: A Sandbox for Social Media Using Multimodal AI Agents
Gayatri Krishnakumar
Busra Tugce Gurbuz
Austin Welch
Hao Yu
Ethan Kosak-Hine
Tom Gibbs
Dan Zhao
The online information ecosystem enables influence campaigns of unprecedented scale and impact. We urgently need empirically grounded approaches to counter the growing threat of malicious campaigns, now amplified by generative AI. But, developing defenses in real-world settings is impractical. Social system simulations with agents modelled using Large Language Models (LLMs) are a promising alternative approach and a growing area of research. However, existing simulators lack features needed to capture the complex information-sharing dynamics of platform-based social networks. To bridge this gap, we present SandboxSocial, a new simulator that includes several key innovations, mainly: (1) a virtual social media platform (modelled as Mastodon and mirrored in an actual Mastodon server) that enables a realistic setting in which agents interact; (2) an adapter that uses real-world user data to create more grounded agents and social media content; and (3) multi-modal capabilities that enable our agents to interact using both text and images, just as humans do on social media. We make the simulator more useful to researchers by providing measurement and analysis tools that track simulation dynamics and compute evaluation metrics to compare experimental results.
Veracity: An Open-Source AI Fact-Checking System
The proliferation of misinformation poses a significant threat to society, exacerbated by the capabilities of generative AI. This demo paper introduces Veracity, an open-source AI system designed to empower individuals to combat misinformation through transparent and accessible fact-checking. Veracity leverages the synergy between Large Language Models (LLMs) and web retrieval agents to analyze user-submitted claims and provide grounded veracity assessments with intuitive explanations. Key features include multilingual support, numerical scoring of claim veracity, and an interactive interface inspired by familiar messaging applications. This paper will showcase Veracity's ability to not only detect misinformation but also explain its reasoning, fostering media literacy and promoting a more informed society.
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?
Michael Cohen
Joumana Ghosn
Adam Oberman
Jesse Richardson
Oliver Richardson
Marc-Antoine Rondeau
Pierre-Luc St-Charles
David Williams-King
The leading AI companies are increasingly focused on building generalist AI agents -- systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. We discuss how these risks arise from current AI training methods. Indeed, various scenarios and experiments have demonstrated the possibility of AI agents engaging in deception or pursuing goals that were not specified by human operators and that conflict with human interests, such as self-preservation. Following the precautionary principle, we see a strong need for safer, yet still useful, alternatives to the current agency-driven trajectory. Accordingly, we propose as a core building block for further advances the development of a non-agentic AI system that is trustworthy and safe by design, which we call Scientist AI. This system is designed to explain the world from observations, as opposed to taking actions in it to imitate or please humans. It comprises a world model that generates theories to explain data and a question-answering inference machine. Both components operate with an explicit notion of uncertainty to mitigate the risks of overconfident predictions. In light of these considerations, a Scientist AI could be used to assist human researchers in accelerating scientific progress, including in AI safety. In particular, our system can be employed as a guardrail against AI agents that might be created despite the risks involved. Ultimately, focusing on non-agentic AI may enable the benefits of AI innovation while avoiding the risks associated with the current trajectory. We hope these arguments will motivate researchers, developers, and policymakers to favor this safer path.
Deep learning reveals that multidimensional social status drives population variation in 11,875 US participant cohort
As an increasing realization, many behavioral relationships are interwoven with inherent variations in human populations. Presently, there is no clarity in the biomedical community on which sources of population variation are most dominant. The recent advent of population-scale cohorts like the Adolescent Brain Cognitive Development℠ Study (ABCD Study®) is now offering unprecedented depth and width of phenotype profiling that potentially explains interfamily differences. Here, we leveraged a deep learning framework (conditional variational autoencoder) on the totality of the ABCD Study® phenome (8,902 candidate phenotypes in 11,875 participants) to identify and characterize major sources of population stratification. 80% of the top 5 sources of explanatory stratifications were driven by distinct combinations of 202 available socioeconomic status (SES) measures, each in conjunction with a unique set of non-overlapping social and environmental factors. Several sources of variation across this cohort flagged geographies marked by material poverty interlocked with mental health and behavioral correlates. Deprivation emerged in another top stratification in relation to urbanicity and its ties to immigrant and racial and ethnic minoritized groups. Conversely, two other major sources of population variation were both driven by indicators of privilege: one highlighted measures of access to educational opportunity and income tied to healthy home environments and good behavior, the other profiled individuals of European ancestry leading advantaged lifestyles in desirable neighborhoods in terms of location and air quality. Overall, the disclosed social stratifications underscore the importance of treating SES as a multidimensional construct and recognizing its ties into social determinants of health.
Pathfinding: a neurodynamical account of intuition
Steven Kotler
Michael Mannino
Karl Friston
Gyorgy Buzsáki
J. A. Scott Kelso
WeDesign: Generative AI-Facilitated Community Consultations for Urban Public Space Design
Rashid A. Mushkani
Shin Koseki
LIPTER, a cardiomyocyte-enriched long noncoding RNA, controls cardiac cytoskeletal maturation and is regulated by a cardiomyocyte-specific enhancer.
George Anene Nzelu
Mick Lee
Svenja Koslowski
Wenhao Zheng
Marouane Benzaki
Michelle Mak
Xiaohua Wang
Lek Wen Tan
Albert Dashi
Talal Fawaz
Kenneth Ng
Davy Pham
Francis LeBlanc
Guillaume Lettre
Roger Foo
Cardiac development is characterized by a complex series of molecular, cytoskeletal and electrophysiological changes that guarantee the proper functioning of adult cardiomyocytes (CMs). These changes are defined by cell-type-specific transcriptional rewiring of progenitor cells to form CMs, and are regulated by various epigenetic elements, such as long noncoding RNAs (lncRNAs). LncRNAs are versatile epigenetic regulators as they may act in cis or in trans to orchestrate important gene programs during cardiac development and may concurrently encode micropeptides. LIPTER is one such lncRNA, previously shown to regulate lipid droplet transport in cardiomyocytes and thus an important regulator of cardiomyocyte metabolism. Here we show that LIPTER also plays a role in the cytoskeletal maturation of CMs, as loss of LIPTER leads to persistent expression of fetal genes, changes in chromatin accessibility, disorganized sarcomeres and impaired calcium homeostasis in CMs. Furthermore, we have identified a cardiomyocyte-specific regulatory enhancer that regulates the expression of LIPTER in CMs. CRISPR-mediated inhibition of this enhancer led to reduced LIPTER expression in CMs and increased expression of fetal genes. This CM-specific enhancer could therefore be manipulated to control the expression of LIPTER for therapeutic benefit. In summary, we have unravelled a novel role of LIPTER in CMs cytoskeletal maturation and have identified a CM-specific enhancer for LIPTER.
LIPTER, a cardiomyocyte-enriched long noncoding RNA, controls cardiac cytoskeletal maturation and is regulated by a cardiomyocyte-specific enhancer
Chukwuemeka George Anene-Nzelu
C.J Mick Lee
Svenja Koslowski
Wenhao Zheng
Marouane Benzaki
Michelle Mak
Xiaohua Wang
Lek Wen Tan
Albert Dashi
Talal Fawaz
Kenneth Ng
Davy Pham
Francis LeBlanc
Guillaume Lettre
Roger Foo
Multi-Modal Protein Representation Learning with CLASP
The Impact of a Pediatric Surgery Fundamentals Boot Camp on New Surgical Trainees' Perceived Knowledge and Confidence Levels.
Julia Ferreira
Simon Rahman
Fabio Botelho
Farhan Banji
W. A. Igrine
Gianluca Bertolizio
Sam Daniel
Thomas Engelhardt
Chantal Frigon
Lily H P Nguyen
Catherine Paquet
Pramod Puligandla
Hussein Wissanji
Davinia Withington
Yasmine Yousef
Sherif Emil
Adversarial Attack Classification and Robustness Testing for Large Language Models for Code
Yang Liu
Armstrong Foundjem
Heng Li
Large Language Models (LLMs) have become vital tools in software development tasks such as code generation, completion, and analysis. As their integration into workflows deepens, ensuring robustness against vulnerabilities, especially those triggered by diverse or adversarial inputs, becomes increasingly important. Such vulnerabilities may lead to incorrect or insecure code generation when models encounter perturbed task descriptions, code, or comments. Prior research often overlooks the role of natural language in guiding code tasks. This study investigates how adversarial perturbations in natural language inputs, including prompts, comments, and descriptions, affect LLMs for Code (LLM4Code). It examines the effects of perturbations at the character, word, and sentence levels to identify the most impactful vulnerabilities. We analyzed multiple projects (e.g., ReCode, OpenAttack) and datasets (e.g., HumanEval, MBPP), establishing a taxonomy of adversarial attacks. The first dimension classifies the input type (code, prompts, or comments), while the second dimension focuses on granularity: character, word, or sentence-level changes. We adopted a mixed-methods approach, combining quantitative performance metrics with qualitative vulnerability analysis. LLM4Code models show varying robustness across perturbation types. Sentence-level attacks were least effective, suggesting models are resilient to broader contextual changes. In contrast, word-level perturbations posed serious challenges, exposing semantic vulnerabilities. Character-level effects varied, showing model sensitivity to subtle syntactic deviations. Our study offers a structured framework for testing LLM4Code robustness and emphasizes the critical role of natural language in adversarial evaluation. Improving model resilience to semantic-level disruptions is essential for secure and reliable code-generation systems.