Publications
Simultaneous detection and estimation in olfactory sensing
The mammalian olfactory system shows an exceptional ability for rapid and accurate decoding of both the identity and concentration of odorants. Previous works have used the theory of compressed sensing to elucidate the algorithmic basis for this capability: decoding odor information from the responses of a restricted repertoire of receptors is possible because only a few relevant odorants are present in any given sensory scene. However, existing circuit models for olfactory decoding still cannot contend with the complexity of naturalistic olfactory scenes; they are limited to detection of a handful of odorants. Here, we propose a model for olfactory compressed sensing inspired by simultaneous localization and mapping algorithms in navigation: the set of odors that are present in a given scene, and the concentration of those present odors, are inferred separately. To enable rapid inference of odor presence in a biologically plausible recurrent circuit, our model leverages the framework of Mirrored Langevin Dynamics, which gives a general recipe for sampling from constrained distributions using rate-based dynamics. This results in a recurrent circuit model that can accurately infer presence and concentration at scale and can be mapped onto the primary cell types of the olfactory bulb. This framework offers a path towards circuit models, for olfactory sensing and beyond, that both perform well in naturalistic environments and make experimentally testable predictions for neural response dynamics.
The past decade has observed a significant advancement in AI, with deep learning‐based models being deployed in diverse scenarios, including safety‐critical applications. As these AI systems become deeply embedded in our societal infrastructure, the repercussions of their decisions and actions have significant consequences, making the ethical implications of AI deployment highly relevant and essential. The ethical concerns associated with AI are multifaceted, including challenging issues of fairness, privacy and data protection, responsibility and accountability, safety and robustness, transparency and explainability, and environmental impact. These principles together form the foundations of ethical AI considerations that concern every stakeholder in the AI system lifecycle. In light of the present ethical and future x‐risk concerns, governments have shown increasing interest in establishing guidelines for the ethical deployment of AI. This work unifies the current and future ethical concerns of deploying AI into society. While we acknowledge and appreciate the technical surveys for each of the ethical principles concerned, in this paper, we aim to provide a comprehensive overview that not only addresses each principle from a technical point of view but also discusses them from a social perspective.
Multiple myeloma (MM) and stem cell transplantation (SCT) significantly impact patients’ quality of life. Virtual reality with hypnosis (VRH) has emerged as a promising nonpharmacological intervention to address these challenges, yet data on its acceptability and user experience remain scarce. This study assessed the acceptability and user experience of a VRH intervention among adult patients with MM who had undergone allogeneic SCT. Participants used a VRH application and rated their experience through standardized questionnaires and semistructured interviews. Quantitative data were analyzed descriptively, and qualitative data underwent descriptive content analysis. Findings indicated high patient satisfaction, strong perceived relevance, and low cybersickness. Qualitative analysis revealed perceived emotional and psychological benefits. VRH was deemed particularly suitable during hospitalization and treatment periods. This study shows the potential of combining virtual reality and hypnosis for MM patients following SCT. Indeed, they showed high satisfaction levels, paving the way for further studies evaluating the clinical efficacy of such interventions.
Organizations are increasingly adopting and adapting Large Language Models (LLMs) hosted on public repositories such as HuggingFace. Although these adaptations often improve performance on specialized downstream tasks, recent evidence indicates that they can also degrade a model's safety or fairness. Since different fine-tuning techniques may exert distinct effects on these critical dimensions, this study undertakes a systematic assessment of their trade-offs. Four widely used Parameter-Efficient Fine-Tuning methods, LoRA, IA3, Prompt-Tuning, and P-Tuning, are applied to four instruction-tuned model families (Meta-Llama-3-8B, Qwen2.5-7B, Mistral-7B, and Gemma-7B). In total, 235 fine-tuned variants are evaluated across eleven safety hazard categories and nine demographic fairness dimensions. The results show that adapter-based approaches (LoRA, IA3) tend to improve safety scores and are the least disruptive to fairness, retaining higher accuracy and lower bias scores. In contrast, prompt-based methods (Prompt-Tuning and P-Tuning) generally reduce safety and cause larger fairness regressions, with decreased accuracy and increased bias. Alignment shifts are strongly moderated by base model type: LLaMA remains stable, Qwen records modest gains, Gemma experiences the steepest safety decline, and Mistral, which is released without an internal moderation layer, displays the greatest variance. Improvements in safety do not necessarily translate into improvements in fairness, and no single configuration optimizes all fairness metrics simultaneously, indicating an inherent trade-off between these objectives. These findings suggest a practical guideline for safety-critical deployments: begin with a well-aligned base model, favour adapter-based PEFT, and conduct category-specific audits of both safety and fairness.
We present WebMMU, a multilingual benchmark that evaluates three core web tasks: (1) website visual question answering, (2) code editing involving HTML/CSS/JavaScript, and (3) mockup-to-code generation. Unlike prior benchmarks that treat these tasks separately, WebMMU unifies them using expert-annotated, real-world web data to assess models' abilities in complex multi-step reasoning, precise element grounding, and functional UI comprehension and coding. Our evaluation shows that while multimodal large language models (MLLMs) perform well on basic information extraction, they struggle with reasoning and grounding, editing code to preserve functionality, and generating design-to-code that maintains hierarchy and supports multilingual content. These findings reveal key limitations in current MLLMs and underscore the need for improved multimodal and cross-lingual reasoning to build future web agents capable of automating diverse web development tasks.
2025-10-31
Conference on Empirical Methods in Natural Language Processing (published)
Most refrigerants currently used in air-conditioning systems, such as hydrofluorocarbons, are potent greenhouse gases and are being phased down. Large-scale molecular screening has been applied to the search for alternatives, but in practice only about 300 refrigerants are known, and only a few additional candidates have been suggested without experimental validation. This scarcity of reliable data limits the effectiveness of purely data-driven methods. We present Refgen, a generative pipeline that integrates machine learning with physics-grounded inductive biases. Alongside fine-tuning for valid molecular generation, Refgen incorporates predictive models for critical properties, equations of state, thermochemical polynomials, and full vapor compression cycle simulations. These models enable reinforcement learning fine-tuning under thermodynamic constraints, enforcing consistency and guiding discovery toward molecules that balance efficiency, safety, and environmental impact. By embedding physics into the learning process, Refgen leverages scarce data effectively and enables de novo refrigerant discovery beyond the known set of compounds.
2025-10-30
SIMBIOCHEM @ Neural Information Processing Systems (published)