NLP in the era of generative AI, cognitive sciences, and societal transformation
Join us at Mila in October for a three-day workshop to explore the transformative potential of language technologies and their implications for society.
This program is designed to provide decision-makers, policymakers and professional working in policy with a foundational understanding of AI technology.
We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
Publications
Black-Box Access is Insufficient for Rigorous AI Audits
External audits of AI systems are increasingly recognized as a key mechanism for AI governance. The effectiveness of an audit, however, depe… (see more)nds on the degree of system access granted to auditors. Recent audits of state-of-the-art AI systems have primarily relied on black-box access, in which auditors can only query the system and observe its outputs. However, white-box access to the system's inner workings (e.g., weights, activations, gradients) allows an auditor to perform stronger attacks, more thoroughly interpret models, and conduct fine-tuning. Meanwhile, outside-the-box access to its training and deployment information (e.g., methodology, code, documentation, hyperparameters, data, deployment details, findings from internal evaluations) allows for auditors to scrutinize the development process and design more targeted evaluations. In this paper, we examine the limitations of black-box audits and the advantages of white- and outside-the-box audits. We also discuss technical, physical, and legal safeguards for performing these audits with minimal security risks. Given that different forms of access can lead to very different levels of evaluation, we conclude that (1) transparency regarding the access and methods used by auditors is necessary to properly interpret audit results, and (2) white- and outside-the-box access allow for substantially more scrutiny than black-box access alone.
Ensuring reliable confidence scores from deep neural networks is of paramount significance in critical decision-making systems, particularly… (see more) in real-world domains such as healthcare. Recent literature on calibrating deep segmentation networks has resulted in substantial progress. Nevertheless, these approaches are strongly inspired by the advancements in classification tasks, and thus their uncertainty is usually modeled by leveraging the information of individual pixels, disregarding the local structure of the object of interest. Indeed, only the recent Spatially Varying Label Smoothing (SVLS) approach considers pixel spatial relationships across classes, by softening the pixel label assignments with a discrete spatial Gaussian kernel. In this work, we first present a constrained optimization perspective of SVLS and demonstrate that it enforces an implicit constraint on soft class proportions of surrounding pixels. Furthermore, our analysis shows that SVLS lacks a mechanism to balance the contribution of the constraint with the primary objective, potentially hindering the optimization process. Based on these observations, we propose NACL (Neighbor Aware CaLibration), a principled and simple solution based on equality constraints on the logit values, which enables to control explicitly both the enforced constraint and the weight of the penalty, offering more flexibility. Comprehensive experiments on a wide variety of well-known segmentation benchmarks demonstrate the superior calibration performance of the proposed approach, without affecting its discriminative power. Furthermore, ablation studies empirically show the model agnostic nature of our approach, which can be used to train a wide span of deep segmentation networks.
Dimensionality reduction-based data visualization is pivotal in comprehending complex biological data. The most common methods, such as PHAT… (see more)E, t-SNE, and UMAP, are unsupervised and therefore reflect the dominant structure in the data, which may be independent of expert-provided labels. Here we introduce a supervised data visualization method called RF-PHATE, which integrates expert knowledge for further exploration of the data. RF-PHATE leverages random forests to capture intricate featurelabel relationships. Extracting information from the forest, RF-PHATE generates low-dimensional visualizations that highlight relevant data relationships while disregarding extraneous features. This approach scales to large datasets and applies to classification and regression. We illustrate RF-PHATE’s prowess through three case studies. In a multiple sclerosis study using longitudinal clinical and imaging data, RF-PHATE unveils a sub-group of patients with non-benign relapsingremitting Multiple Sclerosis, demonstrating its aptitude for time-series data. In the context of Raman spectral data, RF-PHATE effectively showcases the impact of antioxidants on diesel exhaust-exposed lung cells, highlighting its proficiency in noisy environments. Furthermore, RF-PHATE aligns established geometric structures with COVID-19 patient outcomes, enriching interpretability in a hierarchical manner. RF-PHATE bridges expert insights and visualizations, promising knowledge generation. Its adaptability, scalability, and noise tolerance underscore its potential for widespread adoption.
We introduce PhotoBot, a framework for fully automated photo acquisition based on an interplay between high-level human language guidance an… (see more)d a robot photographer. We propose to communicate photography suggestions to the user via reference images that are selected from a curated gallery. We leverage a visual language model (VLM) and an object detector to characterize the reference images via textual descriptions and then use a large language model (LLM) to retrieve relevant reference images based on a user's language query through text-based reasoning. To correspond the reference image and the observed scene, we exploit pre-trained features from a vision transformer capable of capturing semantic similarity across marked appearance variations. Using these features, we compute pose adjustments for an RGB-D camera by solving a perspective-n-point (PnP) problem. We demonstrate our approach using a manipulator equipped with a wrist camera. Our user studies show that photos taken by PhotoBot are often more aesthetically pleasing than those taken by users themselves, as measured by human feedback. We also show that PhotoBot can generalize to other reference sources such as paintings.