Prognostic data extraction harnessing a privacy-preserving large language model: a clinician-AI collaborative retrospective evaluation in head and neck oncology
Privacy regulations and limited expert validation constrain the deployment of large language models (LLMs) for electronic health record structuring. We evaluated locally deployed LLMs to extract 30 prognostic variables from 1,360 head and neck cancer reports (882 patients) using zero-shot prompting. A stratified 50-case subset was reviewed by three radiation oncologists (50 cases, 30 fields, 3 reviewers; 4,500 decisions) to form a majority-vote reference for Llama3.3-70B, which achieved 98.6% F1 with high clinician agreement and processed reports at 53 s/report. Among seven additional models (2.6B-70B) benchmarked against this reference, GPT-OSS-20.9B (F1 89.4%) and MedGemma-27B (F1 88.5%) performed best. Integrating LLM-extracted HPV status, smoking history, and Charlson Comorbidity Score into a multivariate Cox Proportional Hazards model (age, sex, T/N stage) improved disease-free survival (likelihood ratio test p = 0.014; ΔC-index +0.071) and locoregional failure-free survival (p = 0.026; ΔC-index +0.108) with 1,000-bootstrap internal validation. This clinician-AI collaborative evaluation shows that on-premises LLMs enable privacy-preserving and efficient tumour board support, longitudinal data curation, and outcome prediction.
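The majority-vote reference and the F1 comparison described in the abstract above can be illustrated with a small sketch. It is not the study's pipeline: the function names, the toy labels, and the handling of fields without a strict majority are all assumptions made for illustration, since the abstract does not describe them.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by a strict majority of reviewers, else None.

    Assumption: how the study resolves a three-way disagreement is not
    stated, so this sketch simply reports None in that case.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def f1_score(reference, predicted, positive):
    """F1 of `predicted` against `reference` for one positive class."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == positive and p == positive for r, p in pairs)
    fp = sum(r != positive and p == positive for r, p in pairs)
    fn = sum(r == positive and p != positive for r, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one field, four cases, three reviewers per case.
reviews = [
    ("pos", "pos", "neg"),
    ("neg", "neg", "neg"),
    ("pos", "pos", "pos"),
    ("neg", "pos", "neg"),
]
reference = [majority_vote(r) for r in reviews]
model_output = ["pos", "neg", "neg", "neg"]
score = f1_score(reference, model_output, positive="pos")

# Size of the review effort reported in the abstract.
decisions = 50 * 30 * 3
```

With three reviewers and binary labels a strict majority always exists; the `None` branch only matters for fields with more than two possible values.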
While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thereby increasing data privacy. Diversity in representation space can be vital to a model's adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, unconstrained mutual information (MI) maximization may amplify weak hypotheses. Thus, we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD), where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
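The abstract above does not give the exact form of its MI objective. As a rough illustration only, the sketch below computes the information-maximization estimate commonly used in unsupervised adaptation: the entropy of the average prediction minus the average entropy of individual predictions. The function names are hypothetical, and neither the WHP regularizer nor PD's weighted variant is reproduced here.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def mutual_information(probs):
    """Estimate I(X; Y) from per-sample class probabilities.

    probs: list of rows, each row a softmax output over the same classes.
    The estimate is H(mean prediction) - mean(H(prediction)).
    """
    n = len(probs)
    k = len(probs[0])
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    conditional = sum(entropy(row) for row in probs) / n
    return entropy(marginal) - conditional


# Confident and balanced predictions give the maximum for two classes, log(2).
mi_confident_balanced = mutual_information([[1.0, 0.0], [0.0, 1.0]])
# Uninformative predictions give zero.
mi_uniform = mutual_information([[0.5, 0.5], [0.5, 0.5]])
# Confident but collapsed onto one class also gives zero.
mi_collapsed = mutual_information([[1.0, 0.0], [1.0, 0.0]])
```

Maximizing this quantity rewards predictions that are individually confident and collectively spread across classes. Because it rewards confidence whether or not the predictions are correct, an unconstrained version can reinforce a poor hypothesis, which is the risk the abstract attributes to unconstrained MI maximization.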
With neural networks applied to safety-critical applications, it has become increasingly important to understand the defining features of decision-making. Therefore, the need to open up these black boxes and uncover a rational representational space for these neural networks is apparent. Concept bottleneck models (CBMs) encourage interpretability by predicting human-understandable concepts. They predict concepts from input images and then labels from concepts. Test-time intervention, a salient feature of CBMs, allows for human-model interactions. However, these interactions are prone to information leakage and can often be ineffective or inappropriate as communication with humans. We propose a novel uncertainty-based strategy, SIUL: Single Interventional Uncertainty Learning, to select the interventions. Additionally, we empirically test the robustness of CBMs and the effect of SIUL interventions under adversarial attack and distributional shift. Using SIUL, we observe that the suggested interventions lead to meaningful corrections along with mitigation of concept leakage. Extensive experiments on three vision datasets along with a histopathology dataset validate the effectiveness of our interventional learning.
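The idea of a test-time intervention in a concept bottleneck model can be sketched as follows. This is not SIUL, whose selection criterion the abstract does not specify. It is a generic heuristic that picks the single concept with the highest binary entropy and replaces it with an expert-provided value. The linear label head, its weights, and all function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli concept prediction with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def most_uncertain_concept(concept_probs):
    """Index of the concept whose predicted probability is least certain."""
    return max(range(len(concept_probs)),
               key=lambda i: binary_entropy(concept_probs[i]))


def intervene(concept_probs, true_concepts, index):
    """Return a copy with one concept replaced by its expert-provided value."""
    corrected = list(concept_probs)
    corrected[index] = float(true_concepts[index])
    return corrected


def predict_label(concept_probs, weights, bias):
    """Hypothetical linear label head applied to the concept layer."""
    score = bias + sum(w * c for w, c in zip(weights, concept_probs))
    return int(score > 0)


# Hypothetical example: three concepts, the second one is uncertain and wrong.
concept_probs = [0.95, 0.55, 0.10]
true_concepts = [1, 0, 0]
weights, bias = [1.0, 2.0, -1.0], -1.5

label_before = predict_label(concept_probs, weights, bias)
index = most_uncertain_concept(concept_probs)
label_after = predict_label(intervene(concept_probs, true_concepts, index),
                            weights, bias)
```

In this toy case a single correction of the most uncertain concept flips the predicted label, which is the kind of targeted human-model interaction that test-time intervention is meant to enable.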