
Golnoosh Farnadi

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, School of Computer Science
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Visiting Faculty Researcher, Google
Research Topics
Deep Learning
Generative Models

Biography

Golnoosh Farnadi is an assistant professor at the School of Computer Science, McGill University, and an adjunct professor at Université de Montréal. She is a core academic member of Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair.

Farnadi founded and is a principal investigator of the EQUAL lab at Mila / McGill University. The EQUAL lab (EQuity & EQuality Using AI and Learning algorithms) is a cutting-edge research laboratory dedicated to advancing the fields of algorithmic fairness and responsible AI.

Current Students

PhD - HEC Montréal
Postdoctorate - McGill University
PhD - McGill University
Co-supervisor :
Master's Research - McGill University
Co-supervisor :
Master's Research - Université de Montréal
Principal supervisor :
Collaborating researcher - UWindsor
PhD - McGill University
Co-supervisor :
Collaborating researcher - McGill University
Collaborating Alumni - Université de Montréal
Collaborating researcher - McGill University
Research Intern - McGill University
Independent visiting researcher - McGill University
Research Intern - McGill University
PhD - McGill University
Co-supervisor :
Postdoctorate - McGill University
PhD - Université de Montréal
Co-supervisor :
Master's Research - McGill University

Publications

Intrinsic Meets Extrinsic Fairness: Assessing the Downstream Impact of Bias Mitigation in Large Language Models
Large Language Models (LLMs) are increasingly deployed in sensitive domains such as finance, where intrinsic representational biases can propagate into extrinsic harms in downstream tasks. High-stakes applications such as credit scoring are especially vulnerable, as biased model behavior can reinforce existing inequities and result in harmful disparities across demographic groups \cite{blodgett2020language}. While prior research has questioned whether intrinsic bias truly translates into extrinsic unfairness \cite{goldfarb2020intrinsic}, this connection remains poorly understood. To address this gap, we propose a four-stage evaluation framework that systematically examines the relationship between intrinsic and extrinsic fairness. In Stage 1, we establish a baseline by training models such as logistic regression, LLM embeddings, and fine-tuned classifiers without any mitigation strategy, providing reference points for fairness and accuracy. In Stage 2, we evaluate task-level mitigation through Counterfactual Data Augmentation (CDA) \cite{gallegos2024bias}, which balances gender representation by generating counterfactual training instances, allowing us to assess improvements in extrinsic fairness. In Stage 3, we adapt concept unlearning \cite{dige2024mitigating} as an intrinsic bias mitigation method, encouraging LLMs to forget socioeconomic stereotypes while preserving fluency and predictive utility, and we evaluate how this intervention impacts downstream fairness. Finally, in Stage 4, we combine CDA with unlearning to test whether dual mitigation further enhances fairness. We conduct experiments on three datasets (Adult Census Income, ACS Employment, and German Credit) using instruction-tuned LLMs (LLaMA-3.1, Phi-3, and Gemma-2) in both frozen embedding and fine-tuned classifier settings, evaluating performance with predictive accuracy and group fairness metrics, including Demographic Parity, Accuracy Parity, and Equality of Odds.
Our experiments demonstrate that intrinsic bias mitigation through unlearning is highly effective; in Phi-3, for instance, it reduces gender socioeconomic stereotype gaps by 94.9% while maintaining language fluency. In downstream tasks, unlearning consistently improves group fairness metrics while preserving predictive accuracy, whereas CDA primarily enhances demographic parity but can introduce accuracy trade-offs. For instance, on the ACS Employment dataset, unlearned Gemma-2 improved Accuracy Parity from 0.199 to 0.104 (48% gain), and combining CDA with unlearning on Llama-3.1 reduced Demographic Parity from 0.080 to 0.014 (82% gain). On the Adult dataset, all three models maintained accuracy above 0.82 while showing reduced fairness gaps, and on German Credit, unlearning consistently outperformed CDA by improving group fairness metrics without sacrificing predictive performance. Overall, CDA and unlearning exhibit complementary effects, with their combination yielding the strongest fairness improvements across models and datasets. This work contributes to bias mitigation and fairness in LLMs in two ways. First, we adapt concept unlearning to mitigate socioeconomic stereotyping, showing that intrinsic bias reduction improves both representational and downstream fairness. Second, we introduce a unified evaluation framework that links intrinsic and extrinsic fairness, enabling systematic comparison of mitigation strategies. The framework is flexible, applying to both fine-tuned and frozen LLMs, and offers actionable guidance for deploying fairer models in finance and other high-stakes domains.
Neither Valid Nor Reliable? Investigating the Use of LLMs as Judges
Reasoning with Preference Constraints: A Benchmark for Language Models in Many-to-One Matching Markets
Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework
Cléa Chataigner
Rebecca Ma
Elliot Creager
Towards Democratizing LLMs: Investigating Multilingual Mixture-of-Experts Models
Adaptation, Comparison and Practical Implementation of Fairness Schemes in Kidney Exchange Programs
In Kidney Exchange Programs (KEPs), each participating patient is registered together with an incompatible donor. Donors without an incompatible patient can also register. Then, KEPs typically maximize overall patient benefit through donor exchanges. This aggregation of benefits calls into question potential individual patient disparities in terms of access to transplantation in KEPs. Considering solely this utilitarian objective may become an issue in the case where multiple exchange plans are optimal or near-optimal. In fact, current KEP policies are all-or-nothing, meaning that only one exchange plan is determined. Each patient is either selected or not as part of that unique solution. In this work, we seek instead to find a policy that considers the probability of each patient being in a solution. To guide the determination of our policy, we adapt popular fairness schemes to KEPs to balance the usual approach of maximizing the utilitarian objective. Different combinations of fairness and utilitarian objectives are modelled as conic programs with an exponential number of variables. We propose a column generation approach to solve them effectively in practice. Finally, we make an extensive comparison of the different schemes in terms of the balance of utility and fairness score, and validate the scalability of our methodology for benchmark instances from the literature.
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Low-Rank Adaptation Secretly Imitates Differentially Private SGD
As pre-trained language models grow in size, fully fine-tuning their parameters on task adaptation data becomes increasingly impractical. To address this challenge, some methods for low-rank adaptation of language models have been proposed, e.g. LoRA, which incorporates trainable low-rank decomposition matrices into only some parameters of the pre-trained model, called adapters. This approach significantly reduces the number of trainable parameters compared to fine-tuning all parameters or adapters. In this work, we look at low-rank adaptation methods through the lens of data privacy. We show theoretically that the low-rank adaptation used in LoRA is equivalent to fine-tuning adapters with noisy batch gradients, just as the DPSGD algorithm does. We also quantify the variance of the injected noise as a decreasing function of adaptation rank. By establishing a Berry-Esseen type bound on the total variation distance between the injected noise distribution and a Gaussian noise distribution with the same variance, we show that the dynamics of low-rank adaptation are very close to those of DPSGD performed w.r.t. the adapters. Following our theoretical findings, which are supported by our experimental results, we show that low-rank adaptation provides robustness to membership inference attacks w.r.t. the fine-tuning data.
Algorithmic Fairness Through the Lens of Metrics and Evaluation (AFME) 2024
Miriam Rateike
Awa Dieng
Jamelle Watson-Daniels
Ferdinando Fioretto
Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML
Prakhar Ganeesh
Usman Gohar
Lu Cheng
With fairness concerns gaining significant attention in Machine Learning (ML), several bias mitigation techniques have been proposed, often compared against each other to find the best method. These benchmarking efforts tend to use a common setup for evaluation under the assumption that providing a uniform environment ensures a fair comparison. However, bias mitigation techniques are sensitive to hyperparameter choices, random seeds, feature selection, etc., meaning that comparison on just one setting can unfairly favour certain algorithms. In this work, we show significant variance in fairness achieved by several algorithms and the influence of the learning pipeline on fairness scores. We highlight that most bias mitigation techniques can achieve comparable performance, given the freedom to perform hyperparameter optimization, suggesting that the choice of the evaluation parameters, rather than the mitigation technique itself, can sometimes create the perceived superiority of one method over another. We hope our work encourages future research on how various choices in the lifecycle of developing an algorithm impact fairness, and trends that guide the selection of appropriate algorithms.
Multilingual Hallucination Gaps
Cléa Chataigner
Privacy-Preserving Group Fairness in Cross-Device Federated Learning
Sikha Pentyala
Nicola Neophytou
Anderson Nascimento
Martine De Cock
Group fairness ensures that the outcomes of machine learning (ML) based decision making systems are not biased towards a certain group of people defined by a sensitive attribute such as gender or ethnicity. Achieving group fairness in Federated Learning (FL) is challenging because mitigating bias inherently requires using the sensitive attribute values of all clients, while FL is aimed precisely at protecting privacy by not giving access to the clients’ data. As we show in this paper, this conflict between fairness and privacy in FL can be resolved by combining FL with Secure Multiparty Computation (MPC) and Differential Privacy (DP). To this end, we propose a privacy-preserving approach to calculate group fairness notions in the cross-device FL setting. Then, we propose two bias mitigation pre-processing and post-processing techniques in cross-device FL under formal privacy guarantees, without requiring the clients to disclose their sensitive attribute values. Empirical evaluations on real world datasets demonstrate the effectiveness of our solution to train fair and accurate ML models in federated cross-device setups with privacy guarantees to the users.
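
Several of the abstracts above report group fairness metrics such as Demographic Parity and Accuracy Parity. As a rough illustration only, the sketch below computes two such gaps in plain Python for a binary sensitive attribute, in the clear, with none of the MPC or DP protections the last abstract describes. The function names, the 0/1 group coding, and the toy data are assumptions made for this example and are not taken from any of the papers. The helper `relative_reduction` just reproduces the arithmetic behind a "gain" figure such as the move from 0.199 to 0.104.

```python
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    """Average of a sequence, or 0.0 if it is empty."""
    return sum(values) / len(values) if values else 0.0


def demographic_parity_gap(y_pred: Sequence[int], group: Sequence[int]) -> float:
    """Absolute difference in positive-prediction rate between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        rates.append(_mean(preds))
    return abs(rates[0] - rates[1])


def accuracy_parity_gap(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> float:
    """Absolute difference in accuracy between groups 0 and 1."""
    accuracies = []
    for g in (0, 1):
        hits = [int(t == p) for t, p, a in zip(y_true, y_pred, group) if a == g]
        accuracies.append(_mean(hits))
    return abs(accuracies[0] - accuracies[1])


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a gap shrank, e.g. 0.5 means the gap was halved."""
    return (before - after) / before


if __name__ == "__main__":
    # Toy data, invented for illustration (hypothetical, not from any dataset).
    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 1, 0, 0]
    group = [0, 0, 0, 0, 1, 1, 1, 1]

    print("Demographic parity gap:", demographic_parity_gap(y_pred, group))
    print("Accuracy parity gap:", accuracy_parity_gap(y_true, y_pred, group))
    # Relative change for a gap that moves from 0.199 to 0.104.
    print("Relative reduction:", round(relative_reduction(0.199, 0.104), 3))
```

A gap of 0 means both groups are treated identically under that metric; smaller values are better. The papers may define or aggregate these metrics differently, so this should be read as a sketch of the general idea rather than their evaluation code.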