Aditi Khandelwal

PhD - McGill University
Supervisor
Co-supervisor
Research Topics
AI and Culture
AI Ethics
Natural Language Processing
Responsible AI

Publications

DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning
Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly "thinking" about a problem before providing an answer. This reasoning process is publicly available to the user, creating endless opportunities for studying the reasoning behaviour of the model and opening up the field of Thoughtology. Starting from a taxonomy of DeepSeek-R1's basic building blocks of reasoning, our analyses on DeepSeek-R1 investigate the impact and controllability of thought length, management of long or confusing contexts, cultural and safety concerns, and the status of DeepSeek-R1 vis-à-vis cognitive phenomena, such as human-like language processing and world modelling. Our findings paint a nuanced picture. Notably, we show DeepSeek-R1 has a 'sweet spot' of reasoning, where extra inference time can impair model performance. Furthermore, we find a tendency for DeepSeek-R1 to persistently ruminate on previously explored problem formulations, obstructing further exploration. We also note strong safety vulnerabilities of DeepSeek-R1 compared to its non-reasoning counterpart, which can also compromise safety-aligned LLMs.
Unlearning Geo-Cultural Stereotypes in Multilingual LLMs
As multilingual generative models become more widely used, most safety and fairness evaluation techniques still focus on English-language resources, while overlooking important cross-cultural factors. This limitation raises concerns about fairness and safety, particularly regarding geoculturally situated stereotypes that hinder the models’ global inclusivity. In this work, we present preliminary findings on the impact of stereotype unlearning across languages, specifically in English, French, and Hindi. Using an adapted version of the SeeGULL dataset, we analyze how unlearning stereotypes in one language influences other languages within multilingual large language models. Our study evaluates two model families, Llama-3.1-8B and Aya-Expanse-8B, to assess whether unlearning in one linguistic context transfers across languages, potentially mitigating or exacerbating biases in multilingual settings.