A central question in the study of language change is whether or not such change is generational. If a language changes over time generation-by-generation, the process looks as follows: New generations of speakers introduce innovations, while older speakers conserve their usage patterns, and the language changes as new generations replace older ones. At the opposite extreme, language change could be a zeitgeist phenomenon, in which changes are universally adopted by speakers simultaneously, regardless of age or generational cohort. This paper asks this question in the context of word meaning change. We analyze meaning change in over 100 words across more than 7.9 million U.S. congressional speeches, to observe whether, when a word sense rises or falls in prominence, adult speakers from different generations uniformly adopt it, or those from older generations conserve their prior usage. Using language model-based word sense induction methods, we identify different senses of each word, and then model the prevalence of each of these word senses as a function of time and speaker age. We find that most words show a small but statistically significant effect of speaker age; across almost 140 years of Congress, older speakers typically take longer than younger speakers to follow changes in word usage, but nevertheless do so within a few years. Our findings indicate that despite minor age-based differences, word meaning change among mature speakers is likely not a generational process, but rather a zeitgeist process, in which older adult speakers can readily adopt new word usage patterns.
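The kind of model the abstract describes can be illustrated with a minimal sketch: a logistic regression predicting whether a given usage of a word employs its newer sense, from the year of the speech and the speaker's birth year. All data here are synthetic and all variable names are hypothetical; this is not the paper's code or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: each row is one usage of a word in a speech.
n = 5000
year = rng.uniform(1880, 2020, n)           # year of the speech
birth_year = year - rng.uniform(30, 80, n)  # speaker's birth year

# Simulate a "zeitgeist with a small age effect": the newer sense
# spreads over time, with older speakers lagging slightly behind.
logit = 0.08 * (year - 1950) + 0.02 * (birth_year - (year - 55))
uses_new_sense = rng.random(n) < 1 / (1 + np.exp(-logit))

# Fit a logistic regression by gradient descent on the two
# (rescaled) predictors plus an intercept.
X = np.column_stack([np.ones(n),
                     (year - 1950) / 10,
                     (birth_year - 1900) / 10])
y = uses_new_sense.astype(float)
w = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / n

# A positive weight on the year term means later speeches favour the
# newer sense; a positive weight on birth year means younger speakers
# adopt it earlier, holding the year of the speech fixed.
print(w)
```

In this toy setup, a small positive birth-year coefficient alongside a large time coefficient corresponds to the paper's finding: a modest age effect layered on top of a change that all speakers eventually follow.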
2025-07-27
Proceedings of the National Academy of Sciences (published)
Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly "thinking" about a problem before providing an answer. This reasoning process is publicly available to the user, creating endless opportunities for studying the reasoning behaviour of the model and opening up the field of Thoughtology. Starting from a taxonomy of DeepSeek-R1's basic building blocks of reasoning, our analyses on DeepSeek-R1 investigate the impact and controllability of thought length, management of long or confusing contexts, cultural and safety concerns, and the status of DeepSeek-R1 vis-à-vis cognitive phenomena, such as human-like language processing and world modelling. Our findings paint a nuanced picture. Notably, we show DeepSeek-R1 has a 'sweet spot' of reasoning, where extra inference time can impair model performance. Furthermore, we find a tendency for DeepSeek-R1 to persistently ruminate on previously explored problem formulations, obstructing further exploration. We also note strong safety vulnerabilities of DeepSeek-R1 compared to its non-reasoning counterpart, which can also compromise safety-aligned LLMs.
Sentences containing multiple semantic operators with overlapping scope often create ambiguities in interpretation, known as scope ambiguities. These ambiguities offer rich insights into the interaction between semantic structure and world knowledge in language processing. Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models -- GPT-2, GPT-3/3.5, Llama 2 and GPT-4 -- treat scope ambiguous sentences, and compare this with human judgments. We introduce novel datasets that contain a joint total of almost 1,000 unique scope-ambiguous sentences, containing interactions between a range of semantic operators, and annotated for human judgments. Using these datasets, we find evidence that several models (i) are sensitive to the meaning ambiguity in these sentences, in a way that patterns well with human judgments, and (ii) can successfully identify human-preferred readings at a high level of accuracy (over 90% in some cases).
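What a scope ambiguity is can be made concrete with a toy example (this illustrates the phenomenon only, not the paper's datasets or methods; the sentence and the miniature world below are invented for illustration). "Every climber scaled a mountain" has a surface-scope reading (each climber scaled some, possibly different, mountain) and an inverse-scope reading (there is one mountain that every climber scaled), and the two can diverge in truth value:

```python
# Toy world for "Every climber scaled a mountain".
climbers = {"al", "bo", "cy"}
mountains = {"k2", "everest"}
scaled = {("al", "k2"), ("bo", "k2"), ("cy", "everest")}

# Surface scope (every > a): each climber scaled some mountain.
surface = all(any((c, m) in scaled for m in mountains) for c in climbers)

# Inverse scope (a > every): some one mountain was scaled by every climber.
inverse = any(all((c, m) in scaled for c in climbers) for m in mountains)

print(surface, inverse)  # True False: the readings come apart here
```

Worlds like this, where exactly one reading is true, are what make it possible to test which interpretation a model (or a human) prefers.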