Publications
What Are They Doing? Joint Audio-Speech Co-Reasoning
In audio and speech processing, tasks usually focus on either the audio or speech modality, even when both sounds and human speech are present in the same audio clip. Recent Auditory Large Language Models (ALLMs) have made it possible to process audio and speech simultaneously within a single model, leading to further considerations of joint audio-speech tasks.
In this paper, we establish a novel benchmark to investigate how well ALLMs can perform joint audio-speech processing. Specifically, we introduce Joint Audio-Speech Co-Reasoning (JASCO), a novel task that unifies audio and speech processing, strictly requiring co-reasoning across both modalities. We also release a scene-reasoning dataset called "What Are They Doing". Additionally, we provide deeper insights into the models' behaviors by analyzing their dependence on each modality.
2025-04-05
ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)
Degenerative cervical myelopathy (DCM) is the most common cause of spinal dysfunction globally. Despite surgical intervention, motor dysfunction may persist in many patients. The purpose of this study was to comprehensively examine specific spinal cord tract changes in patients with DCM, to better understand potential substrates for compensatory recovery of function.
Cervical spinal cord MRI scans with diffusion tensor imaging were performed in patients with DCM and in healthy volunteers. Spinal Cord Toolbox was used to register the PAM50 template, which includes a probabilistic atlas of the white matter tracts of the spinal cord, to the imaging data. Fractional anisotropy (FA) was extracted for each tract at C3 above the level of maximal compression and compared between patients with DCM and healthy volunteers and between patients with mild vs moderate to severe DCM.
We included 25 patients with DCM (13 mild and 12 moderate to severe) and 6 healthy volunteers. FA was significantly reduced in DCM subjects relative to healthy volunteers for the lateral corticospinal tract (mild DCM vs healthy ∆ = −0.13, P = .018; moderate to severe DCM vs healthy ∆ = −0.11, P = .047), fasciculus gracilis (mild DCM vs healthy ∆ = −0.16, P = .010; moderate to severe DCM vs healthy ∆ = −0.13, P = .039), and fasciculus cuneatus (mild DCM vs healthy ∆ = −0.16, P = .007; moderate to severe DCM vs healthy ∆ = −0.15, P = .012). There were no differences in FA for any tract between mild and moderate-to-severe DCM subjects.
Patients with DCM had altered diffusion tensor imaging signal in their lateral corticospinal tract, fasciculus gracilis, and fasciculus cuneatus in comparison with healthy volunteers. These findings indicate that DCM is characterized by injury to these structures, which suggests that other tracts within the cord could potentially act as substrates for compensatory motor recovery.
Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly "thinking" about a problem before providing an answer. This reasoning process is publicly available to the user, creating endless opportunities for studying the reasoning behaviour of the model and opening up the field of Thoughtology. Starting from a taxonomy of DeepSeek-R1's basic building blocks of reasoning, our analyses on DeepSeek-R1 investigate the impact and controllability of thought length, management of long or confusing contexts, cultural and safety concerns, and the status of DeepSeek-R1 vis-à-vis cognitive phenomena, such as human-like language processing and world modelling. Our findings paint a nuanced picture. Notably, we show DeepSeek-R1 has a 'sweet spot' of reasoning, where extra inference time can impair model performance. Furthermore, we find a tendency for DeepSeek-R1 to persistently ruminate on previously explored problem formulations, obstructing further exploration. We also note strong safety vulnerabilities of DeepSeek-R1 compared to its non-reasoning counterpart, which can also compromise safety-aligned LLMs.