Publications

Tackling bias in AI health datasets through the STANDING Together initiative

Shaswath Ganapathi

Johannes Palmer

J. Alderman

Melanie Calvert

Cyrus Espinoza

Jacqui Gath

Marzyeh Ghassemi

Katherine Heller

Francis McKay

Alan Karthikesalingam

S. Kuku

Maxine E. Mackintosh

Sinduja Manohar

Bilal Mateen

Rubeta Matin

Melissa D. McCradden

Lauren Oakden-Rayner

Johan Ordish

Russell Pearson

S. Pfohl

Negar Rostamzadeh

Elizabeth Sapey

Neil J. Sebire

Viknesh Sounderajah

Charlotte Summers

Darren E. Treanor

Alastair Denniston

Xiaoxuan Liu

2022-09-25

Nature Network Boston (published)

doi.org

The 5-year longitudinal diagnostic profile and health services utilization of patients treated with electroconvulsive therapy in Quebec: a population-based study

Simon Lafrenière

Fatemeh Gholi-Zadeh-Kharrat

Caroline Sirois

Victoria Massamba

Louis Rochette

Camille Brousseau-Paradis

Simon Patry

Christian Gagné

Morgane Lemasson

Geneviève Gariépy

Chantal Mérette

Elham Rahme

Alain Lesage

2022-09-25

Social Psychiatry and Psychiatric Epidemiology (published)

doi.org

OSSEM: One-Shot Speaker Adaptive Speech Enhancement Using Meta Learning

Cheng Yu

Szu-wei Fu

Tsun-An Hsieh

Yu Tsao

Mirco Ravanelli

Although deep learning (DL) has achieved notable progress in speech enhancement (SE), further research is still required for a DL-based SE system to adapt effectively and efficiently to particular speakers. In this study, we propose a novel meta-learning-based speaker-adaptive SE approach (called OSSEM) that aims to achieve SE model adaptation in a one-shot manner. OSSEM consists of a modified transformer SE network and a speaker-specific masking (SSM) network. In practice, the SSM network takes an enrolled speaker embedding extracted using ECAPA-TDNN to adjust the input noisy feature through masking. To evaluate OSSEM, we designed a modified Voice Bank-DEMAND dataset, in which one utterance from the testing set was used for model adaptation, and the remaining utterances were used for testing the performance. Moreover, we set restrictions allowing the enhancement process to be conducted in real time, and thus designed OSSEM to be a causal SE system. Experimental results first show that OSSEM can effectively adapt a pretrained SE model to a particular speaker with only one utterance, thus yielding improved SE results. Meanwhile, OSSEM exhibits a competitive performance compared to state-of-the-art causal SE systems.

2022-09-17

Interspeech 2022 (published)

doi.org

arxiv.org

SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation

Artem Ploujnikov

Mirco Ravanelli

End-to-end speech synthesis models directly convert the input characters into an audio representation (e.g., spectrograms). Despite their impressive performance, such models have difficulty disambiguating the pronunciations of identically spelled words. To mitigate this issue, a separate Grapheme-to-Phoneme (G2P) model can be employed to convert the characters into phonemes before synthesizing the audio. This paper proposes SoundChoice, a novel G2P architecture that processes entire sentences rather than operating at the word level. The proposed architecture takes advantage of a weighted homograph loss (that improves disambiguation), exploits curriculum learning (that gradually switches from word-level to sentence-level G2P), and integrates word embeddings from BERT (for further performance improvement). Moreover, the model inherits the best practices in speech recognition, including multi-task learning with Connectionist Temporal Classification (CTC) and beam search with an embedded language model. As a result, SoundChoice achieves a Phoneme Error Rate (PER) of 2.65% on whole-sentence transcription using data from LibriSpeech and Wikipedia. Index Terms: grapheme-to-phoneme, speech synthesis, text-to-speech, phonetics, pronunciation, disambiguation.

2022-09-17

Interspeech 2022 (published)

doi.org

arxiv.org

Intervertebral Disc Labeling With Learning Shape Information, A Look Once Approach

Reza Azad

Moein Heidari

Julien Cohen-Adad

Ehsan Adeli

Dorit Merhof

Accurate and automatic segmentation of intervertebral discs from medical images is a critical task for the assessment of spine-related diseases such as osteoporosis, vertebral fractures, and intervertebral disc herniation. To date, various approaches have been developed in the literature that routinely rely on detecting the discs as the primary step. A disadvantage of many cohort studies is that the localization algorithm also yields false-positive detections. In this study, we aim to alleviate this problem by proposing a novel U-Net-based structure to predict a set of candidates for intervertebral disc locations. In our design, we integrate the image shape information (image gradients) to encourage the model to learn rich and generic geometrical information. This additional signal guides the model to selectively emphasize the contextual representation and suppress the less discriminative features. On the post-processing side, to further decrease the false-positive rate, we propose a permutation-invariant 'look once' model, which accelerates the candidate recovery procedure. In comparison with previous studies, our proposed approach does not need to perform the selection in an iterative fashion. The proposed method was evaluated on the spine generic public multi-center dataset and demonstrated superior performance compared to previous work. We have provided the implementation code at https://github.com/rezazad68/intervertebral-lookonce

2022-09-15

Predictive Intelligence in Medicine (published)

doi.org

arxiv.org

Video Game Bad Smells: What They Are and How Developers Perceive Them

Vittoria Nardone

Biruk Asmare Muse

Mouna Abidi

Foutse Khomh

Massimiliano Di Penta

Video games represent a substantial and increasing share of the software market. However, their development is particularly challenging as it requires multi-faceted knowledge, which is not yet consolidated in computer science education. This article aims at defining a catalog of bad smells related to video game development. To achieve this goal, we mined discussions on general-purpose and video game-specific forums. After querying these forums, we adopted an open coding strategy on a statistically significant sample of 572 discussions, stratified over different forums. As a result, we obtained a catalog of 28 bad smells, organized into five categories, covering problems related to game design and logic, physics, animation, rendering, or multiplayer. Then, we assessed the perceived relevance of such bad smells by surveying 76 game development professionals. The survey respondents agreed with the identified bad smells but also provided us with further insights about the discussed smells. Upon reporting results, we discuss bad smell examples, their consequences, as well as possible mitigation/fixing strategies and trade-offs to be pursued by developers. The catalog can be used not only as a guideline for developers and educators but can also pave the way toward better automated tool support for video game developers.

2022-09-14

ACM Transactions on Software Engineering and Methodology (published)

doi.org

Accurate machine learning prediction of sexual orientation based on brain morphology and intrinsic functional connectivity

Benjamin Clemens

Jeremy Lefort-Besnard

Christoph Ritter

Elke Smith

Mikhail Votinov

Birgit Derntl

Ute Habel

Danilo Bzdok

Sexual orientation in humans represents a multilevel construct that is grounded in both neurobiological and environmental factors. Here, we bring to bear a machine learning approach to predict sexual orientation from gray matter volumes (GMVs) or resting-state functional connectivity (RSFC) in a cohort of 45 heterosexual and 41 homosexual participants. In both brain assessments, we used penalized logistic regression models and nonparametric permutation. We found an average accuracy of 62% (±6.72) for predicting sexual orientation based on GMV and an average predictive accuracy of 92% (±9.89) using RSFC. Regions in the precentral gyrus, precuneus and the prefrontal cortex were significantly informative for distinguishing heterosexual from homosexual participants in both the GMV and RSFC settings. These results indicate that, aside from self-reports, RSFC offers neurobiological information valuable for highly accurate prediction of sexual orientation. We demonstrate for the first time that sexual orientation is reflected in specific patterns of RSFC, which enable personalized, brain-based predictions of this highly complex human trait. While these results are preliminary, our neurobiologically based prediction framework illustrates the great value and potential of RSFC for revealing biologically meaningful and generalizable predictive patterns in the human brain.

2022-09-13

Cerebral Cortex (unknown)

doi.org

Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization

The ability to accelerate the design of biological sequences can have a substantial impact on the progress of the medical field. The problem can be framed as a global optimization problem where the objective is an expensive black-box function that can be queried in large batches but only over a small number of rounds. Bayesian Optimization is a principled method for tackling this problem. However, the astronomically large state space of biological sequences renders brute-force iteration over all possible sequences infeasible. In this paper, we propose MetaRLBO, where we train an autoregressive generative model via Meta-Reinforcement Learning to propose promising sequences for selection via Bayesian Optimization. We pose this problem as that of finding an optimal policy over a distribution of MDPs induced by sampling subsets of the data acquired in the previous rounds. Our in-silico experiments show that meta-learning over such ensembles provides robustness against reward misspecification and achieves competitive results compared to existing strong baselines.

2022-09-12

ArXiv (preprint)

doi.org

arxiv.org

Measuring Commonality in Recommendation of Cultural Content: Recommender Systems to Enhance Cultural Citizenship

Andres Ferraro

Gustavo Ferreira

Fernando Diaz

Georgina Born

2022-09-12

Proceedings of the 16th ACM Conference on Recommender Systems (published)

doi.org

arxiv.org

Towards Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity

Kiwan Maeng

Haiyu Lu

Luca Melis

John Nguyen

Michael G. Rabbat

Carole-Jean Wu

2022-09-12

Proceedings of the 16th ACM Conference on Recommender Systems (published)

doi.org

arxiv.org

Rapidly Inferring Personalized Neurostimulation Parameters with Meta-Learning: A Case Study of Individualized Fiber Recruitment in Vagus Nerve Stimulation

Ximeng Mao

Yao-Chuan Chang

Stavros Zanos

Guillaume Lajoie

Our meta-learning framework is general and can be adapted to many input-response neurostimulation mapping problems. Moreover, this method leverages information from growing data sets of past patients, as a treatment is deployed. It can also be combined with several model types, including regression, Gaussian processes with Bayesian optimization, and beyond.

2022-09-07

bioRxiv (preprint)

doi.org

Unifying Generative Models with GFlowNets

Dinghuai Zhang

Ricky T. Q. Chen

Nikolay Malkin

Yoshua Bengio

There are many frameworks for deep generative modeling, each often presented with its own specific training algorithms and inference methods. Here, we demonstrate the connections between existing deep generative models and the recently introduced GFlowNet framework, a probabilistic inference machine which treats sampling as a decision-making process. This analysis sheds light on their overlapping traits and provides a unifying viewpoint through the lens of learning with Markovian trajectories. Our framework provides a means for unifying training and inference algorithms, and provides a route to shine a unifying light over many generative models. Beyond this, we provide a practical and experimentally verified recipe for improving generative modeling with insights from the GFlowNet perspective.

2022-09-05

ArXiv (preprint)

doi.org

arxiv.org
