Publications

Dissociable brain structural asymmetry patterns reveal unique phenome-wide profiles

Karin Saltoun

Ralph Adolphs

Lynn K. Paul

Vaibhav Sharma

Joern Diedrichsen

B. T. Thomas Yeo

Danilo Bzdok

2022-11-06

Nature Human Behaviour (unknown)

Object-centric causal representation learning

Amin Mansouri

Jason Hartford

Kartik Ahuja

2022-11-06

NeurIPS.cc/2022/Workshop/NeurReps (poster)

openreview.net

PaReco: patched clones and missed patches among the divergent variants of a software family

Poedjadevie Kadjel Ramkisoen

John Businge

Brent van Bladel

Alexandre Decan

Serge Demeyer

Coen De Roover

Foutse Khomh

Re-using whole repositories as a starting point for new projects is often done by maintaining a variant fork parallel to the original. However, the common artifacts between both are not always kept up to date. As a result, patches are not optimally integrated across the two repositories, which may lead to sub-optimal maintenance between the variant and the original project. A bug existing in both repositories can be patched in one but not the other (we see this as a missed opportunity) or it can be manually patched in both probably by different developers (we see this as effort duplication). In this paper we present a tool (named PaReCo) which relies on clone detection to mine cases of missed opportunity and effort duplication from a pool of patches. We analyzed 364 (source to target) variant pairs with 8,323 patches resulting in a curated dataset containing 1,116 cases of effort duplication and 1,008 cases of missed opportunities. We achieve a precision of 91%, recall of 80%, accuracy of 88%, and F1-score of 85%. Furthermore, we investigated the time interval between patches and found out that, on average, missed patches in the target variants have been introduced in the source variants 52 weeks earlier. Consequently, PaReCo can be used to manage variability in “time” by automatically identifying interesting patches in later project releases to be backported to supported earlier releases.
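As a quick consistency check on the metrics quoted in this abstract, the F1-score is the harmonic mean of precision and recall. The sketch below is illustrative only: `f1_score` is a hypothetical helper written for this note, not part of the PaReCo tool.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both expressed in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract: precision 91%, recall 80%.
reported_f1 = f1_score(0.91, 0.80)
print(f"{reported_f1:.2%}")
```

Rounded to a whole percent, the result agrees with the 85% F1-score stated in the abstract.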

2022-11-06

ESEC/SIGSOFT FSE (published)

Laurence Perreault-Levasseur

Posterior samples of source galaxies in strong gravitational lenses with score-based priors

Alexandre Adam

Adam Coogan

Nikolay Malkin

Ronan Legin

Yashar Hezaveh

Inferring accurate posteriors for high-dimensional representations of the brightness of gravitationally-lensed sources is a major challenge, in part due to the difficulties of accurately quantifying the priors. Here, we report the use of a score-based model to encode the prior for the inference of undistorted images of background galaxies. This model is trained on a set of high-resolution images of undistorted galaxies. By adding the likelihood score to the prior score and using a reverse-time stochastic differential equation solver, we obtain samples from the posterior. Our method produces independent posterior samples and models the data almost down to the noise level. We show how the balance between the likelihood and the prior meets our expectations in an experiment with out-of-distribution data.
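The idea of adding the likelihood score to the prior score can be illustrated on a one-dimensional toy problem. The sketch below is not the authors' method: it assumes an analytic standard-normal prior in place of a learned score model, a Gaussian likelihood, and plain unadjusted Langevin dynamics in place of a reverse-time SDE solver. All function names and parameter values are hypothetical.

```python
import math
import random


def prior_score(x: float) -> float:
    """Score (gradient of the log density) of a standard normal prior N(0, 1)."""
    return -x


def likelihood_score(x: float, y: float, sigma: float) -> float:
    """Score of the Gaussian likelihood N(y; x, sigma^2) with respect to x."""
    return (y - x) / sigma**2


def sample_posterior(y, sigma, n_chains=2000, n_steps=500, step=0.01, seed=0):
    """Draw approximate posterior samples, one per independent Langevin chain."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * step)
    samples = []
    for _ in range(n_chains):
        x = 0.0
        for _ in range(n_steps):
            # Posterior score = prior score + likelihood score (Bayes' rule in log space).
            score = prior_score(x) + likelihood_score(x, y, sigma)
            x += step * score + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


samples = sample_posterior(y=1.0, sigma=0.5)
posterior_mean = sum(samples) / len(samples)
posterior_var = sum((s - posterior_mean) ** 2 for s in samples) / len(samples)
print(round(posterior_mean, 2), round(posterior_var, 2))
```

For this toy model the exact posterior is Gaussian with mean 0.8 and variance 0.2, so the sample statistics should land close to those values, up to Monte Carlo error and the small discretization bias of the Langevin step.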

2022-11-06

arXiv (preprint)

Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes

Mizu Nishikawa-Toomey

Tristan Deleu

Jithendaraa Subramanian

Laurent Charlin

Bayesian causal structure learning aims to learn a posterior distribution over directed acyclic graphs (DAGs), and the mechanisms that define the relationship between parent and child variables. By taking a Bayesian approach, it is possible to reason about the uncertainty of the causal model. The notion of modelling the uncertainty over models is particularly crucial for causal structure learning since the model could be unidentifiable when given only a finite amount of observational data. In this paper, we introduce a novel method to jointly learn the structure and mechanisms of the causal model using Variational Bayes, which we call Variational Bayes-DAG-GFlowNet (VBG). We extend the method of Bayesian causal structure learning using GFlowNets to learn not only the posterior distribution over the structure, but also the parameters of a linear-Gaussian model. Our results on simulated data suggest that VBG is competitive against several baselines in modelling the posterior over DAGs and mechanisms, while offering several advantages over existing methods, including the guarantee to sample acyclic graphs, and the flexibility to generalize to non-linear causal mechanisms.

2022-11-03

arXiv (preprint)

Spectral Regularization: an Inductive Bias for Sequence Modeling

Kaiwen Hou

Guillaume Rabusseau

Hou Rabusseau

2022-11-03

arXiv (preprint)

Adult neurogenesis acts as a neural regularizer

Lina M. Tran

Adam Santoro

Lulu Liu

Sheena A. Josselyn

Blake A. Richards

Paul W. Frankland

New neurons are continuously generated in the subgranular zone of the dentate gyrus throughout adulthood. These new neurons gradually integrate into hippocampal circuits, forming new naive synapses. Viewed from this perspective, these new neurons may represent a significant source of “wiring” noise in hippocampal networks. In machine learning, such noise injection is commonly used as a regularization technique. Regularization techniques help prevent overfitting training data and allow models to generalize learning to new, unseen data. Using a computational modeling approach, here we ask whether a neurogenesis-like process similarly acts as a regularizer, facilitating generalization in a category learning task. In a convolutional neural network (CNN) trained on the CIFAR-10 object recognition dataset, we modeled neurogenesis as a replacement/turnover mechanism, where weights for a randomly chosen small subset of hidden layer neurons were reinitialized to new values as the model learned to categorize 10 different classes of objects. We found that neurogenesis enhanced generalization on unseen test data compared to networks with no neurogenesis. Moreover, neurogenic networks either outperformed or performed similarly to networks with conventional noise injection (i.e., dropout, weight decay, and neural noise). These results suggest that neurogenesis can enhance generalization in hippocampal learning through noise injection, expanding on the roles that neurogenesis may have in cognition.
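The replacement/turnover mechanism described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the hidden layer is represented as a plain list of weight rows rather than a CNN layer, and the function name `apply_neurogenesis`, the replacement fraction, and the initialization scale are all hypothetical choices.

```python
import random


def apply_neurogenesis(weights, fraction=0.1, init_std=0.05, rng=None):
    """Reinitialize the incoming weights of a random subset of hidden units.

    `weights` holds one row of incoming weights per hidden unit and is
    modified in place. Returns the indices of the units that were replaced.
    """
    rng = rng or random.Random()
    n_units = len(weights)
    # Replace at least one unit, but never more than the layer contains.
    n_replace = min(n_units, max(1, int(fraction * n_units)))
    replaced = rng.sample(range(n_units), n_replace)
    for unit in replaced:
        # A "newborn" unit starts from fresh random weights, discarding what it learned.
        weights[unit] = [rng.gauss(0.0, init_std) for _ in weights[unit]]
    return replaced


rng = random.Random(0)
hidden = [[1.0] * 4 for _ in range(20)]  # 20 hidden units with 4 inputs each
replaced = apply_neurogenesis(hidden, fraction=0.1, rng=rng)
print(len(replaced), "of", len(hidden), "units reinitialized")
```

In a training loop, a call like this would be interleaved with ordinary gradient updates so that a small, randomly chosen part of the layer is periodically reset while the rest keeps learning.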

2022-11-01

Proceedings of the National Academy of Sciences of the United States of America (published)

Automatic measure and normalization of spinal cord cross-sectional area using the pontomedullary junction

Sandrine Bédard

Julien Cohen‐Adad

Spinal cord cross-sectional area (CSA) is a relevant biomarker to assess spinal cord atrophy in neurodegenerative diseases. However, the considerable inter-subject variability among healthy participants currently limits its usage. Previous studies explored factors contributing to the variability, yet the normalization models required manual intervention and used vertebral levels as a reference, which is an imprecise prediction of the spinal levels. In this study we implemented a method to measure CSA automatically from a spatial reference based on the central nervous system (the pontomedullary junction, PMJ), we investigated factors to explain variability, and developed normalization strategies on a large cohort (N = 804). Following automatic spinal cord segmentation, vertebral labeling and PMJ labeling, the spinal cord CSA was computed on T1w MRI scans from the UK Biobank database. The CSA was computed using two methods. For the first method, the CSA was computed at the level of the C2–C3 intervertebral disc. For the second method, the CSA was computed at 64 mm caudally from the PMJ, this distance corresponding to the average distance between the PMJ and the C2–C3 disc across all participants. The effect of various demographic and anatomical factors was explored, and a stepwise regression found significant predictors; the coefficients of the best fit model were used to normalize CSA. CSA measured at C2–C3 disc and using the PMJ differed significantly (paired t-test, p-value = 0.0002). The best normalization model included thalamus, brain volume, sex and the interaction between brain volume and sex. The coefficient of variation went down for PMJ CSA from 10.09 (without normalization) to 8.59%, a reduction of 14.85%. For CSA at C2–C3, it went down from 9.96 to 8.42%, a reduction of 15.13%. This study introduces an end-to-end automatic pipeline to measure and normalize cord CSA from a neurological reference. This approach requires further validation to assess atrophy in longitudinal studies. The inter-subject variability of CSA can be partly accounted for by demographics and anatomical factors.

2022-11-01

Frontiers in Neuroimaging (published)

A General-Purpose Neural Architecture for Geospatial Systems

Nasim Rahaman

Martin Weiss

Frederik Träuble

Francesco Locatello

Alexandre Lacoste

Chris Pal

Li Erran Li

Bernhard Schölkopf

2022-11-01

OpenReview (unknown)

openreview.net

Active Keyword Selection to Track Evolving Topics on Twitter

Sacha Lévy

Farimah Poursafaei

Kellin Pelrine

Reihaneh Rabbany

How can we study social interactions on evolving topics at a mass scale? Over the past decade, researchers from diverse fields such as economics, political science, and public health have often done this by querying Twitter's public API endpoints with hand-picked topical keywords to search or stream discussions. However, despite the API's accessibility, it remains difficult to select and update keywords to collect high-quality data relevant to topics of interest. In this paper, we propose an active learning method for rapidly refining query keywords to increase both the yielded topic relevance and dataset size. We leverage a large open-source COVID-19 Twitter dataset to illustrate the applicability of our method in tracking Tweets around the key sub-topics of Vaccine, Mask, and Lockdown. Our experiments show that our method achieves an average topic-related keyword recall 2x higher than baselines. We open-source our code along with a web interface for keyword selection to make data collection from Twitter more systematic for researchers.

2022-10-31

2022 IEEE International Conference on Data Mining Workshops (ICDMW) (published)

Consistent Training via Energy-Based GFlowNets for Modeling Discrete Joint Distributions

Chanakya Ajit Ekbote

Moksh J. Jain

Payel Das