Adriana Romero Soriano

Core Industry Member
Canada CIFAR AI Chair
Adjunct professor, McGill University, School of Computer Science
Research Scientist, Meta AI Research (FAIR)
Research Topics
Computer Vision
Deep Learning
Generative Models

Biography

Adriana Romero-Soriano is a research scientist on the Fundamental AI Research (FAIR) team at Meta, an adjunct professor at McGill University, a core industry member of Mila – Quebec Artificial Intelligence Institute, and a Canada CIFAR AI Chair.

Romero-Soriano’s research lies at the intersection of generative models, computer vision and responsible AI. Her most recent work focuses on improving the quality, controllability, consistency and representation diversity of visual content creation systems.

She received her PhD from the University of Barcelona, where she worked with Carlo Gatta, and then spent two years as a postdoctoral researcher at Mila with Yoshua Bengio.

Current Students

Collaborating researcher - Université de Montréal
PhD - McGill University
Principal supervisor :
PhD - McGill University
Principal supervisor :
PhD - McGill University
Principal supervisor :

Publications

Diet Networks: Thin Parameters for Fat Genomics
Learning tasks such as those involving genomic data often pose a serious challenge: the number of input features can be orders of magnitude larger than the number of training examples, making it difficult to avoid overfitting, even when using known regularization techniques. We focus here on tasks in which the input is a description of the genetic variation specific to a patient, the single nucleotide polymorphisms (SNPs), yielding millions of ternary inputs. Improving the ability of deep learning to handle such datasets could have an important impact in medical research, more specifically in precision medicine, where high-dimensional data regarding a particular patient is used to make predictions of interest. Even though the amount of data for such tasks is increasing, this mismatch between the number of examples and the number of inputs remains a concern. Naive implementations of classifier neural networks involve a huge number of free parameters in their first layer (number of input features times number of hidden units): each input feature is associated with as many parameters as there are hidden units. We propose a novel neural network parametrization which considerably reduces the number of free parameters. It is based on the idea that we can first learn or provide a distributed representation for each input feature (e.g. for each position in the genome where variations are observed in data), and then learn (with another neural network called the parameter prediction network) how to map a feature's distributed representation (based on the feature's identity, not its value) to the vector of parameters specific to that feature in the classifier neural network (the weights which link the value of the feature to each of the hidden units). This approach views the problem of producing the parameters associated with each feature as a multi-task learning problem.
We show experimentally on a population stratification task of interest to medical studies that the proposed approach can significantly reduce both the number of parameters and the error rate of the classifier.
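The parameter-saving idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sizes, the random feature embeddings, and the use of a single linear map as the parameter prediction network are all assumptions made for illustration, and the function names are hypothetical.

```python
import random

def predict_first_layer(embeddings, pred_weights):
    """Map each feature embedding (length k) to that feature's weight vector (length h).

    embeddings is d x k, pred_weights is k x h, and the result is d x h.
    A single linear map stands in for the parameter prediction network (assumption).
    """
    k = len(pred_weights)
    h = len(pred_weights[0])
    return [[sum(e[i] * pred_weights[i][u] for i in range(k)) for u in range(h)]
            for e in embeddings]

def hidden_preactivation(x, first_layer):
    """Compute x @ W for an input vector x (length d) and predicted weights W (d x h)."""
    h = len(first_layer[0])
    return [sum(x[j] * first_layer[j][u] for j in range(len(x))) for u in range(h)]

def count_params(d, h, k):
    """Free parameters in the first layer: naive (d * h) versus predicted (k * h)."""
    return {"naive": d * h, "predicted": k * h}

rng = random.Random(0)
d, h, k = 1000, 50, 8  # toy sizes (assumption); real SNP inputs number in the millions

# Per-feature embeddings are "provided" here as random vectors (assumption).
embeddings = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]
# The only free parameters of the first layer: the k x h prediction map.
pred_weights = [[rng.gauss(0, 0.1) for _ in range(h)] for _ in range(k)]

x = [rng.choice([0, 1, 2]) for _ in range(d)]  # one ternary input, as with SNP data

W = predict_first_layer(embeddings, pred_weights)
z = hidden_preactivation(x, W)
counts = count_params(d, h, k)
print(counts)
```

In this sketch the first-layer weights are never stored as free parameters; they are recomputed from the embeddings, so the count of learned values depends on the embedding size rather than on the number of input features.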
A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images
David Vazquez
Jorge Bernal
F. Javier Sánchez
Gloria Fernández-Esparrach
Antonio M. López
Colorectal cancer (CRC) is the third leading cause of cancer death worldwide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search of polyps, and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are the polyp miss rate and the inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing decision support systems (DSS) that help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image segmentation, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 relevant classes to inspect the endoluminal scene, targeting different clinical needs. Together with the dataset and taking advantage of advances in the semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCNs). We perform a comparative study to show that FCNs significantly outperform, without any further postprocessing, prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization.
Diet Networks: Thin Parameters for Fat Genomics
Learning tasks such as those involving genomic data often pose a serious challenge: the number of input features can be orders of magnitude larger than the number of training examples, making it difficult to avoid overfitting, even when using known regularization techniques. We focus here on tasks in which the input is a description of the genetic variation specific to a patient, the single nucleotide polymorphisms (SNPs), yielding millions of ternary inputs. Improving the ability of deep learning to handle such datasets could have an important impact in precision medicine, where high-dimensional data regarding a particular patient is used to make predictions of interest. Even though the amount of data for such tasks is increasing, this mismatch between the number of examples and the number of inputs remains a concern. Naive implementations of classifier neural networks involve a huge number of free parameters in their first layer: each input feature is associated with as many parameters as there are hidden units. We propose a novel neural network parametrization which considerably reduces the number of free parameters. It is based on the idea that we can first learn or provide a distributed representation for each input feature (e.g. for each position in the genome where variations are observed), and then learn (with another neural network called the parameter prediction network) how to map a feature's distributed representation to the vector of parameters specific to that feature in the classifier neural network (the weights which link the value of the feature to each of the hidden units). We show experimentally on a population stratification task of interest to medical studies that the proposed approach can significantly reduce both the number of parameters and the error rate of the classifier.
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-rfou'
Amjad Almahairi
Christof Angermüller
Frédéric Bastien
Justin S. Bayer
A. Belikov
A. Belopolsky
Josh Bleecher Snyder
Paul F. Christiano
Marc-Alexandre Côté
Myriam Côté
Julien Demouth
Sander Dieleman
Mélanie Ducoffe
Ziye Fan
Mathieu Germain
Ian J. Goodfellow
Matthew Graham
Balázs Hidasi
Arjun Jain
Kai Jia
Mikhail V. Korobov
Vivek Kulkarni
Pascal Lamblin
Eric Larsen
S. Lee
Simon-mark Lefrancois
J. Livezey
Cory R. Lorenz
Jeremiah L. Lowin
Qianli M. Ma
R. McGibbon
Mehdi Mirza
Alberto Orlandi
Colin Raffel
Daniel Renshaw
Matthew David Rocklin
Markus Roth
Peter Sadowski
John Salvatier
Jan Schlüter
John D. Schulman
Gabriel Schwartz
Iulian V. Serban
Samira Shabanian
Sigurd Spieckermann
S. Subramanyam
Gijs van Tulder
Sebastian Urban
Dustin J. Webb
M. Willson
Lijun Xue
Theano is a Python library that lets users define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008, multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-rfou'
Amjad Almahairi
Christof Angermüller
Frédéric Bastien
Justin S. Bayer
A. Belikov
A. Belopolsky
J. Bergstra
Josh Bleecher Snyder
Paul F. Christiano
Marc-Alexandre Côté
Myriam Côté
Julien Demouth
Sander Dieleman
Mélanie Ducoffe
Ziye Fan
Mathieu Germain
Ian J. Goodfellow
Matthew Graham
Balázs Hidasi
Arjun Jain
Sébastien Jean
Kai Jia
Mikhail V. Korobov
Vivek Kulkarni
Pascal Lamblin
Eric P. Larsen
S. Lee
Simon-mark Lefrancois
J. Livezey
Cory R. Lorenz
Jeremiah L. Lowin
Qianli M. Ma
R. McGibbon
Mehdi Mirza
Alberto Orlandi
Colin Raffel
Daniel Renshaw
Matthew David Rocklin
Markus Roth
Peter Sadowski
John Salvatier
Jan Schlüter
John D. Schulman
Gabriel Schwartz
Iulian V. Serban
Samira Shabanian
Sigurd Spieckermann
S. Subramanyam
Gijs van Tulder
Joseph P. Turian
Sebastian Urban
Dustin J. Webb
M. Willson
Lijun Xue
Theano is a Python library that lets users define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano has been actively and continuously developed since 2008, multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.