Publications

Retrieval-Enhanced Machine Learning

Hamed Zamani

Mostafa Dehghani

Donald Metzler

Michael Bendersky

Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability. We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.

2022-07-06

Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (published)

doi.org

arxiv.org

Offline Retrieval Evaluation Without Evaluation Metrics

Fernando Diaz

Andres Ferraro

Offline evaluation of information retrieval and recommendation has traditionally focused on distilling the quality of a ranking into a scalar metric such as average precision or normalized discounted cumulative gain. We can use this metric to compare the performance of multiple systems for the same request. Although evaluation metrics provide a convenient summary of system performance, they also collapse subtle differences across users into a single number and can carry assumptions about user behavior and utility not supported across retrieval scenarios. We propose recall-paired preference (RPP), a metric-free evaluation method based on directly computing a preference between ranked lists. RPP simulates multiple user subpopulations per query and compares systems across these pseudo-populations. Our results across multiple search and recommendation tasks demonstrate that RPP substantially improves discriminative power while correlating well with existing metrics and being equally robust to incomplete data.

2022-07-05

Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (published)

doi.org

arxiv.org

The distribution, ecology and predicted habitat use of the Critically Endangered angelshark (Squatina squatina) in coastal waters of Wales and the central Irish Sea

Joanna Barker

Jake Davies

Monika Goralczyk

Surshti Patel

John O'Connor

Jim Evans

Jackson Wesley Evans

Rowland Sharp

Matthew Gollock

Fenella R. Wood

Frank N. Wood

James Rosindell

Charlie Bartlett

Brett J. Garner

Dafydd Jones

D. J. Jones

Declan Quigley

Ben Wray

Billy Wray

The angelshark (Squatina squatina) has the northernmost range of any angel shark species, but there is limited information on its distribution, habitat use and ecology at higher latitudes. To address this, Angel Shark Project: Wales gathered 2231 S. squatina records and 142 anecdotal resources from fishers, coastal communities and archives. These spanned the coastal waters of Wales and the central Irish Sea and were dated from 1812 to 2020, with 97.62% of records within 11.1 km (6 nm) of the coast. Commercial, recreational and charter boat fishers provided the majority of S. squatina records (97.18%), with significantly more sightings from three decades (1970s, 1980s and 1990s) and in the months of September, June, August and July (in descending order). The coastal area between Bardsey Island and Strumble Head had the most S. squatina records (n = 1279), with notable concentrations also found in Carmarthen Bay, Conwy Bay and the Outer Severn Estuary. Species distribution models (SDM) identified four environmental variables that had significant influence on S. squatina distribution: depth, chlorophyll‐a concentration, sea surface temperature (SST) and salinity, and these varied between the quarters (Q) of the year. SDM model outputs predicted a larger congruous area of suitable habitat in Q3 (3176 km²) compared to Q2 (2051 km²), with suitability along the three glacial moraines (Sarn Badrig, Sarn‐y‐Bwch and Sarn Cynfelyn) strongly presented. Comparison of modelled environmental variables at the location of S. squatina records for each Q identified reductions in depth and salinity, and increases in chlorophyll‐a and SST when comparing Q2 or Q3 with Q1 or Q4. This shift may suggest S. squatina are making seasonal movements to shallow coastal waters in Q2 and Q3. This is supported by 23 anecdotal resources and may be driven by reproductive behaviour, as there were 85 records of S. squatina individuals ≤60 cm in the dataset, inferred as recently born or juvenile life‐history stages. The results have helped fill significant evidence gaps identified in the Wales Angelshark Action Plan and immediate next research steps are suggested.

2022-07-05

Journal of Fish Biology (published)

doi.org

From Precision Medicine to Precision Convergence for Multilevel Resilience—The Aging Brain and Its Social Isolation

Laurette Dubé

Patricia P. Silveira

Daiva E. Nielsen

Spencer Moore

Catherine Paquet

J. Miguel Cisneros-Franco

Gina Kemp

Bärbel Knauper

Yu Ma

Mehmood Khan

Gillian Bartlett-Esquilant

Alan C. Evans

Lesley K. Fellows

Jorge L. Armony

R. Nathan Spreng

Jian-Yun Nie

Shawn T. Brown

Georg Northoff

Danilo Bzdok

2022-07-04

Frontiers in Public Health (published)

doi.org

Incentivized Security-Aware Computation Offloading for Large-Scale Internet of Things Applications

Talal Halabi

Adel Abusitta

Glaucio H.S. Carvalho

Benjamin C. M. Fung

2022-07-04

2022 7th International Conference on Smart and Sustainable Technologies (SpliTech) (published)

doi.org

Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge

Ian Porada

Alessandro Sordoni

Jackie CK Cheung

2022-06-30

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (published)

doi.org

arxiv.org

Exploring the roles of artificial intelligence in surgical education: A scoping review

Elif Bilgic

Andrew Gorgy

Alison Yang

Michelle Cwintal

Hamed Ranjbar

Kalin Kahla

Dheeksha Reddy

Kexin Li

Helin Ozturk

Eric Zimmermann

Andrea Quaiattini

S. A. Rahimi

Dan Poenaru

Jason M. Harley

2022-06-30

The American Journal of Surgery (published)

doi.org

IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control

François-Xavier Devailly

Denis Larocque

Laurent Charlin

Scaling adaptive traffic signal control involves dealing with combinatorial state and action spaces. Multi-agent reinforcement learning attempts to address this challenge by distributing control to specialized agents. However, specialization hinders generalization and transferability, and the computational graphs underlying neural-network architectures (dominating in the multi-agent setting) do not offer the flexibility to handle an arbitrary number of entities which changes both between road networks, and over time as vehicles traverse the network. We introduce Inductive Graph Reinforcement Learning (IG-RL) based on graph-convolutional networks which adapts to the structure of any road network, to learn detailed representations of traffic signal controllers and their surroundings. Our decentralized approach enables learning of a transferable-adaptive-traffic-signal-control policy. After being trained on an arbitrary set of road networks, our model can generalize to new road networks and traffic distributions, with no additional training and a constant number of parameters, enabling greater scalability compared to prior methods. Furthermore, our approach can exploit the granularity of available data by capturing the (dynamic) demand at both the lane level and the vehicle level. The proposed method is tested on both road networks and traffic settings never experienced during training. We compare IG-RL to multi-agent reinforcement learning and domain-specific baselines. In both synthetic road networks and in a larger experiment involving the control of the 3,971 traffic signals of Manhattan, we show that different instantiations of IG-RL outperform baselines.

2022-06-30

IEEE Transactions on Intelligent Transportation Systems (published)

doi.org

arxiv.org

Leishmania parasites exchange drug-resistance genes through extracellular vesicles

Noélie Douanne

George Dong

Atia Amin

Lorena Bernardo

Mathieu Blanchette

David Langlais

Martin Olivier

Christopher Fernandez-Prada

2022-06-30

Cell Reports (published)

doi.org

Naming Autism in the Right Context

Andres Roman-Urrestarazu

Guillaume Dumas

Varun Warrier

2022-06-30

JAMA Pediatrics (published)

doi.org

R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS

Kyle Kastner

Aaron Courville

2022-06-29

ArXiv (preprint)

doi.org

arxiv.org

A guided multiverse study of neuroimaging analyses

Jessica Dafflon

Pedro F. da Costa

František Váša

Ricardo Pio Monti

Danilo Bzdok

Peter J. Hellyer

Federico Turkheimer

Jonathan Smallwood

Emily Jones

Robert Leech

For most neuroimaging questions the range of possible analytic choices makes it unclear how to evaluate conclusions from any single analytic method. One possible way to address this issue is to evaluate all possible analyses using a multiverse approach; however, this can be computationally challenging and sequential analyses on the same data can compromise predictive power. Here, we establish how active learning on a low-dimensional space capturing the inter-relationships between pipelines can efficiently approximate the full spectrum of analyses. This approach balances the benefits of a multiverse analysis without incurring the cost on computational and predictive power. We illustrate this approach with two functional MRI datasets (predicting brain age and autism diagnosis) demonstrating how a multiverse of analyses can be efficiently navigated and mapped out using active learning. Furthermore, our presented approach not only identifies the subset of analysis techniques that are best able to predict age or classify individuals with autism spectrum disorder and healthy controls, but it also allows the relationships between analyses to be quantified.

2022-06-28

Nature Communications (published)

doi.org
