Publications

Retrieval-Enhanced Machine Learning
Hamed Zamani
Mostafa Dehghani
Donald Metzler
Michael Bendersky
Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability. We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
Offline Retrieval Evaluation Without Evaluation Metrics
Offline evaluation of information retrieval and recommendation has traditionally focused on distilling the quality of a ranking into a scalar metric such as average precision or normalized discounted cumulative gain. We can use this metric to compare the performance of multiple systems for the same request. Although evaluation metrics provide a convenient summary of system performance, they also collapse subtle differences across users into a single number and can carry assumptions about user behavior and utility not supported across retrieval scenarios. We propose recall-paired preference (RPP), a metric-free evaluation method based on directly computing a preference between ranked lists. RPP simulates multiple user subpopulations per query and compares systems across these pseudo-populations. Our results across multiple search and recommendation tasks demonstrate that RPP substantially improves discriminative power while correlating well with existing metrics and being equally robust to incomplete data.
The distribution, ecology and predicted habitat use of the Critically Endangered angelshark (Squatina squatina) in coastal waters of Wales and the central Irish Sea
Joanna Barker
Jake Davies
Monika Goralczyk
Surshti Patel
John O'Connor
Jim Evans
Jackson Wesley Evans
Rowland Sharp
Matthew Gollock
Fenella R. Wood
Frank N. Wood
James Rosindell
Charlie Bartlett
Brett J. Garner
Dafydd Jones
D. J. Jones
Declan Quigley
Ben Wray
Billy Wray
The angelshark (Squatina squatina) has the northernmost range of any angel shark species, but there is limited information on its distribution, habitat use and ecology at higher latitudes. To address this, Angel Shark Project: Wales gathered 2231 S. squatina records and 142 anecdotal resources from fishers, coastal communities and archives. These spanned the coastal waters of Wales and the central Irish Sea and were dated from 1812 to 2020, with 97.62% of records within 11.1 km (6 nm) of the coast. Commercial, recreational and charter boat fishers provided the majority of S. squatina records (97.18%), with significantly more sightings from three decades (1970s, 1980s and 1990s) and in the months of September, June, August and July (in descending order). The coastal area between Bardsey Island and Strumble Head had the most S. squatina records (n = 1279), with notable concentrations also found in Carmarthen Bay, Conwy Bay and the Outer Severn Estuary. Species distribution models (SDM) identified four environmental variables that had significant influence on S. squatina distribution: depth, chlorophyll‐a concentration, sea surface temperature (SST) and salinity, and these varied between the quarters (Q) of the year. SDM model outputs predicted a larger congruous area of suitable habitat in Q3 (3176 km²) compared to Q2 (2051 km²), with suitability along the three glacial moraines (Sarn Badrig, Sarn‐y‐Bwch and Sarn Cynfelyn) strongly presented. Comparison of modelled environmental variables at the location of S. squatina records for each Q identified reductions in depth and salinity, and increases in chlorophyll‐a and SST when comparing Q2 or Q3 with Q1 or Q4. This shift may suggest S. squatina are making seasonal movements to shallow coastal waters in Q2 and Q3. This is supported by 23 anecdotal resources and may be driven by reproductive behaviour, as there were 85 records of S. squatina individuals ≤60 cm in the dataset, inferred as recently born or juvenile life‐history stages. The results have helped fill significant evidence gaps identified in the Wales Angelshark Action Plan and immediate next research steps are suggested.
From Precision Medicine to Precision Convergence for Multilevel Resilience—The Aging Brain and Its Social Isolation
Laurette Dubé
Patricia P. Silveira
Daiva E. Nielsen
Spencer Moore
Catherine Paquet
J. Miguel Cisneros-Franco
Gina Kemp
Bärbel Knauper
Yu Ma
Mehmood Khan
Gillian Bartlett-Esquilant
Alan C. Evans
Lesley K. Fellows
Jorge L. Armony
R. Nathan Spreng
Jian-Yun Nie
Shawn T. Brown
Georg Northoff
Incentivized Security-Aware Computation Offloading for Large-Scale Internet of Things Applications
Talal Halabi
Adel Abusitta
Glaucio H.S. Carvalho
Benjamin C. M. Fung
Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge
Jackie CK Cheung
Exploring the roles of artificial intelligence in surgical education: A scoping review
Elif Bilgic
Andrew Gorgy
Alison Yang
Michelle Cwintal
Hamed Ranjbar
Kalin Kahla
Dheeksha Reddy
Kexin Li
Helin Ozturk
Andrea Quaiattini
S. A. Rahimi
Jason M. Harley
IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control
François-Xavier Devailly
Denis Larocque
Scaling adaptive traffic signal control involves dealing with combinatorial state and action spaces. Multi-agent reinforcement learning attempts to address this challenge by distributing control to specialized agents. However, specialization hinders generalization and transferability, and the computational graphs underlying neural-network architectures, dominating in the multi-agent setting, do not offer the flexibility to handle an arbitrary number of entities which changes both between road networks, and over time as vehicles traverse the network. We introduce Inductive Graph Reinforcement Learning (IG-RL) based on graph-convolutional networks which adapts to the structure of any road network, to learn detailed representations of traffic signal controllers and their surroundings. Our decentralized approach enables learning of a transferable-adaptive-traffic-signal-control policy. After being trained on an arbitrary set of road networks, our model can generalize to new road networks and traffic distributions, with no additional training and a constant number of parameters, enabling greater scalability compared to prior methods. Furthermore, our approach can exploit the granularity of available data by capturing the (dynamic) demand at both the lane level and the vehicle level. The proposed method is tested on both road networks and traffic settings never experienced during training. We compare IG-RL to multi-agent reinforcement learning and domain-specific baselines. In both synthetic road networks and in a larger experiment involving the control of the 3,971 traffic signals of Manhattan, we show that different instantiations of IG-RL outperform baselines.
Leishmania parasites exchange drug-resistance genes through extracellular vesicles
Noélie Douanne
George Dong
Atia Amin
Lorena Bernardo
David Langlais
Martin Olivier
Christopher Fernandez-Prada
Naming Autism in the Right Context
Andres Roman-Urrestarazu
Varun Warrier
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
A guided multiverse study of neuroimaging analyses
Jessica Dafflon
Pedro F. da Costa
František Váša
Ricardo Pio Monti
Peter J. Hellyer
Federico Turkheimer
Jonathan Smallwood
Emily Jones
Robert Leech
For most neuroimaging questions the range of possible analytic choices makes it unclear how to evaluate conclusions from any single analytic method. One possible way to address this issue is to evaluate all possible analyses using a multiverse approach; however, this can be computationally challenging and sequential analyses on the same data can compromise predictive power. Here, we establish how active learning on a low-dimensional space capturing the inter-relationships between pipelines can efficiently approximate the full spectrum of analyses. This approach balances the benefits of a multiverse analysis without incurring the cost on computational and predictive power. We illustrate this approach with two functional MRI datasets (predicting brain age and autism diagnosis) demonstrating how a multiverse of analyses can be efficiently navigated and mapped out using active learning. Furthermore, our presented approach not only identifies the subset of analysis techniques that are best able to predict age or classify individuals with autism spectrum disorder and healthy controls, but it also allows the relationships between analyses to be quantified.