Publications

Leveraging Structure Between Environments: Phylogenetic Regularization Incentivizes Disentangled Representations
Elliot Layne
Jason Hartford
Sébastien Lachapelle
Recently, learning invariant predictors across varying environments has been shown to improve the generalization of supervised learning methods. This line of investigation holds great potential for application to biological problem settings, where data is often naturally heterogeneous. Biological samples often originate from different distributions, or environments. However, in biological contexts, the standard "invariant prediction" setting may not completely fit: the optimal predictor may in fact vary across biological environments. There also exists strong domain knowledge about the relationships between environments, such as the evolutionary history of a set of species, or the differentiation process of cell types. Most work on generic invariant predictors has not assumed the existence of structured relationships between environments. However, this prior knowledge about environments themselves has already been shown to improve prediction through a particular form of regularization applied when learning a set of predictors. In this work, we empirically evaluate whether a regularization strategy that exploits environment-based prior information can be used to learn representations that better disentangle causal factors that generate observed data. We find evidence that these methods do in fact improve the disentanglement of latent embeddings. We also show a setting where these methods can leverage phylogenetic information to estimate the number of latent causal features.
FIXME: synchronize with database! An empirical study of data access self-admitted technical debt
Biruk Asmare Muse
Csaba Nagy
Anthony Cleve
Giuliano Antoniol
Joint Multisided Exposure Fairness for Recommendation
Haolun Wu
Bhaskar Mitra
Chen Ma
Prior research on exposure fairness in the context of recommender systems has focused mostly on disparities in the exposure of individual or groups of items to individual users of the system. The problem of how individual or groups of items may be systemically under- or over-exposed to groups of users, or even all users, has received relatively less attention. However, such systemic disparities in information exposure can result in observable social harms, such as withholding economic opportunities from historically marginalized groups (allocative harm) or amplifying gendered and racialized stereotypes (representational harm). Previously, Diaz et al. developed the expected exposure metric, which incorporates existing user browsing models developed for information retrieval, to study fairness of content exposure to individual users. We extend their proposed framework to formalize a family of exposure fairness metrics that model the problem jointly from the perspective of both the consumers and producers. Specifically, we consider group attributes for both types of stakeholders to identify and mitigate fairness concerns that go beyond individual users and items towards more systemic biases in recommendation. Furthermore, we study and discuss the relationships between the different exposure fairness dimensions proposed in this paper, as well as demonstrate how stochastic ranking policies can be optimized towards said fairness goals.
On Natural Language User Profiles for Transparent and Scrutable Recommendation
Filip Radlinski
Krisztian Balog
Lucas Dixon
Ben Wedin
Natural interaction with recommendation and personalized search systems has received tremendous attention in recent years. We focus on the challenge of supporting people's understanding and control of these systems and explore a fundamentally new way of thinking about representation of knowledge in recommendation and personalization systems. Specifically, we argue that it may be both desirable and possible for algorithms that use natural language representations of users' preferences to be developed. We make the case that this could provide significantly greater transparency, as well as affordances for practical actionable interrogation of, and control over, recommendations. Moreover, we argue that such an approach, if successfully applied, may enable a major step towards systems that rely less on noisy implicit observations while increasing portability of knowledge of one's interests.
Offline Retrieval Evaluation Without Evaluation Metrics
Andres Ferraro
Offline evaluation of information retrieval and recommendation has traditionally focused on distilling the quality of a ranking into a scalar metric such as average precision or normalized discounted cumulative gain. We can use this metric to compare the performance of multiple systems for the same request. Although evaluation metrics provide a convenient summary of system performance, they also collapse subtle differences across users into a single number and can carry assumptions about user behavior and utility not supported across retrieval scenarios. We propose recall-paired preference (RPP), a metric-free evaluation method based on directly computing a preference between ranked lists. RPP simulates multiple user subpopulations per query and compares systems across these pseudo-populations. Our results across multiple search and recommendation tasks demonstrate that RPP substantially improves discriminative power while correlating well with existing metrics and being equally robust to incomplete data.
Retrieval-Enhanced Machine Learning
Hamed Zamani
Mostafa Dehghani
Donald Metzler
Michael Bendersky
Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability. We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
From Precision Medicine to Precision Convergence for Multilevel Resilience—The Aging Brain and Its Social Isolation
Laurette Dubé
Patricia P. Silveira
Daiva E. Nielsen
Spencer Moore
Catherine Paquet
J. Miguel Cisneros-Franco
Gina Kemp
Bärbel Knauper
Yu Ma
Mehmood Khan
Gillian Bartlett-Esquilant
Alan C. Evans
Lesley K. Fellows
Jorge L. Armony
R. Nathan Spreng
Jian-Yun Nie
Shawn T. Brown
Georg Northoff
Citation: Dubé L, Silveira PP, Nielsen DE, Moore S, Paquet C, Cisneros-Franco JM, Kemp G, Knauper B, Ma Y, Khan M, Bartlett-Esquilant G, Evans AC, Fellows LK, Armony JL, Spreng RN, Nie J-Y, Brown ST, Northoff G and Bzdok D (2022) From Precision Medicine to Precision Convergence for Multilevel Resilience—The Aging Brain and Its Social Isolation. Front. Public Health 10:720117. doi: 10.3389/fpubh.2022.720117
Incentivized Security-Aware Computation Offloading for Large-Scale Internet of Things Applications
Talal Halabi
Adel Abusitta
Glaucio H.S. Carvalho
Adaptation, Comparison and Practical Implementation of Fairness Schemes in Kidney Exchange Programs
In Kidney Exchange Programs (KEPs), each participating patient is registered together with an incompatible donor. Donors without an incompatible patient can also register. Then, KEPs typically maximize overall patient benefit through donor exchanges. This aggregation of benefits calls into question potential individual patient disparities in terms of access to transplantation in KEPs. Considering solely this utilitarian objective may become an issue in the case where multiple exchange plans are optimal or near-optimal. In fact, current KEP policies are all-or-nothing, meaning that only one exchange plan is determined. Each patient is either selected or not as part of that unique solution. In this work, we seek instead to find a policy that contemplates the probability of each patient being in a solution. To guide the determination of our policy, we adapt popular fairness schemes to KEPs to balance the usual approach of maximizing the utilitarian objective. Different combinations of fairness and utilitarian objectives are modelled as conic programs with an exponential number of variables. We propose a column generation approach to solve them effectively in practice. Finally, we make an extensive comparison of the different schemes in terms of the balance of utility and fairness score, and validate the scalability of our methodology for benchmark instances from the literature.
Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge
Exploring the roles of artificial intelligence in surgical education: A scoping review
Elif Bilgic
Andrew Gorgy
Alison Yang
Michelle Cwintal
Hamed Ranjbar
Kalin Kahla
Dheeksha Reddy
Kexin Li
Helin Ozturk
Eric Zimmermann
Andrea Quaiattini
Jason M. Harley
IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control
François-Xavier Devailly
Denis Larocque
Scaling adaptive traffic signal control involves dealing with combinatorial state and action spaces. Multi-agent reinforcement learning attempts to address this challenge by distributing control to specialized agents. However, specialization hinders generalization and transferability, and the computational graphs underlying neural-network architectures, which dominate in the multi-agent setting, do not offer the flexibility to handle an arbitrary number of entities, which changes both between road networks and over time as vehicles traverse the network. We introduce Inductive Graph Reinforcement Learning (IG-RL), based on graph-convolutional networks, which adapts to the structure of any road network to learn detailed representations of traffic signal controllers and their surroundings. Our decentralized approach enables learning of a transferable adaptive traffic signal control policy. After being trained on an arbitrary set of road networks, our model can generalize to new road networks and traffic distributions, with no additional training and a constant number of parameters, enabling greater scalability compared to prior methods. Furthermore, our approach can exploit the granularity of available data by capturing the (dynamic) demand at both the lane level and the vehicle level. The proposed method is tested on both road networks and traffic settings never experienced during training. We compare IG-RL to multi-agent reinforcement learning and domain-specific baselines. In both synthetic road networks and in a larger experiment involving the control of the 3,971 traffic signals of Manhattan, we show that different instantiations of IG-RL outperform baselines.