Publications

Maximal Initial Learning Rates in Deep ReLU Networks
Gaurav Iyer
Boris Hanin
Training a neural network requires choosing a suitable learning rate, which involves a trade-off between speed and effectiveness of convergence. While there has been considerable theoretical and empirical analysis of how large the learning rate can be, most prior work focuses only on late-stage training. In this work, we introduce the maximal initial learning rate
Neural FIM for learning Fisher Information Metrics from point cloud data
Oluwadamilola Fasina
Guillaume Huguet
Alexander Tong
Yanlei Zhang
Maximilian Nickel
Ian Adelstein
Smita Krishnaswamy
Although data diffusion embeddings are ubiquitous in unsupervised learning and have proven to be a viable technique for uncovering the underlying intrinsic geometry of data, diffusion embeddings are inherently limited due to their discrete nature. To this end, we propose neural FIM, a method for computing the Fisher information metric (FIM) from point cloud data - allowing for a continuous manifold model for the data. Neural FIM creates an extensible metric space from discrete point cloud data such that information from the metric can inform us of manifold characteristics such as volume and geodesics. We demonstrate Neural FIM's utility in selecting parameters for the PHATE visualization method as well as its ability to obtain information pertaining to local volume illuminating branching points and cluster centers embeddings of a toy dataset and two single-cell datasets of IPSC reprogramming and PBMCs (immune cells).
ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
Minghao Xu
Xinyu Yuan
Santiago Miret
Current protein language models (PLMs) learn protein representations mainly based on their sequences, thereby well capturing co-evolutionary information, but they are unable to explicitly acquire protein functions, which is the end goal of protein representation learning. Fortunately, for many proteins, their textual property descriptions are available, where their various functions are also described. Motivated by this fact, we first build the ProtDescribe dataset to augment protein sequences with text descriptions of their functions and other important properties. Based on this dataset, we propose the ProtST framework to enhance Protein Sequence pre-training and understanding by biomedical Texts. During pre-training, we design three types of tasks, i.e., unimodal mask prediction, multimodal representation alignment and multimodal mask prediction, to enhance a PLM with protein property information with different granularities and, at the same time, preserve the PLM's original representation power. On downstream tasks, ProtST enables both supervised learning and zero-shot prediction. We verify the superiority of ProtST-induced PLMs over previous ones on diverse representation learning benchmarks. Under the zero-shot setting, we show the effectiveness of ProtST on zero-shot protein classification, and ProtST also enables functional protein retrieval from a large-scale database without any function annotation.
Robust Perception through Equivariance
Chengzhi Mao
Lingyu Zhang
Abhishek Vaibhav Joshi
Junfeng Yang
Hao Wang
Carl Vondrick
R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
Daniel D. Johnson
Danny Tarlow
Christian Walder
Large language models show impressive results at predicting structured text such as code, but also commonly introduce errors and hallucinations in their output. When used to assist software developers, these models may make mistakes that users must go back and fix, or worse, introduce subtle bugs that users may miss entirely. We propose Randomized Utility-driven Synthesis of Uncertain REgions (R-U-SURE), an approach for building uncertainty-aware suggestions based on a decision-theoretic model of goal-conditioned utility, using random samples from a generative model as a proxy for the unobserved possible intents of the end user. Our technique combines minimum-Bayes-risk decoding, dual decomposition, and decision diagrams in order to efficiently produce structured uncertainty summaries, given only sample access to an arbitrary generative model of code and an optional AST parser. We demonstrate R-U-SURE on three developer-assistance tasks, and show that it can be applied to different user interaction patterns without retraining the model and leads to more accurate uncertainty estimates than token-probability baselines. We also release our implementation as an open-source library at https://github.com/google-research/r_u_sure.
Sampling-Based Accuracy Testing of Posterior Estimators for General Inference
Target-based Surrogates for Stochastic Optimization
Jonathan Wilder Lavington
Sharan Vaswani
Reza Babanezhad Harikandeh
Mark Schmidt
We consider minimizing functions for which it is expensive to compute the gradient. Such functions are prevalent in reinforcement learning, imitation learning and bilevel optimization. Our target optimization framework uses the (expensive) gradient computation to construct surrogate functions in a target space (e.g. the logits output by a linear model for classification) that can be minimized efficiently. This allows for multiple parameter updates to the model, amortizing the cost of gradient computation. In the full-batch setting, we prove that our surrogate is a global upper-bound on the loss, and can be (locally) minimized using a black-box optimization algorithm. We prove that the resulting majorization-minimization algorithm ensures convergence to a stationary point of the loss. Next, we instantiate our framework in the stochastic setting and propose the
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
Aleksandr Beznosikov
David Dobre
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints that arise in machine learning applications. In recent years, stochastic versions of FW have gained popularity, motivated by large datasets for which the computation of the full gradient is prohibitively expensive. In this paper, we present two new variants of the FW algorithms for stochastic finite-sum minimization. Our algorithms have the best convergence guarantees of existing stochastic FW approaches for both convex and non-convex objective functions. Our methods do not have the issue of permanently collecting large batches, which is common to many stochastic projection-free approaches. Moreover, our second approach does not require either large batches or full deterministic gradients, which is a typical weakness of many techniques for finite-sum problems. The faster theoretical rates of our approaches are confirmed experimentally.
Environmental Scan of Existing Digital Health Solutions for Older Adults Living with Neurocognitive Disorders (Mild and Major) and Their Informal Caregivers: Summary Report
Ambily Jose
Maxime Sasseville
Ellen Gorus
Anik Giguère
Anne Bourbonnais
Ronald Buyl
Marie-Pierre Gagnon
Digital health has added numerous promising solutions to enhance the health and wellness of people living with dementia and other cognitive problems and their informal caregivers. This work aims to summarize currently available digital health solutions and their related characteristics to develop a decision support tool for older adults living with mild or major neurocognitive disorders and their informal caregivers. We conducted an environmental scan to identify digital health solutions from a systematic review and targeted searches for grey literature covering the regions of Canada and Europe. Technological tools were scanned based on a preformatted extraction grid. We assessed their relevance based on selected attributes. We identified 100 available digital health solutions. The majority (56%) were not specific to dementia. Only 28% provided scientific evidence of their effectiveness. Remote patient care, movement tracking and cognitive exercises were the most common purposes of digital health solutions. Most solutions were presented as mobility aid tools, pill dispensers, apps, web, or a combination of these platforms. This knowledge will inform the development of a decision support tool to assist older adults and their informal caregivers in their search for adequate eHealth solutions according to their needs and preferences, based on trustable information.
An exploratory cross-sectional study of the effects of ongoing relationships with accompanying patients on cancer care experience, self-efficacy, and psychological distress
Marie-Pascale Pomey
Monica Iliescu Nelea
Louise Normandin
Cécile Vialaron
Karine Bouchard
Marie‐Andrée Côté
Maria Alejandra Rodriguez Duarte
Djahanchah Philip Ghadiri
Israël Fortin
Danielle Charpentier
Mélanie Lavoie-Tremblay
Nicolas Fernandez
Antoine Boivin
Michel Dorval
Mado Desforges
Isabelle Ganache
Lynda Bélanger
Zeev Rosberger
Michel Alain Danino
Jean-François Pelletier
Thi Trinh Thuc Vu
Michèle de Guise
SSS3D: Fast Neural Architecture Search For Efficient Three-Dimensional Semantic Segmentation
Olivier Therrien
Marihan Amein
Zhuoran Xiong
Brett Meyer
We present SSS3D, a fast multi-objective NAS framework designed to find computationally efficient 3D semantic scene segmentation networks. It uses RandLA-Net, an off-the-shelf point-based network, as a super-network to enable weight sharing and reduce search time by 99.67% for single-stage searches. SSS3D has a complex search space composed of sampling and architectural parameters that can form 2.88 * 10^17 possible networks. To further reduce search time, SSS3D splits the complete search space and introduces a two-stage search that finds optimal subnetworks in 54% of the time required by single-stage searches.
The Flag and the Cross: White Christian Nationalism and the Threat to American Democracy by Philip S. Gorski and Samuel L. Perry (review)