Publications

Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top
Eduard Gorbunov
Samuel Horváth
Peter Richtárik
Byzantine-robustness has been gaining a lot of attention due to the growth of the interest in collaborative and federated learning. However, many fruitful directions, such as the usage of variance reduction for achieving robustness and communication compression for reducing communication costs, remain weakly explored in the field. This work addresses this gap and proposes Byz-VR-MARINA - a new Byzantine-tolerant method with variance reduction and compression. A key message of our paper is that variance reduction is key to fighting Byzantine workers more effectively. At the same time, communication compression is a bonus that makes the process more communication efficient. We derive theoretical convergence guarantees for Byz-VR-MARINA outperforming previous state-of-the-art for general non-convex and Polyak-Lojasiewicz loss functions. Unlike the concurrent Byzantine-robust methods with variance reduction and/or compression, our complexity results are tight and do not rely on restrictive assumptions such as boundedness of the gradients or limited compression. Moreover, we provide the first analysis of a Byzantine-tolerant method supporting non-uniform sampling of stochastic gradients. Numerical experiments corroborate our theoretical findings.
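As a generic illustration of what Byzantine-robustness means for gradient aggregation, the sketch below compares a plain coordinate-wise mean with a coordinate-wise median when one worker sends an arbitrary vector. This is not Byz-VR-MARINA and is not taken from the paper; the function names and the toy gradient values are assumptions chosen for the example.

```python
from statistics import median


def mean_aggregate(gradients):
    """Coordinate-wise mean of worker gradients (not robust to outliers)."""
    n = len(gradients)
    return [sum(coords) / n for coords in zip(*gradients)]


def median_aggregate(gradients):
    """Coordinate-wise median of worker gradients (a simple robust aggregator)."""
    return [median(coords) for coords in zip(*gradients)]


# Hypothetical toy data: four honest workers and one Byzantine worker.
honest = [[1.0, -2.0], [1.2, -1.8], [0.8, -2.2], [1.0, -2.0]]
byzantine = [[1000.0, 1000.0]]
workers = honest + byzantine

mean_result = mean_aggregate(workers)
median_result = median_aggregate(workers)

print("mean:  ", mean_result)    # pulled far away by the single bad worker
print("median:", median_result)  # stays close to the honest gradients
```

With a single corrupted worker out of five, the mean is dominated by the outlier while the median stays near the honest values, which is the failure mode robust methods are designed to avoid.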
Versatile Energy-Based Models for High Energy Physics
Taoli Cheng
Video Killed the HD-Map: Predicting Multi-Agent Behavior Directly From Aerial Images
Yunpeng Liu
Vasileios Lioutas
Jonathan Wilder Lavington
Matthew Niedoba
Justice Sefas
Setareh Dabiri
Dylan Green
Xiaoxuan Liang
Berend Zwartsenberg
Adam Ścibior
The development of algorithms that learn multi-agent behavioral models using human demonstrations has led to increasingly realistic simulations in the field of autonomous driving. In general, such models learn to jointly predict trajectories for all controlled agents by exploiting road context information such as drivable lanes obtained from manually annotated high-definition (HD) maps. Recent studies show that these models can greatly benefit from increasing the amount of human data available for training. However, the manual annotation of HD maps which is necessary for every new location puts a bottleneck on efficiently scaling up human traffic datasets. We propose an aerial image-based map (AIM) representation that requires minimal annotation and provides rich road context information for traffic agents like pedestrians and vehicles. We evaluate multi-agent trajectory prediction using the AIM by incorporating it into a differentiable driving simulator as an image-texture-based differentiable rendering module. Our results demonstrate competitive multi-agent trajectory prediction performance especially for pedestrians in the scene when using our AIM representation as compared to models trained with rasterized HD maps.
When Do Graph Neural Networks Help with Node Classification: Investigating the Homophily Principle on Node Distinguishability
Sitao Luan
Chenqing Hua
Minkai Xu
Qincheng Lu
Jiaqi Zhu
Xiao-Wen Chang
Jie Fu
Jure Leskovec
Homophily principle, i.e., nodes with the same labels are more likely to be connected, was believed to be the main reason for the performance superiority of Graph Neural Networks (GNNs) over Neural Networks (NNs) on Node Classification (NC) tasks. Recently, people have developed theoretical results arguing that, even though the homophily principle is broken, the advantage of GNNs can still hold as long as nodes from the same class share similar neighborhood patterns [29], which questions the validity of homophily. However, this argument only considers intra-class Node Distinguishability (ND) and ignores inter-class ND, which is insufficient to study the effect of homophily. In this paper, we first demonstrate the aforementioned insufficiency with examples and argue that an ideal situation for ND is to have smaller intra-class ND than inter-class ND. To formulate this idea and have a better understanding of homophily, we propose Contextual Stochastic Block Model for Homophily (CSBM-H) and define two metrics, Probabilistic Bayes Error (PBE) and Expected Negative KL-divergence (ENKL), to quantify ND, through which we can also find how intra- and inter-class ND influence ND together. We visualize the results and give detailed analysis. Through experiments, we verified that the superiority of GNNs is
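The homophily principle described in the abstract above (same-label nodes are more likely to be connected) is often summarized by the edge homophily ratio, the fraction of edges whose endpoints share a label. The sketch below computes that ratio on a toy graph. It is a commonly used baseline measure, not the PBE or ENKL metrics proposed in the paper, and the function name and example graphs are assumptions.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges that connect two nodes with the same label."""
    if not edges:
        raise ValueError("edge list must not be empty")
    same_label = sum(1 for u, v in edges if labels[u] == labels[v])
    return same_label / len(edges)


# Hypothetical toy graph: four nodes, two classes.
labels = {0: "a", 1: "a", 2: "b", 3: "b"}

# Mostly intra-class edges: high homophily.
homophilic_edges = [(0, 1), (2, 3), (0, 2)]

# Only inter-class edges: homophily of zero.
heterophilic_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]

print(edge_homophily(homophilic_edges, labels))
print(edge_homophily(heterophilic_edges, labels))
```

In the second graph every node of class "a" has only class "b" neighbors, so the ratio is zero even though nodes of the same class share an identical neighborhood pattern. This is the kind of case the abstract refers to when it says the advantage of GNNs can hold without homophily.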
Willingness to Engage in Shared Decision Making: Impact of an Educational Intervention for Resident Physicians (SDM-FM)
Roland M. Grad
A. Sandhu
Michael Ferrante
Vinita D'souza
Lily Puterman-Salzman
Gabrielle Stevens
G. Elwyn
Workflow Discovery from Dialogues in the Low Data Regime
Amine El hattami
Stefania Raimondo
Issam Hadj Laradji
David Vazquez
Pau Rodriguez
Text-based dialogues are now widely used to solve real-world problems. In cases where solution strategies are already known, they can sometimes be codified into workflows and used to guide humans or artificial agents through the task of helping clients. We introduce a new problem formulation that we call Workflow Discovery (WD) in which we are interested in the situation where a formal workflow may not yet exist. Still, we wish to discover the set of actions that have been taken to resolve a particular problem. We also examine a sequence-to-sequence (Seq2Seq) approach for this novel task. We present experiments where we extract workflows from dialogues in the Action-Based Conversations Dataset (ABCD). Since the ABCD dialogues follow known workflows to guide agents, we can evaluate our ability to extract such workflows using ground truth sequences of actions. We propose and evaluate an approach that conditions models on the set of possible actions, and we show that using this strategy, we can improve WD performance. Our conditioning approach also improves zero-shot and few-shot WD performance when transferring learned models to unseen domains within and across datasets. Further, on ABCD a modified variant of our Seq2Seq method achieves state-of-the-art performance on related but different problems of Action State Tracking (AST) and Cascading Dialogue Success (CDS) across many evaluation metrics.
"Your child needs surgery": A survey-based evaluation of simulated expert consent conversations by key stakeholders.
Zoe Atsaidis
Stephan Robitaille
Elena Guadagno
Jeffrey Wiseman
Sherif Emil
BARVINN: Arbitrary Precision DNN Accelerator Controlled by a RISC-V CPU
Mohammadhossein Askarihemmat
Sean Wagner
O. Bilaniuk
Yassine Hariri
Yvon Savaria
J. David
We present a DNN accelerator that allows inference at arbitrary precision with dedicated processing elements that are configurable at the bit level. Our DNN accelerator has 8 Processing Elements controlled by a RISC-V controller with a combined 8.2 TMACs of computational power when implemented with the recent Alveo U250 FPGA platform. We develop a code generator tool that ingests CNN models in ONNX format and generates an executable command stream for the RISC-V controller. We demonstrate the scalable throughput of our accelerator by running different DNN kernels and models when different quantization levels are selected. Compared to other low precision accelerators, our accelerator provides run time programmability without hardware reconfiguration and can accelerate DNNs with multiple quantization levels, regardless of the target FPGA size. BARVINN is an open source project and it is available at https://github.com/hossein1387/BARVINN.
Simplicity and learning to distinguish arguments from modifiers
Leon Bergen
E. Gibson
How programmers find online learning resources
Deeksha M. Arya
Martin P. Robillard
FaithDial: A Faithful Benchmark for Information-Seeking Dialogue
Nouha Dziri
Ehsan Kamalloo
Sivan Milton
Osmar Zaiane
Mo Yu
Edoardo Ponti
The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources. However, dialogue systems often produce unsupported utterances, a phenomenon known as hallucination. To mitigate this behavior, we adopt a data-centric solution and create FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark. We observe that FaithDial is more faithful than WoW while also maintaining engaging conversations. We show that FaithDial can serve as training signal for: i) a hallucination critic, which discriminates whether an utterance is faithful or not, and boosts the performance by 12.8 F1 score on the BEGIN benchmark compared to existing datasets for dialogue coherence; ii) high-quality dialogue generation. We benchmark a series of state-of-the-art models and propose an auxiliary contrastive objective that achieves the highest level of faithfulness and abstractiveness based on several automated metrics. Further, we find that the benefits of FaithDial generalize to zero-shot transfer on other datasets, such as CMU-Dog and TopicalChat. Finally, human evaluation reveals that responses generated by models trained on FaithDial are perceived as more interpretable, cooperative, and engaging.
Post-hoc Interpretability for Neural NLP: A Survey
Andreas Madsen