Publications

Evaluating Montréal’s harm reduction interventions for people who inject drugs: protocol for observational study and cost-effectiveness analysis
Dimitra Panagiotoglou
Michal Abrahamowicz
J Jaime Caro
Eric Latimer
Mathieu Maheu-Giroux
Erin C Strumpf
High-Throughput and Energy-Efficient VLSI Architecture for Ordered Reliability Bits GRAND
Syed Mohsin Abbas
Thibaud Tonnellier
Furkan Ercan
Marwan Jalaleddine
Ultrareliable low-latency communication (URLLC), a major 5G new-radio (NR) use case, is the key enabler for applications with strict reliability and latency requirements. These applications necessitate the use of short-length and high-rate channel codes. Guessing random additive noise decoding (GRAND) is a recently proposed maximum likelihood (ML) decoding technique for these short-length and high-rate codes. Rather than decoding the received vector, GRAND tries to infer the noise that corrupted the transmitted codeword during transmission through the communication channel. As a result, GRAND can decode any code, structured or unstructured. GRAND has hard-input as well as soft-input variants. Among these variants, ordered reliability bits GRAND (ORBGRAND) is a soft-input variant that outperforms hard-input GRAND and is suitable for parallel hardware implementation. This work reports the first hardware architecture for ORBGRAND, which achieves an average throughput of up to 42.5 Gb/s for a code length of 128 at a target frame error rate (FER) of 10⁻⁷. Furthermore, the proposed hardware can be used to decode any code as long as the length and rate constraints are met. In comparison to GRAND with ABandonment (GRANDAB), a hard-input variant of GRAND, the proposed architecture enhances decoding performance by at least 2 dB. When compared to the state-of-the-art fast dynamic successive cancellation flip decoder (Fast-DSCF) using a 5G polar code (PC) (128, 105), the proposed ORBGRAND VLSI implementation has
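As context for the abstract above, the core GRAND loop is simple to sketch: test putative noise sequences from most to least likely and stop at the first one that yields a codeword. The sketch below shows the hard-input GRANDAB variant in Python; the `is_codeword` membership test and the toy single-parity-check code are illustrative assumptions. ORBGRAND differs in that it orders patterns by a logistic weight computed from sorted bit reliabilities rather than by Hamming weight.

```python
from itertools import combinations

import numpy as np

def grandab(y, is_codeword, max_weight=3):
    """Hard-input GRAND with abandonment (GRANDAB); illustrative sketch.

    y           : received hard-decision bits (numpy array of 0/1)
    is_codeword : membership test for the code (e.g. parity checks)
    max_weight  : abandon after all error patterns up to this Hamming weight
    """
    n = len(y)
    # Query noise patterns from most to least likely on a binary symmetric
    # channel, i.e. in order of increasing Hamming weight.
    for weight in range(max_weight + 1):
        for flips in combinations(range(n), weight):
            candidate = y.copy()
            candidate[list(flips)] ^= 1  # strip the guessed noise
            if is_codeword(candidate):
                return candidate  # first hit is the ML estimate
    return None  # abandonment: report a decoding failure

# Toy usage with a length-4 single parity-check code (assumed for demo):
y = np.array([1, 0, 1, 1], dtype=int)
print(grandab(y, lambda c: c.sum() % 2 == 0))
```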
Rademacher Random Projections with Tensor Networks
Beheshteh T. Rakhshan
Random projections (RP) have recently emerged as popular techniques in the machine learning community for their ability to reduce the dimension of very high-dimensional tensors. Following the work in [30], we consider a tensorized random projection relying on Tensor Train (TT) decomposition, where each element of the core tensors is drawn from a Rademacher distribution. Our theoretical results reveal that the Gaussian low-rank tensor represented in compressed form in TT format in [30] can be replaced by a TT tensor with core elements drawn from a Rademacher distribution with the same embedding size. Experiments on synthetic data demonstrate that the tensorized Rademacher RP can outperform the tensorized Gaussian RP studied in [30]. In addition, we show both theoretically and experimentally that the tensorized RP in the Matrix Product Operator (MPO) format is not a Johnson-Lindenstrauss transform (JLT) and is therefore not a well-suited random projection map.
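To make the construction concrete, here is a minimal NumPy sketch of a tensorized Rademacher RP in TT format: each output coordinate is the inner product of the input tensor with an independent TT tensor whose cores have i.i.d. ±1 entries. The 1/sqrt(k) scaling is kept but the paper's rank-dependent normalization is omitted for brevity, so treat this as a schematic rather than the exact map analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rademacher(shape):
    # i.i.d. entries drawn uniformly from {-1, +1}
    return rng.integers(0, 2, size=shape) * 2 - 1

def tt_to_full(cores):
    """Contract TT cores of shapes (r_{j-1}, d_j, r_j) into a full tensor."""
    out = cores[0]                                       # (1, d_1, r_1)
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))  # chain the bond index
    return out[0, ..., 0]                                # drop boundary ranks

def tt_rademacher_rp(x, k, rank):
    """Project tensor x to R^k using k independent Rademacher TT tensors."""
    dims, ndim = x.shape, x.ndim
    ranks = [1] + [rank] * (ndim - 1) + [1]
    y = np.empty(k)
    for i in range(k):
        cores = [rademacher((ranks[j], dims[j], ranks[j + 1]))
                 for j in range(ndim)]
        y[i] = np.tensordot(tt_to_full(cores), x, axes=ndim)
    return y / np.sqrt(k)

x = rng.standard_normal((4, 4, 4))
print(tt_rademacher_rp(x, k=10, rank=2).shape)  # (10,)
```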
Generating GitHub Repository Descriptions: A Comparison of Manual and Automated Approaches
Jazlyn Hellman
Eunbee Jang
Christoph Treude
Chenzhun Huang
Given the vast number of repositories hosted on GitHub, project discovery and retrieval have become increasingly important for GitHub users. Repository descriptions serve as one of the first points of contact for users who are accessing a repository. However, repository owners often fail to provide a high-quality description; instead, they use vague terms, explain the repository's purpose poorly, or omit the description entirely. In this work, we examine the current practice of writing GitHub repository descriptions. Our investigation leads to the proposal of the LSP (Language, Software technology, and Purpose) template for formulating GitHub repository descriptions that are clear, concise, and informative. To understand the extent to which current automated techniques can support generating repository descriptions, we compare the performance of state-of-the-art text summarization methods on this task. Finally, our user study with GitHub users reveals that automated summarization can adequately be used for default description generation for GitHub repositories, while descriptions that follow the LSP template offer the most effective instrument for communicating with GitHub users.
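For illustration only, a description assembled from the template's three components might look like the snippet below; the exact sentence frame is a hypothetical reading of the LSP template, not one prescribed by the paper.

```python
def lsp_description(language, software_technology, purpose):
    """Compose a repository description from the three LSP components.

    The field order and connective phrasing here are assumptions made for
    illustration; the paper specifies the components, not this exact frame.
    """
    return f"A {language} {software_technology} for {purpose}."

# Hypothetical example:
print(lsp_description("Python", "command-line tool",
                      "generating summaries of GitHub repositories"))
# -> A Python command-line tool for generating summaries of GitHub repositories.
```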
Stringency of containment and closures on the growth of SARS-CoV-2 in Canada prior to accelerated vaccine roll-out
D. Vickers
S. Baral
Sharmistha Mishra
J. Kwong
M. Sundaram
Alan W. Katz
Andrew J. Calzavara
Mathieu Maheu-Giroux
Tyler Williamson
Real-M: Towards Speech Separation on Real Mixtures
Samuele Cornell
François Grondin
In recent years, deep learning based source separation has achieved impressive results. Most studies, however, still evaluate separation models on synthetic datasets, while the performance of state-of-the-art techniques on in-the-wild speech data remains an open question. This paper contributes to filling this gap in two ways. First, we release the REAL-M dataset, a crowd-sourced corpus of real-life mixtures. Second, we address the problem of performance evaluation of real-life mixtures, where the ground truth is not available. We bypass this issue by carefully designing a blind Scale-Invariant Signal-to-Noise Ratio (SI-SNR) neural estimator. Through a user study, we show that our estimator reliably evaluates the separation performance on real mixtures: the performance predictions of the SI-SNR estimator correlate well with human opinions. Moreover, when evaluating popular speech separation models, we observe that the performance trends predicted by our estimator on the REAL-M dataset closely follow the performance trends achieved on synthetic benchmarks.
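For reference, the (non-blind) SI-SNR that the paper's neural estimator learns to predict can be computed as below when a clean target is available; on real mixtures no such target exists, which is exactly why the paper trains a blind estimator.

```python
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    """Scale-Invariant Signal-to-Noise Ratio in dB (oracle version)."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to remove any scaling factor.
    s_target = (estimate @ target) / (target @ target + eps) * target
    e_noise = estimate - s_target
    return 10.0 * np.log10((s_target @ s_target + eps) / (e_noise @ e_noise + eps))

t = np.sin(np.linspace(0, 100, 16000))          # toy target signal
print(si_snr(t + 0.1 * np.random.randn(16000), t))
```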
An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models
Nicholas Meade
Elinor Poole-Dayan
Recent work has shown that pre-trained language models capture social biases from the large amounts of text they are trained on. This has attracted attention to developing techniques that mitigate such biases. In this work, we perform an empirical survey of five recently proposed bias mitigation techniques: Counterfactual Data Augmentation (CDA), Dropout, Iterative Nullspace Projection, Self-Debias, and SentenceDebias. We quantify the effectiveness of each technique using three intrinsic bias benchmarks while also measuring the impact of these techniques on a model's language modeling ability, as well as its performance on downstream NLU tasks. We experimentally find that: (1) Self-Debias is the strongest debiasing technique, obtaining improved scores on all bias benchmarks; (2) current debiasing techniques perform less consistently when mitigating non-gender biases; and (3) improvements on bias benchmarks such as StereoSet and CrowS-Pairs obtained with debiasing strategies are often accompanied by a decrease in language modeling ability, making it difficult to determine whether the bias mitigation was effective.
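As a flavor of the simplest of the five techniques, counterfactual data augmentation can be sketched as below: the training data is augmented with copies in which attribute words (here, a tiny hypothetical gender word list) are swapped. Real implementations use curated word lists and handle ambiguous cases such as possessive "her", which this sketch does not.

```python
# A tiny illustrative word list; actual CDA word lists are much larger.
GENDER_PAIRS = {"he": "she", "she": "he", "him": "her",
                "man": "woman", "woman": "man", "his": "her"}

def counterfactual(sentence):
    """Swap gendered attribute words to produce the counterfactual sentence."""
    return " ".join(GENDER_PAIRS.get(tok.lower(), tok)
                    for tok in sentence.split())

corpus = ["he is a doctor", "the woman defended his thesis"]
# CDA trains the model on the union of original and counterfactual text.
augmented = corpus + [counterfactual(s) for s in corpus]
print(augmented)
```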
Lifelong Topological Visual Navigation
Rey Reza Wiyatno
Anqi Xu
Commonly, learning-based topological navigation approaches produce a local policy while preserving some loose connectivity of the space through a topological map. Nevertheless, spurious or missing edges in the topological graph often lead to navigation failure. In this work, we propose a sampling-based graph building method, which results in sparser graphs yet higher navigation performance compared to baseline methods. We also propose graph maintenance strategies that eliminate spurious edges and expand the graph as needed, which improves lifelong navigation performance. Unlike controllers that learn from fixed training environments, we show that our model can be fine-tuned using only a small number of collected trajectory images from a real-world environment where the agent is deployed. We demonstrate successful navigation after fine-tuning on real-world environments, and notably show significant navigation improvements over time by applying our lifelong graph maintenance strategies.
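A schematic of the sampling-based graph building idea, with hypothetical knobs: nodes are subsampled from collected observations, an edge is kept only when a learned reachability predictor is confident the local policy can traverse it, and the maintenance step prunes an edge after a navigation failure. `reachable`, `keep_prob`, and `threshold` are placeholders, not the paper's actual model or values.

```python
import random

import networkx as nx

def build_sparse_graph(observations, reachable, keep_prob=0.2, threshold=0.9):
    """Sampling-based topological graph building (schematic sketch)."""
    # Subsample observations to keep the graph sparse.
    nodes = [o for o in observations if random.random() < keep_prob]
    g = nx.DiGraph()
    g.add_nodes_from(range(len(nodes)))
    for i, a in enumerate(nodes):
        for j, b in enumerate(nodes):
            # Connect only pairs the local policy is predicted to traverse.
            if i != j and reachable(a, b) > threshold:
                g.add_edge(i, j)
    return g

def prune_after_failure(g, i, j):
    """Lifelong maintenance: remove an edge that caused a navigation failure."""
    if g.has_edge(i, j):
        g.remove_edge(i, j)
```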
The Power of Prompt Tuning for Low-Resource Semantic Parsing
Nathan Schucher
Harm de Vries
Prompt tuning has recently emerged as an effective method for adapting pre-trained language models to a number of language understanding and generation tasks. In this paper, we investigate prompt tuning for semantic parsing: the task of mapping natural language utterances onto formal meaning representations. On the low-resource splits of Overnight and TOPv2, we find that a prompt-tuned T5-xl significantly outperforms its fine-tuned counterpart, as well as strong GPT-3 and BART baselines. We also conduct ablation studies across different model scales and target representations, finding that, with increasing model scale, prompt-tuned T5 models improve at generating target representations that are far from the pre-training distribution.
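The mechanism under study is standard prompt tuning (Lester et al., 2021): a small number of trainable prompt vectors are prepended to the frozen model's input embeddings, and only those vectors are updated. A minimal PyTorch sketch, with an assumed toy vocabulary size and width:

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prepend trainable prompt vectors to a frozen embedding layer."""

    def __init__(self, base_embed, prompt_len=20):
        super().__init__()
        self.base_embed = base_embed
        for p in self.base_embed.parameters():
            p.requires_grad = False        # the pre-trained model stays frozen
        dim = base_embed.embedding_dim
        # Only these prompt_len x dim parameters are trained.
        self.prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.01)

    def forward(self, input_ids):
        tok = self.base_embed(input_ids)   # (batch, seq, dim)
        prompt = self.prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        return torch.cat([prompt, tok], dim=1)

embed = nn.Embedding(32128, 512)           # toy vocab/width, assumed
inputs = torch.randint(0, 32128, (2, 16))
print(SoftPrompt(embed)(inputs).shape)     # torch.Size([2, 36, 512])
```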
Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Andreas Madsen
Nicholas Meade
Vaibhav Adlakha
A popular approach to explaining NLP models is to use importance measures, such as attention, which indicate which input tokens are important for making a prediction. However, an open question is how well these explanations accurately reflect a model's logic, a property called faithfulness. To answer this question, we propose Recursive ROAR, a new faithfulness metric. It works by recursively masking allegedly important tokens and then retraining the model; the principle is that this should result in worse model performance compared to masking random tokens. The result is a performance curve as a function of the masking ratio. Furthermore, we propose a summarizing metric using relative area-between-curves (RACU), which allows for easy comparison across papers, models, and tasks. We evaluate 4 different importance measures on 8 different datasets, using both LSTM-attention models and RoBERTa models. We find that the faithfulness of importance measures is both model-dependent and task-dependent. This conclusion contradicts previous evaluations in both the computer vision and attention-faithfulness literature.
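The metric's loop structure can be sketched as follows; `train_and_eval`, `importance`, and `mask` are placeholder callables standing in for the paper's models, importance measures, and token masking.

```python
def recursive_roar(train_and_eval, importance, mask, dataset,
                   steps=10, ratio=0.1):
    """Recursive ROAR, schematic sketch.

    train_and_eval : dataset -> (model, score), retraining from scratch
    importance     : (model, example) -> per-token importance scores
    mask           : (dataset, scores, ratio) -> dataset with top tokens masked
    """
    curve = []
    model, score = train_and_eval(dataset)
    curve.append(score)
    for _ in range(steps):
        # Re-estimate importance at each step: tokens already masked can no
        # longer leak information, unlike in non-recursive ROAR.
        scores = [importance(model, example) for example in dataset]
        dataset = mask(dataset, scores, ratio)
        model, score = train_and_eval(dataset)
        curve.append(score)
    # A faithful measure should make this curve drop faster than the same
    # curve computed with random masking.
    return curve
```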
Evaluation of real-life use of Point-Of-Care Rapid Antigen TEsting for SARS-CoV-2 in schools for outbreak control (EPOCRATES)
A. Blanchard
Marc Desforges
A. Labbé
C. Nguyen
Y. Petit
Derek Besner
Kate A. Zinszer
Olivier Séguin
Zineb Laghdir
K. Adams
Marie-ève Benoit
Ghislain Leduc
Jean Longtin
Ioannis. Ragoussis
Caroline Quach
We evaluated the use of rapid antigen detection tests (RADT) for the diagnosis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in school settings to determine RADT performance characteristics compared to PCR. Methods: We conducted a real-world, prospective observational cohort study in which recruited students and staff from two high schools in Montreal (Canada) were followed from January 25th to June 10th, 2021. Twenty-five percent of asymptomatic participants were tested weekly by RADT (nasal) and PCR (gargle). Class contacts of a case were tested. Symptomatic participants were tested by RADT (nasal) and PCR (nasal and gargle). The number of cases per outbreak and the number of outbreaks were compared to other high schools in the same area. Results: Overall, 2,099 students and 286 school staff members consented to participate. The overall RADT specificity varied from 99.8 to 100%, with a lower sensitivity, varying from 28.6% in asymptomatic to 83.3% in symptomatic participants. The number of outbreaks was not different in the two participating schools compared to other high schools in the same area, but outbreaks included a greater proportion of asymptomatic cases. Returning students to school after a 7-day quarantine, with a negative PCR on day 6-7 after exposure, did not lead to subsequent outbreaks, as shown by serial testing. Of cases for whom the source was known, 37 of 57 (72.5%) were secondary to household transmission, 13 (25%) to intra-school transmission, and one to community contacts between students in the same school. Conclusion: RADT did not perform well as a screening tool in asymptomatic individuals. Reinforcing policies for symptom screening when entering schools and testing symptomatic individuals with RADT on the spot may avoid subsequent significant exposures in class.
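For readers less familiar with the diagnostic metrics quoted above, sensitivity and specificity are computed from the 2x2 confusion table of RADT against the PCR reference; the counts below are invented solely to reproduce the 83.3% symptomatic sensitivity as a worked example.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Diagnostic performance of a test against a reference standard."""
    sensitivity = tp / (tp + fn)  # share of reference-positives detected
    specificity = tn / (tn + fp)  # share of reference-negatives cleared
    return sensitivity, specificity

# Hypothetical counts (not the study's data): 10 true positives and
# 2 false negatives give the 83.3% sensitivity reported for symptomatics.
print(sensitivity_specificity(tp=10, fp=0, tn=500, fn=2))
```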
Compositional Generalization in Dependency Parsing
Compositionality, the ability to combine familiar units like words into novel phrases and sentences, has been the focus of intense interest in artificial intelligence in recent years. To test compositional generalization in semantic parsing, Keysers et al. (2020) introduced Compositional Freebase Queries (CFQ). This dataset maximizes the similarity between the test and train distributions over primitive units, like words, while maximizing the compound divergence: the dissimilarity between test and train distributions over larger structures, like phrases. Dependency parsing, however, lacks a compositional generalization benchmark. In this work, we introduce a gold-standard set of dependency parses for CFQ, and use this to analyze the behaviour of a state-of-the-art dependency parser (Qi et al., 2020) on the CFQ dataset. We find that increasing compound divergence degrades dependency parsing performance, although not as dramatically as semantic parsing performance. Additionally, we find that the performance of the dependency parser does not degrade uniformly with compound divergence, and the parser performs differently on different splits with the same compound divergence. We explore a number of hypotheses for what causes the non-uniform degradation in dependency parsing performance, and identify a number of syntactic structures that drive the dependency parser's lower performance on the most challenging splits.
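For orientation, Keysers et al. (2020) measure the divergence between train and test distributions with a Chernoff coefficient, C_alpha(P||Q) = sum_k p_k^alpha * q_k^(1-alpha), using a small alpha (0.1) for compounds so that it suffices for a compound to merely occur in training. A minimal sketch over hashable "compound" items, under that reading of their definition:

```python
from collections import Counter

def normalize(counts):
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def chernoff(p, q, alpha):
    """Chernoff coefficient C_alpha(P||Q) = sum_k p_k^alpha * q_k^(1-alpha)."""
    return sum(p.get(k, 0.0) ** alpha * q.get(k, 0.0) ** (1.0 - alpha)
               for k in set(p) | set(q))

def divergence(train_items, test_items, alpha=0.1):
    """1 - Chernoff coefficient between empirical compound distributions."""
    p = normalize(Counter(train_items))
    q = normalize(Counter(test_items))
    return 1.0 - chernoff(p, q, alpha)

# Toy example over string-valued compounds:
print(divergence(["a b", "a b", "b c"], ["a b", "c d"]))
```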