Publications

Towards Detecting Contextual Real-Time Toxicity for In-Game Chat
Nicolas Grenon-Godbout
Real-time toxicity detection in online environments poses a significant challenge, due to the increasing prevalence of social media and gaming platforms. We introduce ToxBuster, a simple and scalable model that reliably detects toxic content in real-time for a line of chat by including chat history and metadata. ToxBuster consistently outperforms conventional toxicity models across popular multiplayer games, including Rainbow Six Siege, For Honor, and DOTA 2. We conduct an ablation study to assess the importance of each model component and explore ToxBuster's transferability across the datasets. Furthermore, we showcase ToxBuster's efficacy in post-game moderation, successfully flagging 82.1% of chat-reported players at a precision level of 90.0%. Additionally, we show how an additional 6% of unreported toxic players can be proactively moderated.
Towards Learning to Imitate from a Single Video Demonstration
Christopher Pal
Agents that can learn to imitate given video observation, without direct access to state or action information, are more applicable to learning in the natural world. However, formulating a reinforcement learning (RL) agent that facilitates this goal remains a significant challenge. We approach this challenge using contrastive training to learn a reward function comparing an agent's behaviour with a single demonstration. We use a Siamese recurrent neural network architecture to learn rewards in space and time between motion clips while training an RL policy to minimize this distance. Through experimentation, we also find that the inclusion of multi-task data and additional image encoding losses improves the temporal consistency of the learned rewards and, as a result, significantly improves policy learning. We demonstrate our approach on simulated humanoid, dog, and raptor agents in 2D and a quadruped and a humanoid in 3D. We show that our method outperforms current state-of-the-art techniques in these environments and can learn to imitate from a single video demonstration.
Towards Reliable Neural Specifications
Nham Le
Zhaoyue Wang
Arie Gurfinkel
TrafficVis: Visualizing Organized Activity and Spatio-Temporal Patterns for Detecting and Labeling Human Trafficking
Catalina Vajiac
Duen Horng Chau
Andreas Olligschlaeger
Rebecca Mackenzie
Meng-Chieh Lee
Namyong Park
Christos Faloutsos
Law enforcement and domain experts can detect human trafficking (HT) in online escort websites by analyzing suspicious clusters of connected ads. How can we explain clustering results intuitively and interactively, visualizing potential evidence for experts to analyze? We present TrafficVis, the first interface for cluster-level HT detection and labeling. Developed through months of participatory design with domain experts, TrafficVis provides coordinated views in conjunction with carefully chosen backend algorithms to effectively show spatio-temporal and text patterns to a wide variety of anti-HT stakeholders. We build upon state-of-the-art text clustering algorithms by incorporating shared metadata as a signal of connected and possibly suspicious activity, then visualize the results. Domain experts can use TrafficVis to label clusters as HT or as other suspicious but non-HT activity, such as spam and scam, quickly creating labeled datasets to enable further HT research. Through domain expert feedback and a usage scenario, we demonstrate TrafficVis's efficacy. The feedback was overwhelmingly positive, with repeated high praise for the usability and explainability of our tool, the latter being vital for indicting possible criminals.
Transposable elements regulate thymus development and function
Jean-David Larouche
Céline M. Laumont
Krystel Vincent
Leslie Hesnard
Sylvie Brochu
Caroline Côté
Juliette Humeau
Éric Bonneil
Joël Lanoix
Chantal Durette
Patrick Gendron
Jean-Philippe Laverdure
Ellen Rothman Richie
S. Lemieux
Pierre Thibault
Claude Perreault
Transposable elements (TEs) are repetitive sequences representing ~45% of the human and mouse genomes and are highly expressed by medullary thymic epithelial cells (mTECs). In this study, we investigated the role of TEs in T-cell development in the thymus by performing multi-omic analyses of TEs in human and mouse thymic cells. We report that TE expression in the human thymus is high and shows extensive age- and cell lineage-related variations. TEs interact with multiple transcription factors in all cell types of the human thymus. Two cell types express particularly broad TE repertoires: mTECs and plasmacytoid dendritic cells (pDCs). In mTECs, TEs interact with transcription factors essential for mTEC development and function (e.g., PAX1 and RELB) and generate MHC-I-associated peptides implicated in thymocyte education. Notably, AIRE, FEZF2, and CHD4 regulate non-redundant sets of TEs in murine mTECs. Human thymic pDCs homogeneously express large numbers of TEs that lead to the formation of dsRNA, triggering RIG-I and MDA5 signaling and explaining why thymic pDCs constitutively secrete IFN-α/β. This study illustrates the diversity of interactions between TEs and the adaptive immune system. TEs are genetic parasites, and the two thymic cell types most affected by TEs (mTECs and pDCs) are essential to establishing central T-cell tolerance. Therefore, we propose that the orchestration of TE expression in thymic cells is critical to prevent autoimmunity in vertebrates.
Tree Cross Attention
Frederick Tung
Hossein Hajimirsadeghi
Mohamed Osama Ahmed
Cross Attention is a popular method for retrieving information from a set of context tokens for making predictions. At inference time, for each prediction, Cross Attention scans the full set of
Trophic interaction models predict interactions across space, not food webs.
Dominique Caron
Ulrich Brose
Miguel Lurgi
F. Guillaume Blanchet
Dominique Gravel
Aim: Trophic interactions are central to our understanding of essential ecosystem functions as well as their stability. Predicting these interactions has become increasingly common due to the lack of empirical data on trophic interactions for most taxa in most ecosystems. We aim to determine how far and how accurately trophic interaction models extrapolate to new communities, both in terms of pairwise predator-prey interactions and higher-level food web attributes (i.e., species position, food web-level properties).
Ultrastructure Analysis of Cardiomyocytes and Their Nuclei
Tabish A Syed
Drisya Dileep
Minhajuddin Sirajuddin
Understanding Graph Neural Networks with Generalized Geometric Scattering Transforms
Michael Perlmutter
Feng Gao
Matthew Hirn
The scattering transform is a multilayered wavelet-based deep learning architecture that acts as a model of convolutional neural networks. Recently, several works have introduced generalizations of the scattering transform for non-Euclidean settings such as graphs. Our work builds upon these constructions by introducing windowed and non-windowed geometric scattering transforms for graphs based upon a very general class of asymmetric wavelets. We show that these asymmetric graph scattering transforms have many of the same theoretical guarantees as their symmetric counterparts. As a result, the proposed construction unifies and extends known theoretical results for many of the existing graph scattering architectures. In doing so, this work helps bridge the gap between geometric scattering and other graph neural networks by introducing a large family of networks with provable stability and invariance guarantees. These results lay the groundwork for future deep learning architectures for graph-structured data that have learned filters and also provably have desirable theoretical properties.
Unsupervised Improvement of Audio-Text Cross-Modal Representations
Zhepei Wang
Krishna Subramani
Junkai Wu
Tiago Tavares
Fabio Ayres
Paris Smaragdis
Recent advances in using language models to obtain cross-modal audio-text representations have overcome the limitations of conventional training approaches that use predefined labels. This has allowed the community to make progress in tasks like zero-shot classification, which would otherwise not be possible. However, learning such representations requires a large amount of human-annotated audio-text pairs. In this paper, we study unsupervised approaches to improve the learning framework of such representations with unpaired text and audio. We explore domain-unspecific and domain-specific curation methods to create audio-text pairs that we use to further improve the model. We also show that when domain-specific curation is used in conjunction with a soft-labeled contrastive loss, we are able to obtain significant improvement in terms of zero-shot classification performance on downstream sound event classification or acoustic scene classification tasks.
Unsupervised Layer-wise Score Aggregation for Textual OOD Detection
Guillaume Staerman
Eduardo DC GOMEZ
Jackie CK Cheung
Pierre Colombo
Out-of-distribution (OOD) detection is a rapidly growing field due to new robustness and security requirements driven by an increased number of AI-based systems. Existing OOD textual detectors often rely on an anomaly score (e.g., Mahalanobis distance) computed on the embedding output of the last layer of the encoder. In this work, we observe that OOD detection performance varies greatly depending on the task and layer output. More importantly, we show that the usual choice (the last layer) is rarely the best one for OOD detection and that far better results could be achieved if the best layer were picked. To leverage this observation, we propose a data-driven, unsupervised method to combine layer-wise anomaly scores. In addition, we extend classical textual OOD benchmarks by including classification tasks with a greater number of classes (up to 77), which reflects more realistic settings. On this augmented benchmark, we show that the proposed post-aggregation methods achieve robust and consistent results while removing manual feature selection altogether. Their performance comes close to that of an oracle that selects the best layer.
Use of machine learning in pediatric surgical clinical prediction tools: A systematic review.
Amanda Bianco
Zaid A.M. Al-Azzawi
Elena Guadagno
Esli Osmanlliu
Jocelyn Gravel