Publications

Learning Minimal NAP Specifications for Neural Network Verification
Chuqin Geng
Zhaoyue Wang
Haolin Ye
Saifei Liao
Specifications play a crucial role in neural network verification. They define the precise input regions we aim to verify, typically represe… (voir plus)nted as L-infinity norm balls. While recent research suggests using neural activation patterns (NAPs) as specifications for verifying unseen test set data, it focuses on computing the most refined NAPs, often limited to very small regions in the input space. In this paper, we study the following problem: Given a neural network, find a minimal (coarsest) NAP that is sufficient for formal verification of the network's robustness. Finding the minimal NAP specification not only expands verifiable bounds but also provides insights into which neurons contribute to the model's robustness. To address this problem, we propose several exact and approximate approaches. Our exact approaches leverage the verification tool to find minimal NAP specifications in either a deterministic or statistical manner. Whereas the approximate methods efficiently estimate minimal NAPs using adversarial examples and local gradients, without making calls to the verification tool. This allows us to inspect potential causal links between neurons and the robustness of state-of-the-art neural networks, a task for which existing verification frameworks fail to scale. Our experimental results suggest that minimal NAP specifications require much smaller fractions of neurons compared to the most refined NAP specifications, yet they can significantly expand the verifiable boundaries to several orders of magnitude larger.
SAT-DIFF: A Tree Diffing Framework Using SAT Solving
Chuqin Geng
Haolin Ye
Yihan Zhang
Brigitte Pientka
Computing differences between tree-structured data is a critical but challenging problem in software analysis. In this paper, we propose a n… (voir plus)ovel tree diffing approach called SatDiff, which reformulates the structural diffing problem into a MaxSAT problem. By encoding the necessary transformations from the source tree to the target tree, SatDiff generates correct, minimal, and type safe low-level edit scripts with formal guarantees. We then synthesize concise high-level edit scripts by effectively merging low-level edits in the appropriate topological order. Our empirical results demonstrate that SatDiff outperforms existing heuristic-based approaches by a significant margin in terms of conciseness while maintaining a reasonable runtime.
PopulAtion Parameter Averaging (PAPA)
Alexia Jolicoeur-Martineau
Emy Gervais
Kilian FATRAS
Yan Zhang
Regulating advanced artificial agents
Michael K. Cohen
Noam Kolt
Gillian K. Hadfield
Stuart Russell
Scope Ambiguities in Large Language Models
Gaurav Kamath
Sebastian Schuster
Sowmya Vajjala
Applying Recurrent Neural Networks and Blocked Cross-Validation to Model Conventional Drinking Water Treatment Processes
Aleksandar Jakovljevic
Benoit Barbeau
The jar test is the current standard method for predicting the performance of a conventional drinking water treatment (DWT) process and opti… (voir plus)mizing the coagulant dose. This test is time-consuming and requires human intervention, meaning it is infeasible for making continuous process predictions. As a potential alternative, we developed a machine learning (ML) model from historical DWT plant data that can operate continuously using real-time sensor data without human intervention for predicting clarified water turbidity 15 min in advance. We evaluated three types of models: multilayer perceptron (MLP), the long short-term memory (LSTM) recurrent neural network (RNN), and the gated recurrent unit (GRU) RNN. We also employed two training methodologies: the commonly used holdout method and the theoretically correct blocked cross-validation (BCV) method. We found that the RNN with GRU was the best model type overall and achieved a mean absolute error on an independent production set of as low as 0.044 NTU. We further found that models trained using BCV typically achieve errors equal to or lower than their counterparts trained using holdout. These results suggest that RNNs trained using BCV are superior for the development of ML models for DWT processes compared to those reported in earlier literature.
Assessing the emergence time of SARS-CoV-2 zoonotic spillover
Stéphane Samson
Étienne Lord
Layerwise Early Stopping for Test Time Adaptation
Sabyasachi Sahoo
Mostafa ElAraby
Jonas Ngnawe
Yann Pequignot
Frederic Precioso
Dynamic Neural Control Flow Execution: An Agent-Based Deep Equilibrium Approach for Binary Vulnerability Detection
Litao Li
Steven H. H. Ding
Andrew Walenstein
Philippe Charland
Explainable artificial intelligence models for predicting risk of suicide using health administrative data in Quebec
Fatemeh Gholi Zadeh Kharrat
Alain Lesage
Geneviève Gariépy
Jean-François Pelletier
Camille Brousseau-Paradis
Louis Rochette
Eric Pelletier
Pascale Lévesque
Mada Mohammed
JianLi Wang
Suicide is a complex, multidimensional event, and a significant challenge for prevention globally. Artificial intelligence (AI) and machine … (voir plus)learning (ML) have emerged to harness large-scale datasets to enhance risk detection. In order to trust and act upon the predictions made with ML, more intuitive user interfaces must be validated. Thus, Interpretable AI is one of the crucial directions which could allow policy and decision makers to make reasonable and data-driven decisions that can ultimately lead to better mental health services planning and suicide prevention. This research aimed to develop sex-specific ML models for predicting the population risk of suicide and to interpret the models. Data were from the Quebec Integrated Chronic Disease Surveillance System (QICDSS), covering up to 98% of the population in the province of Quebec and containing data for over 20,000 suicides between 2002 and 2019. We employed a case-control study design. Individuals were considered cases if they were aged 15+ and had died from suicide between January 1st, 2002, and December 31st, 2019 (n = 18339). Controls were a random sample of 1% of the Quebec population aged 15+ of each year, who were alive on December 31st of each year, from 2002 to 2019 (n = 1,307,370). We included 103 features, including individual, programmatic, systemic, and community factors, measured up to five years prior to the suicide events. We trained and then validated the sex-specific predictive risk model using supervised ML algorithms, including Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost) and Multilayer perceptron (MLP). We computed operating characteristics, including sensitivity, specificity, and Positive Predictive Value (PPV). We then generated receiver operating characteristic (ROC) curves to predict suicides and calibration measures. For interpretability, Shapley Additive Explanations (SHAP) was used with the global explanation to determine how much the input features contribute to the models’ output and the largest absolute coefficients. The best sensitivity was 0.38 with logistic regression for males and 0.47 with MLP for females; the XGBoost Classifier with 0.25 for males and 0.19 for females had the best precision (PPV). This study demonstrated the useful potential of explainable AI models as tools for decision-making and population-level suicide prevention actions. The ML models included individual, programmatic, systemic, and community levels variables available routinely to decision makers and planners in a public managed care system. Caution shall be exercised in the interpretation of variables associated in a predictive model since they are not causal, and other designs are required to establish the value of individual treatments. The next steps are to produce an intuitive user interface for decision makers, planners and other stakeholders like clinicians or representatives of families and people with live experience of suicidal behaviors or death by suicide. For example, how variations in the quality of local area primary care programs for depression or substance use disorders or increased in regional mental health and addiction budgets would lower suicide rates.
Rethinking Teacher-Student Curriculum Learning through the Cooperative Mechanics of Experience
Manfred Diaz
Andrea Tacchetti
Teacher-Student Curriculum Learning (TSCL) is a curriculum learning framework that draws inspiration from human cultural transmission and le… (voir plus)arning. It involves a teacher algorithm shaping the learning process of a learner algorithm by exposing it to controlled experiences. Despite its success, understanding the conditions under which TSCL is effective remains challenging. In this paper, we propose a data-centric perspective to analyze the underlying mechanics of the teacher-student interactions in TSCL. We leverage cooperative game theory to describe how the composition of the set of experiences presented by the teacher to the learner, as well as their order, influences the performance of the curriculum that is found by TSCL approaches. To do so, we demonstrate that for every TSCL problem, there exists an equivalent cooperative game, and several key components of the TSCL framework can be reinterpreted using game-theoretic principles. Through experiments covering supervised learning, reinforcement learning, and classical games, we estimate the cooperative values of experiences and use value-proportional curriculum mechanisms to construct curricula, even in cases where TSCL struggles. The framework and experimental setup we present in this work represent a novel foundation for a deeper exploration of TSCL, shedding light on its underlying mechanisms and providing insights into its broader applicability in machine learning.
Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs
Faraz Lotfi
Farnoosh Faraji
Nikhil Kakodkar
Travis Manderson