Publications

Tree semantic segmentation from aerial image time series
Venkatesh Ramesh
Arthur Ouaknine
Chronosymbolic Learning: Efficient CHC Solving with Symbolic Reasoning and Inductive Learning
Ziyan Luo
Solving Constrained Horn Clauses (CHCs) is a fundamental challenge behind a wide range of verification and analysis tasks. Data-driven appro… (see more)aches show great promise in improving CHC solving without the painstaking manual effort of creating and tuning various heuristics. However, a large performance gap exists between data-driven CHC solvers and symbolic reasoning-based solvers. In this work, we develop a simple but effective framework,"Chronosymbolic Learning", which unifies symbolic information and numerical data points to solve a CHC system efficiently. We also present a simple instance of Chronosymbolic Learning with a data-driven learner and a BMC-styled reasoner. Despite its great simplicity, experimental results show the efficacy and robustness of our tool. It outperforms state-of-the-art CHC solvers on a dataset consisting of 288 benchmarks, including many instances with non-linear integer arithmetics.
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
Ayush Kaushal
Tejas Pandey
Tejas Vaidhya
Aaryan Bhagat
UTG: Towards a Unified View of Snapshot and Event Based Models for Temporal Graphs
Shenyang Huang
Farimah Poursafaei
Emanuele Rossi
When can transformers compositionally generalize in-context?
Seijin Kobayashi
Simon Schug
Yassir Akram
Florian Redhardt
Johannes von Oswald
Razvan Pascanu
João Sacramento
Many tasks can be composed from a few independent components. This gives rise to a combinatorial explosion of possible tasks, only some of w… (see more)hich might be encountered during training. Under what circumstances can transformers compositionally generalize from a subset of tasks to all possible combinations of tasks that share similar components? Here we study a modular multitask setting that allows us to precisely control compositional structure in the data generation process. We present evidence that transformers learning in-context struggle to generalize compositionally on this task despite being in principle expressive enough to do so. Compositional generalization becomes possible only when introducing a bottleneck that enforces an explicit separation between task inference and task execution.
scSemiProfiler: Advancing Large-scale Single-cell Studies through Semi-profiling with Deep Generative Models and Active Learning
Jingtao Wang
Gregory Fonseca
A model-free approach for solving choice-based competitive facility location problems using simulation and submodularity
Robin Legault
This paper considers facility location problems in which a firm entering a market seeks to open facilities on a subset of candidate location… (see more)s so as to maximize its expected market share, assuming that customers choose the available alternative that maximizes a random utility function. We introduce a deterministic equivalent reformulation of this stochastic problem as a maximum covering location problem with an exponential number of demand points, each of which is covered by a different set of candidate locations. Estimating the prevalence of these preference profiles through simulation generalizes a sample average approximation method from the literature and results in a maximum covering location problem of manageable size. To solve it, we develop a partial Benders reformulation in which the contribution to the objective of the least influential preference profiles is aggregated and bounded by submodular cuts. This set of profiles is selected by a knee detection method that seeks to identify the best tradeoff between the fraction of the demand that is retained in the master problem and the size of the model. We develop a theoretical analysis of our approach and show that the solution quality it provides for the original stochastic problem, its computational performance, and the automatic profile-retention strategy it exploits are directly connected to the entropy of the preference profiles in the population. Computational experiments indicate that our approach dominates the classical sample average approximation method on large instances, can outperform the best heuristic method from the literature under the multinomial logit model, and achieves state-of-the-art results under the mixed multinomial logit model. We characterize a broader class of problems, which includes assortment optimization, to which the solving methodology and the analyses developed in this paper can be extended.
Learn-To-Design: Reinforcement Learning-Assisted Chemical Process Optimization
Eslam G. Al-Sakkari
Ahmed Ragab
Mohamed Ali
Daria C. Boffito
Mouloud Amazouz
The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges
Sitao Luan
Chenqing Hua
Qincheng Lu
Liheng Ma
Lirong Wu
Xinyu Wang
Minkai Xu
Xiao-Wen Chang
Rex Ying
Stan Z. Li
Stefanie Jegelka
Homophily principle, \ie{} nodes with the same labels or similar attributes are more likely to be connected, has been commonly believed to b… (see more)e the main reason for the superiority of Graph Neural Networks (GNNs) over traditional Neural Networks (NNs) on graph-structured data, especially on node-level tasks. However, recent work has identified a non-trivial set of datasets where GNN's performance compared to the NN's is not satisfactory. Heterophily, i.e. low homophily, has been considered the main cause of this empirical observation. People have begun to revisit and re-evaluate most existing graph models, including graph transformer and its variants, in the heterophily scenario across various kinds of graphs, e.g. heterogeneous graphs, temporal graphs and hypergraphs. Moreover, numerous graph-related applications are found to be closely related to the heterophily problem. In the past few years, considerable effort has been devoted to studying and addressing the heterophily issue. In this survey, we provide a comprehensive review of the latest progress on heterophilic graph learning, including an extensive summary of benchmark datasets and evaluation of homophily metrics on synthetic graphs, meticulous classification of the most updated supervised and unsupervised learning methods, thorough digestion of the theoretical analysis on homophily/heterophily, and broad exploration of the heterophily-related applications. Notably, through detailed experiments, we are the first to categorize benchmark heterophilic datasets into three sub-categories: malignant, benign and ambiguous heterophily. Malignant and ambiguous datasets are identified as the real challenging datasets to test the effectiveness of new models on the heterophily challenge. Finally, we propose several challenges and future directions for heterophilic graph representation learning.
A Comprehensive Analysis of Explainable AI for Malware Hunting
Mohd Saqib
Samaneh Mahdavifar
Philippe Charland
In the past decade, the number of malware variants has increased rapidly. Many researchers have proposed to detect malware using intelligent… (see more) techniques, such as Machine Learning (ML) and Deep Learning (DL), which have high accuracy and precision. These methods, however, suffer from being opaque in the decision-making process. Therefore, we need Artificial Intelligence (AI)-based models to be explainable, interpretable, and transparent to be reliable and trustworthy. In this survey, we reviewed articles related to Explainable AI (XAI) and their application to the significant scope of malware detection. The article encompasses a comprehensive examination of various XAI algorithms employed in malware analysis. Moreover, we have addressed the characteristics, challenges, and requirements in malware analysis that cannot be accommodated by standard XAI methods. We discussed that even though Explainable Malware Detection (EMD) models provide explainability, they make an AI-based model more vulnerable to adversarial attacks. We also propose a framework that assigns a level of explainability to each XAI malware analysis model, based on the security features involved in each method. In summary, the proposed project focuses on combining XAI and malware analysis to apply XAI models for scrutinizing the opaque nature of AI systems and their applications to malware analysis.
DeepCodeProbe: Towards Understanding What Models Trained on Code Learn
Vahid Majdinasab
Amin Nikanjam
Group Membership Bias
Ali Vardasbi
Maarten de Rijke
Mostafa Dehghani