Publications

Myelin basic protein mRNA levels affect myelin sheath dimensions, architecture, plasticity, and density of resident glial cells
Hooman Bagheri
Hana Friedman
Amanda Hadwen
Celia Jarweh
Ellis Cooper
Lawrence Oprea
Claire Guerrier
Anmar Khadra
Armand Collin
Amanda Young
Gerardo Mendez Victoriano
Matthew Swire
Andrew Jarjour
Marie E. Bechler
Rachel S. Pryce
Pierre Chaurand
Lise Cougnaud
Dajana Vuckovic
Elliott Wilion … (voir 11 de plus)
Owen Greene
Akiko Nishiyama
Anouk Benmamar‐Badel
Trevor Owens
Vladimir Grouza
Marius Tuznik
Hanwen Liu
David A. Rudko
Jinyi Zhang
Katherine A. Siminovitch
Alan C. Peterson
The Madness of Multiple Entries in March Madness
Jeff Decary
David Bergman
Carlos Henrique Cardonha
Jason Imbrogno
Andrea Lodi
This paper explores multi-entry strategies for betting pools related to single-elimination tournaments. In such betting pools, participants … (voir plus)select winners of games, and their respective score is a weighted sum of the number of correct selections. Most betting pools have a top-heavy payoff structure, so the paper focuses on strategies that maximize the expected score of the best-performing entry. There is no known closed-formula expression for the estimation of this metric, so the paper investigates the challenges associated with the estimation and the optimization of multi-entry solutions. We present an exact dynamic programming approach for calculating the maximum expected score of any given fixed solution, which is exponential in the number of entries. We explore the structural properties of the problem to develop several solution techniques. In particular, by extracting insights from the solutions produced by one of our algorithms, we design a simple yet effective problem-specific heuristic that was the best-performing technique in our experiments, which were based on real-world data extracted from recent March Madness tournaments. In particular, our results show that the best 100-entry solution identified by our heuristic had a 2.2% likelihood of winning a
Tree semantic segmentation from aerial image time series
Venkatesh Ramesh
Arthur Ouaknine
Chronosymbolic Learning: Efficient CHC Solving with Symbolic Reasoning and Inductive Learning
Ziyan Luo
Solving Constrained Horn Clauses (CHCs) is a fundamental challenge behind a wide range of verification and analysis tasks. Data-driven appro… (voir plus)aches show great promise in improving CHC solving without the painstaking manual effort of creating and tuning various heuristics. However, a large performance gap exists between data-driven CHC solvers and symbolic reasoning-based solvers. In this work, we develop a simple but effective framework,"Chronosymbolic Learning", which unifies symbolic information and numerical data points to solve a CHC system efficiently. We also present a simple instance of Chronosymbolic Learning with a data-driven learner and a BMC-styled reasoner. Despite its great simplicity, experimental results show the efficacy and robustness of our tool. It outperforms state-of-the-art CHC solvers on a dataset consisting of 288 benchmarks, including many instances with non-linear integer arithmetics.
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
Ayush Kaushal
Tejas Pandey
Tejas Vaidhya
Aaryan Bhagat
When can transformers compositionally generalize in-context?
Seijin Kobayashi
Simon Schug
Yassir Akram
Florian Redhardt
Johannes von Oswald
Razvan Pascanu
João Sacramento
Many tasks can be composed from a few independent components. This gives rise to a combinatorial explosion of possible tasks, only some of w… (voir plus)hich might be encountered during training. Under what circumstances can transformers compositionally generalize from a subset of tasks to all possible combinations of tasks that share similar components? Here we study a modular multitask setting that allows us to precisely control compositional structure in the data generation process. We present evidence that transformers learning in-context struggle to generalize compositionally on this task despite being in principle expressive enough to do so. Compositional generalization becomes possible only when introducing a bottleneck that enforces an explicit separation between task inference and task execution.
scSemiProfiler: Advancing Large-scale Single-cell Studies through Semi-profiling with Deep Generative Models and Active Learning
Jingtao Wang
Gregory Fonseca
A model-free approach for solving choice-based competitive facility location problems using simulation and submodularity
Robin Legault
This paper considers facility location problems in which a firm entering a market seeks to open facilities on a subset of candidate location… (voir plus)s so as to maximize its expected market share, assuming that customers choose the available alternative that maximizes a random utility function. We introduce a deterministic equivalent reformulation of this stochastic problem as a maximum covering location problem with an exponential number of demand points, each of which is covered by a different set of candidate locations. Estimating the prevalence of these preference profiles through simulation generalizes a sample average approximation method from the literature and results in a maximum covering location problem of manageable size. To solve it, we develop a partial Benders reformulation in which the contribution to the objective of the least influential preference profiles is aggregated and bounded by submodular cuts. This set of profiles is selected by a knee detection method that seeks to identify the best tradeoff between the fraction of the demand that is retained in the master problem and the size of the model. We develop a theoretical analysis of our approach and show that the solution quality it provides for the original stochastic problem, its computational performance, and the automatic profile-retention strategy it exploits are directly connected to the entropy of the preference profiles in the population. Computational experiments indicate that our approach dominates the classical sample average approximation method on large instances, can outperform the best heuristic method from the literature under the multinomial logit model, and achieves state-of-the-art results under the mixed multinomial logit model. We characterize a broader class of problems, which includes assortment optimization, to which the solving methodology and the analyses developed in this paper can be extended.
Learn-To-Design: Reinforcement Learning-Assisted Chemical Process Optimization
Eslam G. Al-Sakkari
Ahmed Ragab
Mohamed Ali
Daria C. Boffito
Mouloud Amazouz
The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges
Sitao Luan
Chenqing Hua
Qincheng Lu
Liheng Ma
Lirong Wu
Xinyu Wang
Minkai Xu
Xiao-Wen Chang
Rex Ying
Stan Z. Li
Stefanie Jegelka
Homophily principle, \ie{} nodes with the same labels or similar attributes are more likely to be connected, has been commonly believed to b… (voir plus)e the main reason for the superiority of Graph Neural Networks (GNNs) over traditional Neural Networks (NNs) on graph-structured data, especially on node-level tasks. However, recent work has identified a non-trivial set of datasets where GNN's performance compared to the NN's is not satisfactory. Heterophily, i.e. low homophily, has been considered the main cause of this empirical observation. People have begun to revisit and re-evaluate most existing graph models, including graph transformer and its variants, in the heterophily scenario across various kinds of graphs, e.g. heterogeneous graphs, temporal graphs and hypergraphs. Moreover, numerous graph-related applications are found to be closely related to the heterophily problem. In the past few years, considerable effort has been devoted to studying and addressing the heterophily issue. In this survey, we provide a comprehensive review of the latest progress on heterophilic graph learning, including an extensive summary of benchmark datasets and evaluation of homophily metrics on synthetic graphs, meticulous classification of the most updated supervised and unsupervised learning methods, thorough digestion of the theoretical analysis on homophily/heterophily, and broad exploration of the heterophily-related applications. Notably, through detailed experiments, we are the first to categorize benchmark heterophilic datasets into three sub-categories: malignant, benign and ambiguous heterophily. Malignant and ambiguous datasets are identified as the real challenging datasets to test the effectiveness of new models on the heterophily challenge. Finally, we propose several challenges and future directions for heterophilic graph representation learning.
A Comprehensive Analysis of Explainable AI for Malware Hunting
Mohd Saqib
Samaneh Mahdavifar
Philippe Charland
In the past decade, the number of malware variants has increased rapidly. Many researchers have proposed to detect malware using intelligent… (voir plus) techniques, such as Machine Learning (ML) and Deep Learning (DL), which have high accuracy and precision. These methods, however, suffer from being opaque in the decision-making process. Therefore, we need Artificial Intelligence (AI)-based models to be explainable, interpretable, and transparent to be reliable and trustworthy. In this survey, we reviewed articles related to Explainable AI (XAI) and their application to the significant scope of malware detection. The article encompasses a comprehensive examination of various XAI algorithms employed in malware analysis. Moreover, we have addressed the characteristics, challenges, and requirements in malware analysis that cannot be accommodated by standard XAI methods. We discussed that even though Explainable Malware Detection (EMD) models provide explainability, they make an AI-based model more vulnerable to adversarial attacks. We also propose a framework that assigns a level of explainability to each XAI malware analysis model, based on the security features involved in each method. In summary, the proposed project focuses on combining XAI and malware analysis to apply XAI models for scrutinizing the opaque nature of AI systems and their applications to malware analysis.
DeepCodeProbe: Towards Understanding What Models Trained on Code Learn
Vahid Majdinasab
Amin Nikanjam