NLP in the era of generative AI, cognitive sciences, and societal transformation
Join us at Mila in October for a three-day workshop to explore the transformative potential of language technologies and their implications for society.
This program is designed to provide decision-makers, policymakers and professionals working in policy with a foundational understanding of AI technology.
Publications
SkillQG: Learning to Generate Question for Reading Comprehension Assessment
Deep learning-based algorithms have been very successful in skeleton-based human activity recognition. Skeleton data contains 2-D or 3-D coordinates of human body joints. The main focus of most of the existing skeleton-based activity recognition methods is on designing new deep architectures to learn discriminative features, where all body joints are considered equally important in recognition. However, the importance of joints varies as an activity proceeds within a video and across different activities. In this work, we hypothesize that selecting relevant joints, prior to recognition, can enhance performance of the existing deep learning-based recognition models. We propose a spatial hard attention finding method that aims to remove the uninformative and/or misleading joints at each frame. We formulate the joint selection problem as a Markov decision process and employ deep reinforcement learning to train the proposed spatial-attention-aware agent. No extra labels are needed for the agent’s training. The agent takes a sequence of features extracted from skeleton video as input and outputs a sequence of probabilities for joints. The proposed method can be considered as a general framework that can be integrated with the existing skeleton-based activity recognition methods for performance improvement purposes. We obtain very competitive activity recognition results on three commonly used human activity recognition datasets.
2023-07-01
IEEE Transactions on Systems, Man, and Cybernetics: Systems (published)
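The abstract above describes an agent that outputs per-frame probabilities for joints, which are then used to drop uninformative joints before recognition. The following is only a minimal sketch of that last masking step, not the authors' implementation: the function name `mask_joints`, the 0.5 threshold, the zero-fill for dropped joints, and all numeric values are assumptions made for illustration.

```python
from typing import List

Joint = List[float]   # one joint's coordinates, e.g. [x, y] or [x, y, z]
Frame = List[Joint]   # all joints in a single frame


def mask_joints(frames: List[Frame],
                probs: List[List[float]],
                threshold: float = 0.5) -> List[Frame]:
    """Zero out joints whose selection probability is below `threshold`.

    The threshold and the zero-fill are illustrative assumptions.
    """
    masked = []
    for frame, frame_probs in zip(frames, probs):
        if len(frame) != len(frame_probs):
            raise ValueError("need exactly one probability per joint")
        masked.append([
            list(joint) if p >= threshold else [0.0] * len(joint)
            for joint, p in zip(frame, frame_probs)
        ])
    return masked


# Two frames, three joints each, 2-D coordinates (made-up values).
frames = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.2, 0.1], [0.4, 0.3], [0.6, 0.5]],
]
# Hypothetical per-joint probabilities, standing in for the agent's output.
probs = [
    [0.9, 0.2, 0.7],
    [0.1, 0.8, 0.6],
]
selected = mask_joints(frames, probs)
print(selected)
```

The masked sequence keeps the original shape, so under these assumptions it could be passed to an existing recognition model without changing its input layer.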
Graph Neural Networks (GNNs) have emerged as a powerful tool for data-driven learning on various graph domains. They are usually based on a message-passing mechanism and have gained increasing popularity for their intuitive formulation, which is closely linked to the Weisfeiler-Lehman (WL) test for graph isomorphism to which they have been proven equivalent in terms of expressive power. In this work, we establish new generalization properties and fundamental limits of GNNs in the context of learning so-called identity effects, i.e., the task of determining whether an object is composed of two identical components or not. Our study is motivated by the need to understand the capabilities of GNNs when performing simple cognitive tasks, with potential applications in computational linguistics and chemistry. We analyze two case studies: (i) two-letter words, for which we show that GNNs trained via stochastic gradient descent are unable to generalize to unseen letters when utilizing orthogonal encodings like one-hot representations; (ii) dicyclic graphs, i.e., graphs composed of two cycles, for which we present positive existence results leveraging the connection between GNNs and the WL test. Our theoretical analysis is supported by an extensive numerical study.
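To make the two-letter identity-effect task in the abstract above concrete, here is a small sketch of how such a dataset might be built with one-hot encodings. It does not train a GNN and is not the paper's experimental setup: the 26-letter alphabet, the choice of `Y` and `Z` as held-out letters, and all function names are assumptions for illustration.

```python
import string
from itertools import product

ALPHABET = string.ascii_uppercase  # assumed alphabet: A..Z


def one_hot(letter):
    """One-hot encoding of a single letter over ALPHABET."""
    return [1 if c == letter else 0 for c in ALPHABET]


def encode(word):
    """Concatenate the one-hot encodings of a two-letter word."""
    return one_hot(word[0]) + one_hot(word[1])


def label(word):
    """Identity effect: 1 if both letters are identical, else 0."""
    return 1 if word[0] == word[1] else 0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical split: train on words over A..X, test on words over Y and Z.
train_letters = ALPHABET[:24]
held_out = ALPHABET[24:]

train = [(encode(a + b), label(a + b))
         for a, b in product(train_letters, repeat=2)]
test_words = [a + b for a, b in product(held_out, repeat=2)]

print(len(train), test_words)
# One-hot encodings of distinct letters are orthogonal, so an unseen
# letter shares no coordinates with any letter seen during training.
print(dot(one_hot("Y"), one_hot("A")))
```

The orthogonality check at the end illustrates why the encoding itself carries no information linking held-out letters to training letters.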
Motivation: Accurately assessing contacts between DNA fragments inside the nucleus with Hi-C experiment is crucial for understanding the role of 3D genome organization in gene regulation. This challenging task is due in part to the high sequencing depth of Hi-C libraries required to support high-resolution analyses. Most existing Hi-C data are collected with limited sequencing coverage, leading to poor chromatin interaction frequency estimation. Current computational approaches to enhance Hi-C signals focus on the analysis of individual Hi-C datasets of interest, without taking advantage of the facts that (i) several hundred Hi-C contact maps are publicly available and (ii) the vast majority of local spatial organizations are conserved across multiple cell types. Results: Here, we present RefHiC-SR, an attention-based deep learning framework that uses a reference panel of Hi-C datasets to facilitate the enhancement of Hi-C data resolution of a given study sample. We compare RefHiC-SR against tools that do not use reference samples and find that RefHiC-SR outperforms other programs across different cell types and sequencing depths. It also enables high-accuracy mapping of structures such as loops and topologically associating domains. Availability and implementation: https://github.com/BlanchetteLab/RefHiC.
Disentanglement aims to recover meaningful latent ground-truth factors from the observed distribution solely, and is formalized through the theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.
Modeling strong gravitational lenses in order to quantify distortions in the images of background sources and to reconstruct the mass density in foreground lenses has been a difficult computational challenge. As the quality of gravitational lens images increases, the task of fully exploiting the information they contain becomes computationally and algorithmically more difficult. In this work, we use a neural network based on the recurrent inference machine to reconstruct simultaneously an undistorted image of the background source and the lens mass density distribution as pixelated maps. The method iteratively reconstructs the model parameters (the image of the source and a pixelated density map) by learning the process of optimizing the likelihood given the data using the physical model (a ray-tracing simulation), regularized by a prior implicitly learned by the neural network through its training data. When compared to more traditional parametric models, the proposed method is significantly more expressive and can reconstruct complex mass distributions, which we demonstrate by using realistic lensing galaxies taken from the IllustrisTNG cosmological hydrodynamic simulation.
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach is based upon viewing each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy, and seeking to approximate it with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, this mapping can be considered as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
2023-06-26
Proceedings of the AAAI Conference on Artificial Intelligence (published)
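The abstract above describes a hypernetwork that maps MDP parameters (the context) to a policy. The sketch below shows only that structural idea, with a linear hypernetwork emitting the weights of a linear policy. It is untrained, uses no TD-based objective, and is not the authors' architecture: the names `make_hypernetwork` and `generate_policy`, the dimensions, and the random initialisation are all assumptions for illustration.

```python
import random


def make_hypernetwork(context_dim, state_dim, action_dim, seed=0):
    """Return a function mapping a task context to a linear policy.

    The hypernetwork here is a single random linear map (an assumption);
    its output is reshaped into the policy's weight matrix W and bias b.
    """
    rng = random.Random(seed)
    n_out = action_dim * state_dim + action_dim
    H = [[rng.gauss(0.0, 0.1) for _ in range(context_dim)]
         for _ in range(n_out)]

    def generate_policy(context):
        # Hypernetwork forward pass: flat parameter vector for the policy.
        flat = [sum(h * c for h, c in zip(row, context)) for row in H]
        W = [flat[i * state_dim:(i + 1) * state_dim]
             for i in range(action_dim)]
        b = flat[action_dim * state_dim:]

        def policy(state):
            # Generated policy: action = W @ state + b.
            return [sum(w * s for w, s in zip(row, state)) + bi
                    for row, bi in zip(W, b)]

        return policy

    return generate_policy


generate_policy = make_hypernetwork(context_dim=2, state_dim=3, action_dim=2)

# Two hypothetical task contexts yield two different policies
# without any per-task training step.
policy_a = generate_policy([1.0, 0.0])
policy_b = generate_policy([0.0, 1.0])
state = [0.5, -0.2, 0.1]
print(policy_a(state), policy_b(state))
```

In a trained version, the entries of `H` would be learned so that the generated policy is near-optimal for each context; here they are random, so the outputs carry no meaning beyond showing the data flow.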