Opening Conference | Building Safer AI for Youth Mental Health
On March 16, join leading AI researchers, clinical experts, and voices from the ground for an event exploring the frameworks needed to design AI that is not only powerful, but also safe for mental health.
TRAIL: Responsible AI for Professionals and Leaders
Learn how to integrate responsible AI practices into your organization with TRAIL. Join our information session on March 12, where you’ll discover the program in detail and have the chance to ask all your questions.
We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
Publications
Training Matters: Unlocking Potentials of Deeper Graph Convolutional Neural Networks
Human activity recognition (HAR) is a popular research field in computer vision that has already been widely studied. However, it is still a… (see more)n active research field since it plays an important role in many current and emerging real-world intelligent systems, like visual surveillance and human-computer interaction. Deep reinforcement learning (DRL) has recently been used to address the activity recognition problem with various purposes, such as finding attention in video data or obtaining the best network structure. DRL-based HAR has only been around for a short time, and it is a challenging, novel field of study. Therefore, to facilitate further research in this area, we have constructed a comprehensive survey on activity recognition methods that incorporate DRL. Throughout the article, we classify these methods according to their shared objectives and delve into how they are ingeniously framed within the DRL framework. As we navigate through the survey, we conclude by shedding light on the prominent challenges and lingering questions that await the attention of future researchers, paving the way for further advancements and breakthroughs in this exciting domain.
2024-02-19
IEEE Transactions on Neural Networks and Learning Systems (published)
Human activity recognition (HAR) is a popular research field in computer vision that has already been widely studied. However, it is still a… (see more)n active research field since it plays an important role in many current and emerging real-world intelligent systems, like visual surveillance and human-computer interaction. Deep reinforcement learning (DRL) has recently been used to address the activity recognition problem with various purposes, such as finding attention in video data or obtaining the best network structure. DRL-based HAR has only been around for a short time, and it is a challenging, novel field of study. Therefore, to facilitate further research in this area, we have constructed a comprehensive survey on activity recognition methods that incorporate DRL. Throughout the article, we classify these methods according to their shared objectives and delve into how they are ingeniously framed within the DRL framework. As we navigate through the survey, we conclude by shedding light on the prominent challenges and lingering questions that await the attention of future researchers, paving the way for further advancements and breakthroughs in this exciting domain.
2024-02-19
IEEE Transactions on Neural Networks and Learning Systems (published)
In this blogpost we discuss the idea of teaching neural networks to reach fixed points when reasoning. Specifically, on the algorithmic reas… (see more)oning benchmark CLRS the current neural networks are told the number of reasoning steps they need. While a quick fix is to add a termination network that predicts when to stop, a much more salient inductive bias is that the neural network shouldn't change it's answer any further once the answer is correct, i.e. it should reach a fixed point. This is supported by denotational semantics, which tells us that while loops that terminate are the minimum fixed points of a function. We implement this idea with the help of deep equilibrium models and discuss several hurdles one encounters along the way. We show on several algorithms from the CLRS benchmark the partial success of this approach and the difficulty in making it work robustly across all algorithms.
In this blogpost we discuss the idea of teaching neural networks to reach fixed points when reasoning. Specifically, on the algorithmic reas… (see more)oning benchmark CLRS the current neural networks are told the number of reasoning steps they need. While a quick fix is to add a termination network that predicts when to stop, a much more salient inductive bias is that the neural network shouldn't change it's answer any further once the answer is correct, i.e. it should reach a fixed point. This is supported by denotational semantics, which tells us that while loops that terminate are the minimum fixed points of a function. We implement this idea with the help of deep equilibrium models and discuss several hurdles one encounters along the way. We show on several algorithms from the CLRS benchmark the partial success of this approach and the difficulty in making it work robustly across all algorithms.
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a stochastic policy for generating com… (see more)plex combinatorial structure through a series of decision-making steps. Despite being inspired from reinforcement learning, the current GFlowNet framework is relatively limited in its applicability and cannot handle stochasticity in the reward function. In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training. By parameterizing each edge flow through their quantile functions, our proposed \textit{quantile matching} GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty. Moreover, we find that the distributional approach can achieve substantial improvement on existing benchmarks compared to prior methods due to our enhanced training algorithm, even in settings with deterministic rewards.
We consider the problem of sampling from a discrete and structured distribution as a sequential decision problem, where the objective is to … (see more)find a stochastic policy such that objects are sampled at the end of this sequential process proportionally to some predefined reward. While we could use maximum entropy Reinforcement Learning (MaxEnt RL) to solve this problem for some distributions, it has been shown that in general, the distribution over states induced by the optimal policy may be biased in cases where there are multiple ways to generate the same object. To address this issue, Generative Flow Networks (GFlowNets) learn a stochastic policy that samples objects proportionally to their reward by approximately enforcing a conservation of flows across the whole Markov Decision Process (MDP). In this paper, we extend recent methods correcting the reward in order to guarantee that the marginal distribution induced by the optimal MaxEnt RL policy is proportional to the original reward, regardless of the structure of the underlying MDP. We also prove that some flow-matching objectives found in the GFlowNet literature are in fact equivalent to well-established MaxEnt RL algorithms with a corrected reward. Finally, we study empirically the performance of multiple MaxEnt RL and GFlowNet algorithms on multiple problems involving sampling from discrete distributions.
This paper explores feature prediction as a stand-alone objective for unsupervised learning from video and introduces V-JEPA, a collection o… (see more)f vision models trained solely using a feature prediction objective, without the use of pretrained image encoders, text, negative examples, reconstruction, or other sources of supervision. The models are trained on 2 million videos collected from public datasets and are evaluated on downstream image and video tasks. Our results show that learning by predicting video features leads to versatile visual representations that perform well on both motion and appearance-based tasks, without adaption of the model's parameters; e.g., using a frozen backbone. Our largest model, a ViT-H/16 trained only on videos, obtains 81.9% on Kinetics-400, 72.2% on Something-Something-v2, and 77.9% on ImageNet1K.
This paper explores feature prediction as a stand-alone objective for unsupervised learning from video and introduces V-JEPA, a collection o… (see more)f vision models trained solely using a feature prediction objective, without the use of pretrained image encoders, text, negative examples, reconstruction, or other sources of supervision. The models are trained on 2 million videos collected from public datasets and are evaluated on downstream image and video tasks. Our results show that learning by predicting video features leads to versatile visual representations that perform well on both motion and appearance-based tasks, without adaption of the model's parameters; e.g., using a frozen backbone. Our largest model, a ViT-H/16 trained only on videos, obtains 81.9% on Kinetics-400, 72.2% on Something-Something-v2, and 77.9% on ImageNet1K.
Learning time-series representations for discriminative tasks, such as classification and regression, has been a long-standing challenge in … (see more)the healthcare domain. Current pre-training methods are limited in either unidirectional next-token prediction or randomly masked token prediction. We propose a novel architecture called Bidirectional Timely Generative Pre-trained Transformer (BiTimelyGPT), which pre-trains on biosignals and longitudinal clinical records by both next-token and previous-token prediction in alternating transformer layers. This pre-training task preserves original distribution and data shapes of the time-series. Additionally, the full-rank forward and backward attention matrices exhibit more expressive representation capabilities. Using biosignals and longitudinal clinical records, BiTimelyGPT demonstrates superior performance in predicting neurological functionality, disease diagnosis, and physiological signs. By visualizing the attention heatmap, we observe that the pre-trained BiTimelyGPT can identify discriminative segments from biosignal time-series sequences, even more so after fine-tuning on the task.