Multimodal emotion recognition (MMER) systems typically outperform unimodal systems by leveraging the inter- and intra-modal relationships between, e.g., visual, textual, physiological, and auditory modalities. This paper proposes an MMER method that relies on a joint multi-modal transformer (JMT) for fusion with key-based cross-attention. This framework can exploit the complementary nature of diverse modalities to improve predictive accuracy. Separate backbones capture intra-modal spatiotemporal dependencies within each modality over video sequences. Subsequently, our JMT fusion architecture integrates the individual modality embeddings, allowing the model to effectively capture inter- and intra-modal relationships. Extensive experiments on two challenging expression recognition tasks – (1) dimensional emotion recognition on the Affwild2 dataset (with face and voice) and (2) pain estimation on the Biovid dataset (with face and biosensors) – indicate that our JMT fusion can provide a cost-effective solution for MMER. Empirical results show that MMER systems with our proposed fusion allow us to outperform relevant baseline and state-of-the-art methods. Code is available at: https://github.com/PoloWlg/Joint-Multimodal-Transformer-6th-ABAW
2024-06-17
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (published)
Accurate, efficient generative models of clinical populations could accelerate clinical research and improve patient outcomes. For example, such models could infer probable treatment outcomes for different subpopulations, generate high-fidelity synthetic data that can be shared across organizational boundaries, and discover new relationships among clinical variables. Using Bayesian structure learning, we show that it is possible to learn probabilistic program models of clinical populations by combining data from multiple, sparsely overlapping clinical datasets. Through experiments with multiple clinical trials and real-world evidence from census health surveys, we show that our model generates higher quality synthetic data than neural network baselines, supports more accurate inferences across datasets than traditional statistical methods, and can be queried more efficiently than both, opening up new avenues for accessible and efficient AI assistance in clinical research.
Language Models (LMs) have achieved impressive performance on various linguistic tasks, but their relationship to human language processing in the brain remains unclear. This paper examines the gaps and overlaps between LMs and the brain at different levels of analysis, emphasizing the importance of looking beyond input-output behavior to examine and compare the internal processes of these systems. We discuss how insights from neuroscience, such as sparsity, modularity, internal states, and interactive learning, can inform the development of more biologically plausible language models. Furthermore, we explore the role of scaling laws in bridging the gap between LMs and human cognition, highlighting the need for efficiency constraints analogous to those in biological systems. By developing LMs that more closely mimic brain function, we aim to advance both artificial intelligence and our understanding of human cognition.
Numerous biological and physical processes can be modeled as systems of interacting samples evolving continuously over time, e.g. the dynamics of communicating cells or physical particles.
Flow-based models allow for learning these dynamics at the population level: they model the evolution of the entire distribution of samples.
However, current flow-based models are limited to a single initial population and a set of predefined conditions which describe different dynamics.
We propose
In recent years, there has been increasing interest in the field of astrophysics in applying Neural Ratio Estimators (NREs) to large-scale inference problems where both amortization and marginalization over a large number of nuisance parameters are needed.
Here, in order to assess the true potential of this method to produce unbiased inference on real data, we investigate the robustness of NREs to distribution shifts and model misspecification in the specific scientific application of the measurement of dark matter population-level parameters using strong gravitational lensing. We investigate the behaviour of a trained NRE for test data presenting distributional shifts inside the bounds of training, as well as out of distribution, both in the linear and non-linear parameters of this problem. While our results show that NREs perform well when tested perfectly in distribution, we find that they exhibit significant biases and drawbacks when confronted with slight deviations from the examples seen in the training distribution. This indicates the necessity for caution when applying NREs to real astrophysical data, where underlying distributions are not perfectly known and models do not perfectly reconstruct the true underlying distributions.