Un incubateur à temps plein de 4 mois à Mila, conçu spécifiquement pour les fondateurs et fondatrices de la deep tech issus de parcours d'élite en STIM.
Avantage IA : productivité dans la fonction publique
Apprenez à tirer parti de l’IA générative pour soutenir et améliorer votre productivité au travail. La prochaine cohorte se déroulera en ligne les 28 et 30 avril 2026.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Lecteur Multimédia
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Eya Cherif
Collaborateur·rice de recherche - Leipzig University
Abstract. Large-scale mapping of plant biophysical and biochemical traits is essential for ecological and environmental applications. Given … (voir plus)their finer spectral resolution and unprecedented data availability, hyperspectral data, in concert with machine and particularly deep learning models, have emerged as a promising, non-destructive tool for accurately retrieving these traits. However, when deploying these methods on a large scale, reliably quantifying the associated uncertainty remains a critical challenge, especially when models encounter out-of-domain (OOD) data, i.e., samples that differ substantially from those of the training data, such as unseen geographical regions, species, biomes, data acquisition modalities, or scene components (e.g., clouds and water bodies). Traditional uncertainty quantification methods for deep learning models, including deep ensembles (deterministic and probabilistic) and Monte Carlo dropout, rely on the variance of predictions but often fail to capture uncertainty in OOD scenarios, leading to overly optimistic and possibly misleading uncertainty estimates. To address this limitation, we propose a distance-based uncertainty estimation method (Dis_UN) that quantifies prediction uncertainty by measuring the dissimilarity in the predictor space (spectral inputs) and embedding space (features learned by the deep model) between the training and test data. Dis_UN leverages residuals as a proxy for uncertainty and employs dissimilarity indices in data manifolds to estimate worst-case errors via 95-quantile regression. We evaluate Dis_UN using a pretrained deep learning model to predict multiple plant traits from hyperspectral images, analyzing its performance across OOD data, such as pixels containing spectral variations from urban surfaces, bare ground, water, clouds, or open surface waters. In this study, we target six leaf and canopy traits: leaf mass per area, chlorophylls, carotenoids, nitrogen content, equivalent water thickness, and leaf area index. Compared to scaled variance-based methods, Dis_UN provides (1) a superior estimation of uncertainty in OOD scenarios, achieving 36 % higher contrast (KS distances: 0.648 vs. 0.475) between non-vegetation pixels, particularly under mixed-pixel conditions at medium resolution (30 m); (2) uncertainty quantification without requiring normality or symmetry assumptions, accommodating asymmetric error patterns; (3) enhanced interpretability of uncertainty sources, as uncertainty is directly linked to sample dissimilarity from the training data; and (4) computational efficiency at inference (2.6–7.7× faster), requiring only a single forward pass compared to multiple passes for ensemble-based methods. Challenges remain for traits that are affected by spectral saturation. These findings highlight the advantages of distance-aware uncertainty quantification methods and underscore the necessity of diverse training datasets to minimize sampling biases and enhance model robustness. The proposed framework improves the reliability of uncertainty estimation in vegetation monitoring and offers a promising approach for broader applications.
Plant traits such as leaf carbon content and leaf mass are essential variables in the study of biodiversity and climate change. However, con… (voir plus)ventional field sampling cannot feasibly cover trait variation at ecologically meaningful spatial scales. Machine learning represents a valuable solution for plant trait prediction across ecosystems, leveraging hyperspectral data from remote sensing. Nevertheless, trait prediction from hyperspectral data is challenged by label scarcity and substantial domain shifts (\eg across sensors, ecological distributions), requiring robust cross-domain methods. Here, we present GreenHyperSpectra, a pretraining dataset encompassing real-world cross-sensor and cross-ecosystem samples designed to benchmark trait prediction with semi- and self-supervised methods. We adopt an evaluation framework encompassing in-distribution and out-of-distribution scenarios. We successfully leverage GreenHyperSpectra to pretrain label-efficient multi-output regression models that outperform the state-of-the-art supervised baseline. Our empirical analyses demonstrate substantial improvements in learning spectral representations for trait prediction, establishing a comprehensive methodological framework to catalyze research at the intersection of representation learning and plant functional traits assessment. All code and data are available at: https://github.com/echerif18/HyspectraSSL.