Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
Inter- and intra-year forest change detection and monitoring of aboveground biomass dynamics using Sentinel-2 and Landsat
In the era of proliferation of large language and image generation models, the phenomenon of "model collapse" refers to the situation whereb… (voir plus)y as a model is trained recursively on data generated from previous generations of itself over time, its performance degrades until the model eventually becomes completely useless, i.e the model collapses. In this work, we study this phenomenon in the setting of high-dimensional regression and obtain analytic formulae which quantitatively outline this phenomenon in a broad range of regimes. In the special case of polynomial decaying spectral and source conditions, we obtain modified scaling laws which exhibit new crossover phenomena from fast to slow rates. We also propose a simple strategy based on adaptive regularization to mitigate model collapse. Our theoretical results are validated with experiments.
A large number of tutorials for popular software development technologies are available online, and those about the same technology vary wid… (voir plus)ely in their presentation. We studied the design of tutorials in the software documentation landscape for five popular programming languages: Java, C#, Python, Javascript, and Typescript. We investigated the extent to which tutorial pages, i.e. resources, differ and report statistics of variations in resource properties. We developed a framework for characterizing resources based on their distinguishing attributes, i.e. properties that vary widely for the resource, relative to other resources. Additionally, we propose that a resource can be represented by its resource style, i.e. the combination of its distinguishing attributes. We discuss three techniques for characterizing resources based on our framework, to capture notable and relevant content and presentation properties of tutorial pages. We apply these techniques on a data set of 2551 resources to validate that our framework identifies valid and interpretable styles. We contribute this framework for reasoning about the design of resources in the online software documentation landscape.
2024-02-01
IEEE Transactions on Software Engineering (publié)
A large number of tutorials for popular software development technologies are available online, and those about the same technology vary wid… (voir plus)ely in their presentation. We studied the design of tutorials in the software documentation landscape for five popular programming languages: Java, C#, Python, Javascript, and Typescript. We investigated the extent to which tutorial pages, i.e. resources, differ and report statistics of variations in resource properties. We developed a framework for characterizing resources based on their distinguishing attributes, i.e. properties that vary widely for the resource, relative to other resources. Additionally, we propose that a resource can be represented by its resource style, i.e. the combination of its distinguishing attributes. We discuss three techniques for characterizing resources based on our framework, to capture notable and relevant content and presentation properties of tutorial pages. We apply these techniques on a data set of 2551 resources to validate that our framework identifies valid and interpretable styles. We contribute this framework for reasoning about the design of resources in the online software documentation landscape.
2024-02-01
IEEE Transactions on Software Engineering (publié)