Towards Assessing Deep Learning Test Input Generators
Seif Mzoughi
Ahmed Haj Yahmed
Mohamed Elshafei
Diego Elias Costa
Trade‐off of different deep learning‐based auto‐segmentation approaches for treatment planning of pediatric craniospinal irradiation: autocontouring of OARs for pediatric CSI
Alana Thibodeau‐Antonacci
Marija Popovic
Ozgur Ates
Chia‐Ho Hua
James Schneider
Sonia Skamene
Carolyn Freeman
James Man Git Tsui
As auto‐segmentation tools become integral to radiotherapy, more commercial products emerge. However, they may not always suit our needs. One notable example is the use of adult‐trained commercial software for the contouring of organs at risk (OARs) of pediatric patients.
View-Dependent Deformation Fields for 2D Editing of 3D Models
Martin El Mqirmi
Why do LLMs attend to the first token?
Federico Barbero
Álvaro Arroyo
Xiangming Gu
Christos Perivolaropoulos
Michael M. Bronstein
Petar Veličković
NoProp: Training Neural Networks without Back-propagation or Forward-propagation
Qinyu Li
Yee Whye Teh
Language-Guided Trajectory Traversal in Disentangled Stable Diffusion Latent Space for Factorized Medical Image Generation
Zahra Tehraninasab
Amar Kumar
Leveraging Vision-Language Foundation Models to Reveal Hidden Image-Attribute Relationships in Medical Imaging
Amar Kumar
Anita Kriz
B. Pertzov
Steering CLIP's vision transformer with sparse autoencoders
Sonia Joseph
Praneet Suresh
Ethan Goldfarb
Lorenz Hufe
Yossi Gandelsman
Robert Graham
Wojciech Samek
While vision models are highly capable, their internal mechanisms remain poorly understood -- a challenge which sparse autoencoders (SAEs) have helped address in language, but which remains underexplored in vision. We address this gap by training SAEs on CLIP's vision transformer and uncover key differences between vision and language processing, including distinct sparsity patterns for SAEs trained across layers and token types. We then provide the first systematic analysis of the steerability of CLIP's vision transformer by introducing metrics to quantify how precisely SAE features can be steered to affect the model's output. We find that 10-15% of neurons and features are steerable, with SAEs providing thousands more steerable features than the base model. Through targeted suppression of SAE features, we then demonstrate improved performance on three vision disentanglement tasks (CelebA, Waterbirds, and typographic attacks), finding optimal disentanglement in middle model layers, and achieving state-of-the-art performance on defense against typographic attacks. We release our CLIP SAE models and code to support future research in vision transformer interpretability.
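As a rough illustration of the "targeted suppression of SAE features" described in this abstract, the sketch below zeroes selected code dimensions of a toy sparse autoencoder applied to ViT-style activations. The layer sizes, feature indices, and SAE architecture are illustrative assumptions only, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy overcomplete SAE with ReLU codes (illustrative, not the released model)."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x, suppress=None):
        codes = torch.relu(self.encoder(x))   # sparse feature activations
        if suppress is not None:
            codes = codes.clone()
            codes[..., suppress] = 0.0        # targeted suppression of chosen features
        return self.decoder(codes), codes

# Hypothetical usage: activations from one CLIP ViT layer, reconstructed
# with two arbitrarily chosen SAE features suppressed.
sae = SparseAutoencoder(d_model=768, d_hidden=8192)
acts = torch.randn(4, 197, 768)               # (batch, tokens, d_model) for a ViT-B/16-sized model
recon, codes = sae(acts, suppress=[11, 42])
```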
Bridging biodiversity and ecosystem services through useful plant species
Nina Obiar
Isaac Eckert
Janelle Baker
Daniel Moerman