Medical imaging systems increasingly rely on large vision language foundation models (VLFMs) trained on diverse biomedical corpora, yet these models remain difficult to adapt to new clinical tasks without costly fine-tuning and large annotated datasets. We present PIKACHU (Prototypical In-Context Knowledge Adaptation for Clinical Heterogeneous Usage), a lightweight and generalizable framework that enables rapid few-shot adaptation of frozen medical FMs using only a handful of labeled examples. Unlike prior approaches that modify backbone weights or introduce heavy attention-based adapters, PIKACHU performs all task adaptation directly in the FM feature space through in-context prototypical reasoning. Given a small support set, the framework constructs class prototypes by averaging normalized embeddings from a frozen VLFM image encoder and performs prediction on query images using temperature-scaled cosine similarity. Only a single temperature parameter is learned. We evaluate PIKACHU across three heterogeneous medical imaging datasets: dermatological images (ISIC), Optical Coherence Tomography (OCT), and Diabetic Retinopathy (DR), using established vision models (SigLIP, PubMedCLIP, DinoV2, and ViT) as backbones. The proposed in-context learning (ICL) strategy consistently outperforms the baseline (zero-shot) approaches across all datasets and architectures, achieving substantial improvements in both accuracy and AUC. Notably, with PubMedCLIP as the backbone, PIKACHU achieves 0.69/0.76 (Acc./AUC) on the ISIC dataset, 0.72/0.78 on OCT, and 0.79/0.88 on DR, demonstrating robust generalization across diverse clinical imaging modalities. These results highlight the promise of feature-space in-context learning as an efficient and deployable paradigm for test-time adaptation of foundation models, without the need for extensive retraining.
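The prototypical reasoning the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: function names, the use of NumPy, the random features standing in for frozen-encoder embeddings, and the temperature value are all assumptions for the sketch.

```python
import numpy as np

def build_prototypes(support_feats, support_labels, n_classes):
    """Average L2-normalized support embeddings per class to form prototypes."""
    feats = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    protos = np.stack([feats[support_labels == c].mean(axis=0)
                       for c in range(n_classes)])
    # Re-normalize so the dot product with a query reduces to cosine similarity.
    return protos / np.linalg.norm(protos, axis=1, keepdims=True)

def predict(query_feats, prototypes, temperature=0.07):
    """Classify queries by temperature-scaled cosine similarity to prototypes."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    logits = (q @ prototypes.T) / temperature  # temperature is the one learned scalar
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    return exp / exp.sum(axis=1, keepdims=True)

# Toy usage: random vectors stand in for frozen VLFM image-encoder features.
rng = np.random.default_rng(0)
support = rng.normal(size=(6, 16))
labels = np.array([0, 0, 1, 1, 2, 2])
protos = build_prototypes(support, labels, n_classes=3)
probs = predict(rng.normal(size=(4, 16)), protos)
```

Because the backbone stays frozen and only the temperature scalar is fit, adaptation reduces to computing one mean vector per class in embedding space.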
Capturing diversity is crucial in conditional and prompt-based image generation, particularly when conditions contain uncertainty that can lead to multiple plausible outputs. To generate diverse images reflecting this diversity, traditional methods often modify random seeds, making it difficult to discern meaningful differences between samples, or diversify the input prompt, which limits diversity to what can be expressed verbally. We propose Rainbow, a novel conditional image generation framework, applicable to any pretrained conditional generative model, that addresses inherent condition/prompt uncertainty and generates diverse plausible images. Rainbow is based on a simple yet effective idea: decomposing the input condition into diverse latent representations, each capturing an aspect of the uncertainty and generating a distinct image. First, we integrate a latent graph, parameterized by Generative Flow Networks (GFlowNets), into the prompt representation computation. Second, leveraging GFlowNets' advanced graph sampling capabilities to capture uncertainty and output diverse trajectories over the graph, we produce multiple trajectories that collectively represent the input condition, leading to diverse condition representations and corresponding output images. Evaluations on natural image and medical image datasets demonstrate Rainbow's improvement in both diversity and fidelity across image synthesis, image generation, and counterfactual generation tasks.