Publications

Advancements in Affective and Behavior Analysis: The 8th ABAW Workshop and Competition

Dimitrios Kollias

Panagiotis Tzirakis

Alan Cowen

Stefanos Zafeiriou

Irene Kotsia

Eric Granger

Marco Pedersoli

Simon Bacon

Alice Baird

Chris Gagne

Chunchang Shao

Guanyu Hu

Soufiane Belharbi

Muhammad Haseeb Aslam

2024-12-31

CVPR Workshops (published)

doi.org

Advocacy for Children With Surgical Diseases in Nigeria: National Policy Status, Gaps, and Solutions

Justina O. Seyi-Olajide

Ayla Gerk

Elena Guadagno

Adesoji Ademuyiwa

Emmanuel A. Ameh

Dan Poenaru

2024-12-31

Journal of Pediatric Surgery (published)

doi.org

AFRIDOC-MT: Document-level MT Corpus for African Languages

Jesujoba Oluwadara Alabi

Israel Abebe Azime

Miaoran Zhang

Cristina España-Bonet

Rachel Bawden

Dawei Zhu

David Ifeoluwa Adelani

Clement Odoje

Idris Akinade

Iffat Maab

Davis David

Shamsuddeen Hassan Muhammad

Neo Putini

David O. Ademuyiwa

Andrew Caines

Dietrich Klakow

This paper introduces AFRIDOC-MT, a document-level multi-parallel translation dataset covering English and five African languages: Amharic, Hausa, Swahili, Yorùbá, and Zulu. The dataset comprises 334 health and 271 information technology news documents, all human-translated from English to these languages. We conduct document-level translation benchmark experiments by evaluating neural machine translation (NMT) models and large language models (LLMs) for translations between English and these languages, at both the sentence and pseudo-document levels. These outputs are realigned to form complete documents for evaluation. Our results indicate that NLLB-200 achieved the best average performance among the standard NMT models, while GPT-4o outperformed general-purpose LLMs. Fine-tuning selected models led to substantial performance gains, but models trained on sentences struggled to generalize effectively to longer documents. Furthermore, our analysis reveals that some LLMs exhibit issues such as under-generation, repetition of words or phrases, and off-target translations, especially for African languages.

2024-12-31

EMNLP (published)

doi.org

arxiv.org

Anatomically-Focused Patches for Lightweight and Explainable Knee OA Grading

Tien-en Chang

Hervé Lombaert

2024-12-31

ShapeMI@MICCAI (published)

doi.org

Anticancer Monotherapy and Polytherapy Drug Response Prediction Using Deep Learning: Guidelines and Best Practices

Amin Emad

David Earl Hostallero

2024-12-31

Methods in Molecular Biology (unknown)

doi.org

Anti-patterns and Code Smells for Multi-language Systems

Mouna Abidi

Manel Grichi

Foutse Khomh

Yann‐Gaël Guéhéneuc

2024-12-31

Transactions on Pattern Languages of Programming (published)

doi.org

Attention as a Hypernetwork

Simon Schug

Seijin Kobayashi

Yassir Akram

João Sacramento

Razvan Pascanu

Transformers can under some circumstances generalize to novel problem instances whose constituent parts might have been encountered during training, but whose compositions have not. What mechanisms underlie this ability for compositional generalization? By reformulating multi-head attention as a hypernetwork, we reveal that a composable, low-dimensional latent code specifies key-query specific operations. We find empirically that this latent code is predictive of the subtasks the network performs on unseen task compositions, revealing that latent codes acquired during training are reused to solve unseen problem instances. To further examine the hypothesis that the intrinsic hypernetwork of multi-head attention supports compositional generalization, we ablate whether making the hypernetwork-generated linear value network nonlinear strengthens compositionality. We find that this modification improves compositional generalization on abstract reasoning tasks. In particular, we introduce a symbolic version of the Raven's Progressive Matrices human intelligence test, which gives us precise control over the problem compositions encountered during training and evaluation. We demonstrate on this task how scaling model size and data enables compositional generalization in transformers and gives rise to a functionally structured latent space.

2024-12-31

ICLR (published)

doi.org

arxiv.org

Audio Prototypical Network For Controllable Music Recommendation

Traditional recommendation systems represent user preferences in dense representations obtained through black-box encoder models. While these models often provide strong recommendation performance, they lack interpretability for users, leaving users unable to understand or control the system's modeling of their preferences. This limitation is especially challenging in music recommendation, where user preferences are highly personal and often evolve based on nuanced qualities like mood, genre, tempo, or instrumentation. In this paper, we propose an audio prototypical network for controllable music recommendation. This network expresses user preferences in terms of prototypes representative of semantically meaningful features pertaining to musical qualities. We show that the model obtains competitive recommendation performance compared to popular baseline models while also providing interpretable and controllable user profiles.

2024-12-31

MLSP (published)

doi.org

openreview.net

AURA: A Multi-modal Medical Agent for Understanding, Reasoning and Annotation

Nima Fathi

Amar Kumar

Tal Arbel

2024-12-31

Agentic AI/CREATE/Clinical MLLMs@MICCAI (published)

doi.org

Automated UML Visualization of Software Ecosystems: Tracking Versions, Dependencies, and Security Updates

Vanessa Kan

M. P. Lnu

Solomon Berhe

C. El Kari

Marc Maynard

Foutse Khomh

2024-12-31

ANT/EDI40 (published)

doi.org

Balancing Profit and Fairness in Risk-Based Pricing Markets

Jesse Thibodeau

Hadi Nekoei

Afaf Taïk

Janarthanan Rajendran

Golnoosh Farnadi

Dynamic, risk-based pricing can systematically exclude vulnerable consumer groups from essential resources such as health insurance and consumer credit. We show that a regulator can realign private incentives with social objectives through a learned, interpretable tax schedule. First, we provide a formal proposition that bounding each firm's local demographic gap implicitly bounds the global opt-out disparity, motivating firm-level penalties. Building on this insight, we introduce MarketSim, an open-source, scalable simulator of heterogeneous consumers and profit-maximizing firms, and train a reinforcement learning (RL) social planner (SP) that selects a bracketed fairness-tax while remaining close to a simple linear prior via an

2024-12-31

arXiv (preprint)

doi.org

arxiv.org

Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification

Yunzhen Feng

Elvis Dohmatob

Pu Yang

Francois Charton

Julia Kempe

Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation. This raises concerns about model collapse, a drop in model performance when training sets include generated data. Considering that it is easier for both humans and machines to distinguish between good and bad examples than to generate high-quality samples, we investigate the use of verification on synthesized data to prevent model collapse. We provide a theoretical characterization using Gaussian mixtures, linear classifiers, and linear verifiers to derive conditions with measurable proxies to assess whether the verifier can effectively select synthesized data that leads to optimal performance. We experiment with two practical tasks, computing matrix eigenvalues with transformers and news summarization with LLMs, which both exhibit model collapse when trained on generated data, and show that verifiers, even imperfect ones, can indeed be harnessed to prevent model collapse and that our proposed proxy measure strongly correlates with performance.

2024-12-31

ICLR (published)

doi.org

openreview.net
