Publications

Supervised Large Neighbourhood Search for MIPs

Charly Robinson La Rocca

Jean-François Cordeau

Large Neighbourhood Search (LNS) is a powerful heuristic framework for solving Mixed-Integer Programming (MIP) problems. However, designing effective variable selection strategies in LNS remains challenging, especially for diverse sets of problems. In this paper, we propose an approach that integrates Machine Learning (ML) within the destroy operator of LNS for MIPs with a focus on minimal offline training. We implement a modular LNS matheuristic as a test bench to compare different LNS heuristics, including our ML-enhanced LNS. Experimental results on the MIPLIB 2017 dataset demonstrate that the matheuristic can significantly improve the performance of state-of-the-art solvers like Gurobi and SCIP. We conduct analyses on noisy oracles to explore the impact of prediction accuracy on solution quality. Additionally, we develop techniques to enhance the ML model through loss adjustments and sampling routines. Our findings suggest that while random LNS remains competitive, our Supervised LNS (SLNS) outperforms other baselines and helps set the foundation for future research on ML for LNS methods that are both efficient and general.

2025-01-18

ArXiv (preprint)

arxiv.org

Multi-center benchmarking of cervical spinal cord RF coils for 7 T MRI: A traveling spines study

Eva Alonso‐Ortiz

Daniel Papp

Robert L. Barry

Kyota Poëti

Alan C. Seifert

Kyle M. Gilbert

Nibardo Lopez‐Rios

Jan Paska

Falk Eippert

N. Weiskopf

Laura Beghini

Nadine Graedel

Robert Trampel

M. F. Callaghan

Christoph S Aigner

Patrick Freund

Maryam Seif

A. Destruel

Virginie Callot

Johanna Vannesjo

Julien Cohen-Adad

Purpose: The depth within the body, small diameter, long length, and varying tissue surrounding the spinal cord impose specific considerations when designing radiofrequency coils. The optimal coil configuration for 7 T cervical spinal cord MRI is unknown and, currently, there are very few coil options. The purpose of this work was (1) to establish a quality control protocol for evaluating 7 T cervical spinal cord coils and (2) to use that protocol to evaluate the performance of 4 different coil designs. Methods: Three healthy volunteers and a custom anthropomorphic phantom (the traveling spines cohort) were scanned at seven 7 T imaging centers using a common protocol and each center’s specific cervical spinal cord coil. Four different coil designs were tested (two in-house, one Rapid Biomedical, and one MRI.TOOLS design). Results: The Rapid Biomedical coil was found to have the highest B1+ efficiency, whereas one of the in-house designs (NeuroPoly Lab) had the highest SNR and the largest spinal cord coverage. The MRI.TOOLS coil had the most uniform B1+ profile along the cervical spinal cord; however, it was limited in its ability to provide the requested flip angles (especially for larger individuals). The latter was also the case for the second in-house coil (MSSM). Conclusion: The results of this study serve as a guide for the spinal cord MRI community in selecting the most suitable coil based on specific requirements and offer a standardized protocol for assessing future coils.

2025-01-17

bioRxiv (preprint)

doi.org

Maximizing Data and Hardware Reuse for HLS with Early-Stage Symbolic Partitioning

Tzung-Han Juang

Christophe Dubach

While traditional HLS (High-Level Synthesis) converts “high-level” C-like programs into hardware automatically, producing high-performance designs still requires hardware expertise. Optimizations such as data partitioning can have a large impact on performance since they directly affect data reuse patterns and the ability to reuse hardware. However, optimizing partitioning is a difficult process since minor changes in the parameter choices can lead to totally unpredictable performance. Functional array-based languages have been proposed instead of C-based approaches, as they offer stronger performance guarantees. This paper proposes to follow a similar approach and exposes a divide-and-conquer primitive at the algorithmic level to let users partition any arbitrary computation. The compiler is then free to explore different partition shapes to maximize both data and hardware reuse automatically. The main challenge remains that the impact of partitioning is only known much later in the compilation flow. This is due to the hard-to-predict effects of the many optimizations applied during compilation. To solve this problem, the partitioning is expressed using a set of symbolic tunable parameters, introduced early in the compilation pipeline. A symbolic performance model is then used in the last compilation stage to predict performance based on the possible values of the tunable parameters. Using this approach, a design space exploration is conducted on an Intel Arria 10 FPGA (Field Programmable Gate Array), and competitive performance is achieved on the classical VGG and TinyYolo neural networks.

2025-01-16

ACM Transactions on Architecture and Code Optimization (TACO) (published)

doi.org

Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark

Alexis Roger

Prateek Humane

Daniel Z Kaplan

Kshitij Gupta

Qirui Sun

George Adamopoulos

Jonathan Siu Chi Lim

Quentin Gregory Anthony

Edwin Fennell

Irina Rish

The proliferation of Vision-Language Models (VLMs) in the past several years calls for rigorous and comprehensive evaluation methods and benchmarks. This work analyzes existing VLM evaluation techniques, including automated metrics, AI-based assessments, and human evaluations across diverse tasks. We first introduce Robin, a novel suite of VLMs that we built by combining Large Language Models (LLMs) and Vision Encoders (VEs) at multiple scales, and use Robin to identify shortcomings of current evaluation approaches across scales. Next, to overcome the identified limitations, we introduce CHIRP, a new long-form response benchmark we developed for more robust and complete VLM evaluation. We provide open access to the Robin training code, model suite, and CHIRP benchmark to promote reproducibility and advance VLM research.

2025-01-16

ArXiv (preprint)

doi.org

arxiv.org

AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages

Shamsuddeen Hassan Muhammad

Idris Abdulmumin

Abinew Ayele

David Ifeoluwa Adelani

Ibrahim Ahmad

Saminu Mohammad Aliyu

Nelson Odhiambo Onyango

Lilian D. A. Wanzare

Samuel Rutunda

Lukman Jibril Aliyu

Esubalew Alemneh

Oumaima Hourrane

Hagos Gebremichael

Elyas Abdi Ismail

Meriem Beloucif

Ebrahim Chekol Jibril

Andiswa Bukula

Rooweither Mabuya

Salomey Osei

Abigail Oppong

Tadesse Belay

Tadesse Kebede Guge

Tesfa Tegegne Asfaw

Chiamaka Ijeoma Chukwuneke

Paul Rottger

Seid Muhie Yimam

Nedjma OUSIDHOUM

Hate speech and abusive language are global phenomena that need socio-cultural background knowledge to be understood, identified, and moderated. However, in many regions of the Global South, there have been several documented occurrences of (1) absence of moderation and (2) censorship due to the reliance on keyword spotting out of context. Further, high-profile individuals have frequently been at the center of the moderation process, while large and targeted hate speech campaigns against minorities have been overlooked. These limitations are mainly due to the lack of high-quality data in the local languages and the failure to include local communities in the collection, annotation, and moderation processes. To address this issue, we present AfriHate: a multilingual collection of hate speech and abusive language datasets in 15 African languages. Each instance in AfriHate is annotated by native speakers familiar with the local culture. We report the challenges related to the construction of the datasets and present various classification baseline results with and without using LLMs. The datasets, individual annotations, and hate speech and offensive language lexicons are available at https://github.com/AfriHate/AfriHate

2025-01-14

ArXiv (preprint)

doi.org

arxiv.org

EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision

Diego Velazquez

Pau Rodriguez

Sergio Alonso

Josep M. Gonfaus

Jordi Gonzàlez

Gerardo Richarte

Javier Marin

Yoshua Bengio

Alexandre Lacoste

This paper presents EarthView, a comprehensive dataset specifically designed for self-supervision on remote sensing data, intended to enhance deep learning applications on Earth monitoring tasks. The dataset spans 15 terapixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1m spatial resolution data from Satellogic. Our dataset provides a wide spectrum of image data with varying resolutions, harnessed from different sensors and organized coherently into an accessible HuggingFace dataset in parquet format. This data spans five years, from 2017 to 2022. Accompanying the dataset, we introduce EarthMAE, a tailored Masked Autoencoder, developed to tackle the distinct challenges of remote sensing data. Trained in a self-supervised fashion, EarthMAE effectively processes different data modalities such as hyperspectral, multispectral, topographical data, segmentation maps, and temporal structure. This model helps us show that pre-training on Satellogic data improves performance on downstream tasks. While there is still a gap to fill in MAE for heterogeneous data, we regard this innovative combination of an expansive, diverse dataset and a versatile model adapted for self-supervised learning as a stride forward in deep learning for Earth monitoring.

2025-01-14

ArXiv (preprint)

arxiv.org

Integrating food webs in species distribution models can improve ecological niche estimation and predictions

Giovanni Poggiato

Jérémy Andréoletti

Laura J. Pollock

Wilfried Thuiller

Biotic interactions play a fundamental role in shaping multitrophic species communities, yet incorporating these interactions into species distribution models (SDMs) remains challenging. With the growing availability of species interaction networks, it is now feasible to integrate these interactions into SDMs for more comprehensive predictions. Here, we propose a novel framework that combines trophic interaction networks with Bayesian structural equation models, enabling each species to be modeled based on its interactions with predators or prey alongside environmental factors. This framework addresses issues of multicollinearity and error propagation, making it possible to predict species distributions in unobserved locations or under future environmental conditions, even when prey or predator distributions are unknown. We tested and validated our framework on realistic simulated communities spanning different theoretical models and ecological setups. Our approach significantly improved the estimation of both potential and realized niches compared to single SDMs, with mean performance gains of 8% and 6%, respectively. These improvements were especially notable for species strongly regulated by biotic factors, thereby enhancing model predictive accuracy. Our framework supports integration with various SDM extensions, such as occupancy and integrated models, offering flexibility and adaptability for future developments. While not a universal solution that consistently outperforms single SDMs, our approach provides a valuable new tool for modeling multitrophic community distributions when biotic interactions are known or assumed.

2025-01-14

Ecography (published)

doi.org

Symmetry-Aware Generative Modeling through Learned Canonicalization

Kusha Sareen

Daniel Levy

Arnab Kumar Mondal

Sékou-Oumar Kaba

Tara Akhound-Sadegh

Siamak Ravanbakhsh

Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The existing generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.

2025-01-14

ArXiv (preprint)

arxiv.org

The oneirogen hypothesis: modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms

Classical psychedelics induce complex visual hallucinations in humans, generating percepts that are coherent at a low level, but which have surreal, dream-like qualities at a high level. While there are many hypotheses as to how classical psychedelics could induce these effects, there are no concrete mechanistic models that capture the variety of observed effects in humans, while remaining consistent with the known pharmacological effects of classical psychedelics on neural circuits. In this work, we propose the “oneirogen hypothesis”, which posits that the perceptual effects of classical psychedelics are a result of their pharmacological actions inducing neural activity states that truly are more similar to dream-like states. We simulate classical psychedelics’ effects via manipulating neural network models trained on perceptual tasks with the Wake-Sleep algorithm. This established machine learning algorithm leverages two activity phases, a perceptual phase (wake) where sensory inputs are encoded, and a generative phase (dream) where the network internally generates activity consistent with stimulus-evoked responses. We simulate the action of psychedelics by partially shifting the model to the ‘Sleep’ state, which entails a greater influence of top-down connections, in line with the impact of psychedelics on apical dendrites. The effects resulting from this manipulation capture a number of experimentally observed phenomena including the emergence of hallucinations, increases in stimulus-conditioned variability, and large increases in synaptic plasticity. We further provide a number of testable predictions which could be used to validate or invalidate our oneirogen hypothesis.

2025-01-13

bioRxiv (preprint)

doi.org

AFRIDOC-MT: Document-level MT Corpus for African Languages

Jesujoba Oluwadara Alabi

Israel Abebe Azime

Miaoran Zhang

Cristina España-Bonet

Rachel Bawden

Dawei Zhu

David Ifeoluwa Adelani

Clement Odoje

Idris Akinade

Iffat Maab

Davis David

Shamsuddeen Hassan Muhammad

Neo Putini

David O. Ademuyiwa

Andrew Caines

Dietrich Klakow

This paper introduces AFRIDOC-MT, a document-level multi-parallel translation dataset covering English and five African languages: Amharic, Hausa, Swahili, Yorùbá, and Zulu. The dataset comprises 334 health and 271 information technology news documents, all human-translated from English to these languages. We conduct document-level translation benchmark experiments by evaluating neural machine translation (NMT) models and large language models (LLMs) for translations between English and these languages, at both the sentence and pseudo-document levels. These outputs are realigned to form complete documents for evaluation. Our results indicate that NLLB-200 achieved the best average performance among the standard NMT models, while GPT-4o outperformed general-purpose LLMs. Fine-tuning selected models led to substantial performance gains, but models trained on sentences struggled to generalize effectively to longer documents. Furthermore, our analysis reveals that some LLMs exhibit issues such as under-generation, repetition of words or phrases, and off-target translations, especially for African languages.

2025-01-10

ArXiv (preprint)

doi.org

arxiv.org
