Publications

Variable Star Light Curves in Koopman Space

Mario Pasquato

Gaia Carenini

Nicolas Mekhaël

Vittorio F. Braga

Piero Trevisan

Giuseppe Bono

Yashar Hezaveh

2024-06-16

ICML.cc/2024/Workshop/AI4Science (spotlight)

openreview.net

How Should We Extract Discrete Audio Tokens from Self-Supervised Models?

Pooneh Mousavi

Jarod Duret

Salah Zaiem

Luca Della Libera

Artem Ploujnikov

Cem Subakan

Mirco Ravanelli

2024-06-15

ArXiv (preprint)

doi.org

arxiv.org

Consistency-diversity-realism Pareto fronts of conditional image generative models

Pietro Astolfi

Marlene Careil

Melissa Hall

Oscar Mañas

Matthew Muckley

Jakob Verbeek

Adriana Romero Soriano

Michal Drozdzal

Building world models that accurately and comprehensively represent the real world is the utmost aspiration for conditional image generative models as it would enable their use as world simulators. For these models to be successful world models, they should not only excel at image quality and prompt-image consistency but also ensure high representation diversity. However, current research in generative models mostly focuses on creative applications that are predominantly concerned with human preferences of image quality and aesthetics. We note that generative models have inference time mechanisms - or knobs - that allow the control of generation consistency, quality, and diversity. In this paper, we use state-of-the-art text-to-image and image-and-text-to-image models and their knobs to draw consistency-diversity-realism Pareto fronts that provide a holistic view on consistency-diversity-realism multi-objective. Our experiments suggest that realism and consistency can both be improved simultaneously; however there exists a clear tradeoff between realism/consistency and diversity. By looking at Pareto optimal points, we note that earlier models are better at representation diversity and worse in consistency/realism, and more recent models excel in consistency/realism while decreasing significantly the representation diversity. By computing Pareto fronts on a geodiverse dataset, we find that the first version of latent diffusion models tends to perform better than more recent models in all axes of evaluation, and there exist pronounced consistency-diversity-realism disparities between geographical regions. Overall, our analysis clearly shows that there is no best model and the choice of model should be determined by the downstream application. With this analysis, we invite the research community to consider Pareto fronts as an analytical tool to measure progress towards world models.

2024-06-14

ArXiv (preprint)

doi.org

arxiv.org

A Hybrid CNN-Transformer Approach for Continuous Fine Finger Motion Decoding from sEMG Signals

Zihan Weng

Xiabing Zhang

Yufeng Mou

Chanlin Yi

Fali Li

Pouya Bashivan

Peng Xu

This work presents a novel approach that synergistically integrates convolutional neural networks (CNNs) and Transformer models for decoding continuous fine finger motions from surface electromyography (sEMG) signals. This integration capitalizes on CNNs’ proficiency in extracting rich temporal and spatial features from multichannel sEMG data and the Transformer’s superior capability in recognizing complex patterns and long-range dependencies. A significant advancement in this field is the use of a custom-developed Epidermal Electrode Array Sleeve (EEAS) for capturing high-fidelity sEMG signals, enabling more accurate and reliable signal acquisition than traditional methods. The decoded joint angles could be used in seamless and intuitive human-machine interaction in various applications, such as virtual reality, augmented reality, robotic control, and prosthetic control. Evaluations demonstrate the superior performance of the proposed CNN-Transformer hybrid architecture in decoding continuous fine finger motions, outperforming individual CNN and Transformer models. The synergistic integration of CNNs and Transformers presents a powerful framework for sEMG decoding, offering exciting opportunities for naturalistic and intuitive human-machine interaction applications. Its robustness and efficiency make it an ideal choice for real-world applications, promising to enhance the interface between humans and machines significantly. The implications of this research extend to advancing the understanding of human neuromuscular signals and their application in computing interfaces.

2024-06-14

2024 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA) (published)

doi.org

Phoneme Discretized Saliency Maps for Explainable Detection of AI-Generated Voice

Shubham Gupta

Mirco Ravanelli

Pascal Germain

Cem Subakan

2024-06-14

ArXiv (preprint)

doi.org

arxiv.org

TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs

Julia Gastinger

Shenyang Huang

Mikhail Galkin

Erfan Loghmani

Ali Parviz

Farimah Poursafaei

Jacob Danovitch

Emanuele Rossi

Ioannis Koutis

Heiner Stuckenschmidt

Reihaneh Rabbany

Guillaume Rabusseau

2024-06-14

ArXiv (preprint)

doi.org

arxiv.org

Towards Neural Scaling Laws for Foundation Models on Temporal Graphs

Razieh Shirzadkhani

Tran Gia Bao Ngo

Kiarash Shamsi

Shenyang Huang

Farimah Poursafaei

Poupak Azad

Reihaneh Rabbany

Baris Coskunuzer

Guillaume Rabusseau

Cuneyt Gurcan Akcora

The field of temporal graph learning aims to learn from evolving network data to forecast future interactions. Given a collection of observed temporal graphs, is it possible to predict the evolution of an unseen network from the same domain? To answer this question, we first present the Temporal Graph Scaling (TGS) dataset, a large collection of temporal graphs consisting of eighty-four ERC20 token transaction networks collected from 2017 to 2023. Next, we evaluate the transferability of Temporal Graph Neural Networks (TGNNs) for the temporal graph property prediction task by pre-training on a collection of up to sixty-four token transaction networks and then evaluating the downstream performance on twenty unseen token networks. We find that the neural scaling law observed in NLP and Computer Vision also applies in temporal graph learning, where pre-training on a greater number of networks leads to improved downstream performance. To the best of our knowledge, this is the first empirical demonstration of the transferability of temporal graph learning. On downstream token networks, the largest pre-trained model outperforms single model TGNNs on thirteen unseen test networks. Therefore, we believe that this is a promising first step towards building foundation models for temporal graphs.

2024-06-14

ArXiv (preprint)

doi.org

arxiv.org

AI-Assisted Generation of Difficult Math Questions

Vedant Shah

Dingli Yu

Kaifeng Lyu

Simon Park

Nan Rosemary Ke

Michael Curtis Mozer

James Lloyd McClelland

Yoshua Bengio

Sanjeev Arora

Anirudh Goyal

Current LLM training positions mathematical reasoning as a core capability. With publicly available sources fully tapped, there is unmet demand for diverse and challenging math questions. Relying solely on human experts is both time-consuming and costly, while LLM-generated questions often lack the requisite diversity and difficulty. We present a design framework that combines the strengths of LLMs with a human-in-the-loop approach to generate a diverse array of challenging math questions. We leverage LLM metacognition skills [Didolkar et al., 2024] of a strong LLM to extract core "skills" from existing math datasets. These skills serve as the basis for generating novel and difficult questions by prompting the LLM with random pairs of core skills. The use of two different skills within each question makes finding such questions an "out of distribution" task for both LLMs and humans. Our pipeline employs LLMs to iteratively generate and refine questions and solutions through multiturn prompting. Human annotators then verify and further refine the questions, with their efficiency enhanced via further LLM interactions. Applying this pipeline on skills extracted from the MATH dataset [Hendrycks et al., 2021] resulted in MATH

2024-06-13

ICML.cc/2024/Workshop/AI4MATH (poster)

doi.org

openreview.net

Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition

Eleni Triantafillou

Peter Kairouz

Fabian Pedregosa

Jamie Hayes

Meghdad Kurmanji

Kairan Zhao

Vincent Dumoulin

Julio C. S. Jacques Junior

Ioannis Mitliagkas

Jun Wan

Lisheng Sun-Hosoya

Sergio Escalera

Gintare Karolina Dziugaite

Peter Triantafillou

Isabelle Guyon

We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In this paper, we analyze top solutions and delve into discussions on benchmarking unlearning, which itself is a research problem. The evaluation methodology we developed for the competition measures forgetting quality according to a formal notion of unlearning, while incorporating model utility for a holistic evaluation. We analyze the effectiveness of different instantiations of this evaluation framework vis-a-vis the associated compute cost, and discuss implications for standardizing evaluation. We find that the ranking of leading methods remains stable under several variations of this framework, pointing to avenues for reducing the cost of evaluation. Overall, our findings indicate progress in unlearning, with top-performing competition entries surpassing existing algorithms under our evaluation framework. We analyze trade-offs made by different algorithms and strengths or weaknesses in terms of generalizability to new datasets, paving the way for advancing both benchmarking and algorithm development in this important area.

2024-06-13

ArXiv (preprint)

doi.org

arxiv.org

ECBD: Evidence-Centered Benchmark Design for NLP

Yu Lu Liu

Su Lin Blodgett

Jackie Chi Kit Cheung

Q. V. Liao

Alexandra Olteanu

Ziang Xiao

Benchmarking is seen as critical to assessing progress in NLP. However, creating a benchmark involves many design decisions (e.g., which datasets to include, which metrics to use) that often rely on tacit, untested assumptions about what the benchmark is intended to measure or is actually measuring. There is currently no principled way of analyzing these decisions and how they impact the validity of the benchmark's measurements. To address this gap, we draw on evidence-centered design in educational assessments and propose Evidence-Centered Benchmark Design (ECBD), a framework which formalizes the benchmark design process into five modules. ECBD specifies the role each module plays in helping practitioners collect evidence about capabilities of interest. Specifically, each module requires benchmark designers to describe, justify, and support benchmark design choices -- e.g., clearly specifying the capabilities the benchmark aims to measure or how evidence about those capabilities is collected from model responses. To demonstrate the use of ECBD, we conduct case studies with three benchmarks: BoolQ, SuperGLUE, and HELM. Our analysis reveals common trends in benchmark design and documentation that could threaten the validity of benchmarks' measurements.

2024-06-13

ArXiv (preprint)

doi.org

arxiv.org

Exploring validation metrics for offline model-based optimisation with diffusion models

Christopher Beckham

Alexandre Piché

David Vazquez

Chris Pal

2024-06-13

TMLR (accepted)

openreview.net