Publications

Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation

Tyler Vergho

Jean-François Godbout

Reihaneh Rabbany

Kellin Pelrine

Recent large language models (LLMs) have been shown to be effective for misinformation detection. However, the choice of LLMs for experiments varies widely, leading to uncertain conclusions. In particular, GPT-4 is known to be strong in this domain, but it is closed source, potentially expensive, and can show instability between different versions. Meanwhile, alternative LLMs have given mixed results. In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5. This provides the research community with a solid open-source option and shows open-source models are gradually catching up on this task. We then highlight how GPT-3.5 exhibits unstable performance, such that this very widely used model could provide misleading results in misinformation detection. Finally, we validate new tools including approaches to structured output and the latest version of GPT-4 (Turbo), showing they do not compromise performance, thus unlocking them for future research and potentially enabling more complex pipelines for misinformation mitigation.

2024-01-12

arXiv (preprint)

doi.org

arxiv.org

Laplacian Change Point Detection for Single and Multi-view Dynamic Graphs

Samy Coulombe

Dynamic graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real-world applications such as intrusion identification in network systems, detection of ecosystem disturbances, and detection of epidemic outbreaks. In this article, we focus on change point detection in dynamic graphs and address three main challenges associated with this problem: (i) how to compare graph snapshots across time, (ii) how to capture temporal dependencies, and (iii) how to combine different views of a temporal graph. To solve the above challenges, we first propose Laplacian Anomaly Detection (LAD), which uses the spectrum of the graph Laplacian as the low-dimensional embedding of the graph structure at each snapshot. LAD explicitly models short-term and long-term dependencies by applying two sliding windows. Next, we propose MultiLAD, a simple and effective generalization of LAD to multi-view graphs. MultiLAD provides the first change point detection method for multi-view dynamic graphs. It aggregates the singular values of the normalized graph Laplacian from different views through the scalar power mean operation. Through extensive synthetic experiments, we show that (i) LAD and MultiLAD are accurate and outperform state-of-the-art baselines and their multi-view extensions by a large margin, (ii) MultiLAD’s advantage over contenders significantly increases when additional views are available, and (iii) MultiLAD is highly robust to noise from individual views. In five real-world dynamic graphs, we demonstrate that LAD and MultiLAD identify significant events as top anomalies, such as the implementation of government COVID-19 interventions, which impacted population mobility in multi-view traffic networks.

2024-01-12

ACM Transactions on Knowledge Discovery from Data (published)

doi.org

arxiv.org
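The aggregation and windowing steps named in the LAD/MultiLAD abstract above can be illustrated with a minimal sketch. It assumes the per-snapshot spectra have already been computed and are passed in as plain lists. The scalar power mean is the standard definition; everything else is an assumption of this sketch rather than the authors' stated method: the function names, the use of the window mean as the reference, the cosine-based score, and taking the maximum over the short and long windows.

```python
import math


def power_mean(values, p):
    """Scalar power mean M_p of non-negative values (p = 0 gives the geometric mean)."""
    n = len(values)
    if p <= 0 and any(v == 0 for v in values):
        return 0.0  # limiting value when an entry is zero and p <= 0
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


def aggregate_views(spectra_per_view, p):
    """Combine equal-length per-view spectra entry by entry with the power mean."""
    return [power_mean(list(column), p) for column in zip(*spectra_per_view)]


def cosine(a, b):
    """Cosine similarity, clamped to at most 1.0 to absorb rounding error."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 0.0
    return min(1.0, sum(x * y for x, y in zip(a, b)) / norm)


def window_score(signatures, t, width):
    """Assumed score: 1 - cosine between signature t and the mean of the previous `width`."""
    past = signatures[t - width:t]
    reference = [sum(column) / width for column in zip(*past)]
    return 1.0 - cosine(signatures[t], reference)


def anomaly_scores(signatures, short_w, long_w):
    """Score each snapshot against a short and a long window; keep the larger score."""
    scores = [0.0] * len(signatures)
    for t in range(long_w, len(signatures)):
        scores[t] = max(window_score(signatures, t, short_w),
                        window_score(signatures, t, long_w))
    return scores
```

For a multi-view graph, one would call `aggregate_views` on the spectra of all views at each time step and pass the resulting sequence to `anomaly_scores`; snapshots with the highest scores are candidate change points.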

Personalized inference for neurostimulation with meta-learning: a case study of vagus nerve stimulation

Ximeng Mao

Yao-Chuan Chang

Stavros Zanos

Guillaume Lajoie

2024-01-12

Journal of Neural Engineering (published)

doi.org

DINOv2: Learning Robust Visual Features without Supervision

Maxime Oquab

Timothée Darcet

Théo Moutakanni

Huy V. Vo

Marc Szafraniec

Vasil Khalidov

Pierre Fernandez

Daniel HAZIZA

Francisco Massa

Alaaeldin El-Nouby

Mahmoud Assran

Nicolas Ballas

Wojciech Galuba

Russell Howes

Po-Yao Huang

Shang-Wen Li

Ishan Misra

Michael Rabbat

Vasu Sharma

Gabriel Synnaeve

Hu Xu

Huijiao Xu

Herve Jegou

Julien Mairal

Patrick Labatut

Armand Joulin

Piotr Bojanowski

The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could greatly simplify the use of images in any system by producing all-purpose visual features, i.e., features that work across image distributions and tasks without finetuning. This work shows that existing pretraining methods, especially self-supervised methods, can produce such features if trained on enough curated data from diverse sources. We revisit existing approaches and combine different techniques to scale our pretraining in terms of data and model size. Most of the technical contributions aim at accelerating and stabilizing the training at scale. In terms of data, we propose an automatic pipeline to build a dedicated, diverse, and curated image dataset instead of uncurated data, as typically done in the self-supervised literature. In terms of models, we train a ViT model with 1B parameters and distill it into a series of smaller models that surpass the best available all-purpose features, OpenCLIP, on most of the benchmarks at image and pixel levels.

2024-01-11

TMLR (accepted)

doi.org

openreview.net

Signatures of Co-evolution and Co-regulation in the CYP3A and CYP4F Genes in Humans

Alex Richard-St-Hilaire

Isabel Gamache

Justin Pelletier

Jean-Christophe Grenier

Raphael Poujol

Julie Hussin

2024-01-11

Genome Biology and Evolution (published)

doi.org

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies

Sébastien Lachapelle

Pau Rodriguez

Yash Sharma

Katie Everett

Rémi LE PRIOL

Alexandre Lacoste

Simon Lacoste-Julien

2024-01-10

arXiv (preprint)

doi.org

arxiv.org

The Past, Present, and Future of the Brain Imaging Data Structure (BIDS)

Russell A. Poldrack

Christopher J. Markiewicz

Stefan Appelhoff

Yoni K. Ashar

Tibor Auer

Sylvain Baillet

Shashank Bansal

Leandro Beltrachini

Christian G. Bénar

Giacomo Bertazzoli

Suyash Bhogawar

Ross W. Blair

Marta Bortoletto

Mathieu Boudreau

Teon L. Brooks

Vince D. Calhoun

Filippo Maria Castelli

Patricia Clement

Alexander L. Cohen

Julien Cohen-Adad

Sasha D’Ambrosio

Gilles de Hollander

María de la Iglesia-Vayá

Alejandro de la Vega

Arnaud Delorme

Orrin Devinsky

Dejan Draschkow

Eugene Paul Duff

Elizabeth DuPre

Eric Earl

Oscar Esteban

Franklin W. Feingold

Guillaume Flandin

Anthony Galassi

Giuseppe Gallitto

Melanie Ganz

Rémi Gau

James Gholam

Sulagna Dia Ghosh

Satrajit S. Ghosh

Alessio Giacomel

Ashley G. Gillman

Padraig Gleeson

Alexandre Gramfort

Samuel Guay

Giacomo Guidali

Yaroslav O. Halchenko

Daniel A. Handwerker

Nell Hardcastle

Peer Herholz

Dora Hermes

Christopher J. Honey

Robert B. Innis

Horea-Ioan Ioanas

Andrew Jahn

Agah Karakuzu

David B. Keator

Gregory Kiar

Balint Kincses

Angela R. Laird

Jonathan C. Lau

Alberto Lazari

Jon Haitz Legarreta

Adam Li

Xiangrui Li

Bradley C. Love

Hanzhang Lu

Eleonora Marcantoni

Camille Maumet

Giacomo Mazzamuto

Steven L. Meisler

Mark Mikkelsen

Henk Mutsaerts

Thomas E. Nichols

Aki Nikolaidis

Gustav Nilsonne

Guiomar Niso

Martin Norgaard

Thomas W. Okell

Robert Oostenveld

Eduard Ort

Patrick J. Park

Mateusz Pawlik

Cyril R. Pernet

Franco Pestilli

Jan Petr

Christophe Phillips

Jean-Baptiste Poline

Luca Pollonini

Pradeep Reddy Raamana

Petra Ritter

Gaia Rizzo

Kay A. Robbins

Alexander P. Rockhill

Christine Rogers

Ariel Rokem

Chris Rorden

Alexandre Routier

Jose Manuel Saborit-Torres

Taylor Salo

Michael Schirner

Robert E. Smith

Tamas Spisak

Julia Sprenger

Nicole C. Swann

Martin Szinte

Sylvain Takerkart

Bertrand Thirion

Adam G. Thomas

Sajjad Torabian

Gael Varoquaux

Bradley Voytek

Julius Welzel

Martin Wilson

Tal Yarkoni

Krzysztof J. Gorgolewski

2024-01-09

arXiv (preprint)

doi.org

arxiv.org

DyG2Vec: Efficient Representation Learning for Dynamic Graphs

Mohammad Alomrani

Mahdi Biparva

Yingxue Zhang

Mark Coates

Temporal graph neural networks have shown promising results in learning inductive representations by automatically extracting temporal patterns. However, previous works often rely on complex memory modules or inefficient random walk methods to construct temporal representations. To address these limitations, we present an efficient yet effective attention-based encoder that leverages temporal edge encodings and window-based subgraph sampling to generate task-agnostic embeddings. Moreover, we propose a joint-embedding architecture using non-contrastive SSL to learn rich temporal embeddings without labels. Experimental results on 7 benchmark datasets indicate that on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% for the transductive setting and 3.30% for the inductive setting while only requiring 5-10x less training/inference time. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies. The code is publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas.

2024-01-08

TMLR (accepted)

openreview.net

CO emission predictions in municipal solid waste incineration based on reduced depth features and long short-term memory optimization

Runyu Zhang

Jian Tang

Heng Xia

Xiaotong Pan

Wen Yu

JunFei Qiao

2024-01-08

Neural Computing & Applications (Print) (published)

doi.org

JaxPruner: A concise library for sparsity research

Joo Hyung Lee

Wonpyo Park

Nicole Elyse Mitchell

Jonathan Pilault

Johan Samir Obando Ceron

Han-Byul Kim

Namhoon Lee

Elias Frantar

Yun Long

Amir Yazdanbakhsh

Shivani Agrawal

Suvinay Subramanian

Xin Wang

Sheng-Chun Kao

Xingyao Zhang

Trevor Gale

Aart J.C. Bik

Woohyun Han

Milen Ferev

Zhonglin Han

Hong-Seok Kim

Yann Dauphin

Gintare Karolina Dziugaite

Pablo Samuel Castro

Utku Evci

This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research. JaxPruner aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training algorithms with minimal memory and latency overhead. Algorithms implemented in JaxPruner use a common API and work seamlessly with the popular optimization library Optax, which, in turn, enables easy integration with existing JAX-based libraries. We demonstrate this ease of integration by providing examples in four different codebases (Scenic, t5x, Dopamine, and FedJAX) and provide baseline experiments on popular benchmarks.

2024-01-08

Conference on Parsimony and Learning (published)

doi.org

openreview.net

GABAergic inhibition shapes behavior and neural dynamics in human visual working memory

Jan Kujala

Carolina Ciumas

Julien Jung

Sandrine Bouvard

Françoise Lecaignard

Amélie Lothe

Romain Bouet

Philippe Ryvlin

Karim Jerbi

Neuronal inhibition, primarily mediated by GABAergic neurotransmission, is crucial for brain development and healthy cognition. Gamma-aminobutyric acid concentration levels in sensory areas have been shown to correlate with hemodynamic and oscillatory neuronal responses. How these measures relate to one another during working memory, a higher-order cognitive process, is still poorly understood. We address this gap by collecting magnetoencephalography, functional magnetic resonance imaging, and Flumazenil positron emission tomography data within the same subject cohort using an n-back working-memory paradigm. By probing the relationship between GABAA receptor distribution, neural oscillations, and Blood Oxygen Level Dependent (BOLD) modulations, we found that GABAA receptor density in higher-order cortical areas predicted the reaction times on the working-memory task and correlated positively with the peak frequency of gamma power modulations and negatively with BOLD amplitude. These findings support and extend theories linking gamma oscillations and hemodynamic responses to gamma-aminobutyric acid neurotransmission and to the excitation-inhibition balance and cognitive performance in humans. Considering the small sample size of the study, future studies should test whether these findings also hold for other, larger cohorts and examine in detail how the GABAergic system and neural fluctuations jointly support working-memory task performance.

2024-01-06

Cerebral Cortex (published)

doi.org
