Publications

Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation

Nuno Miguel Guerreiro

Pierre Colombo

André Martins

Neural machine translation (NMT) has become the de-facto standard in real-world machine translation applications. However, NMT models can unpredictably produce severely pathological translations, known as hallucinations, that seriously undermine user trust. It becomes thus crucial to implement effective preventive strategies to guarantee their proper functioning. In this paper, we address the problem of hallucination detection in NMT by following a simple intuition: as hallucinations are detached from the source content, they exhibit encoder-decoder attention patterns that are statistically different from those of good quality translations. We frame this problem with an optimal transport formulation and propose a fully unsupervised, plug-in detector that can be used with any attention-based NMT model. Experimental results show that our detector not only outperforms all previous model-based detectors, but is also competitive with detectors that employ external models trained on millions of samples for related tasks such as quality estimation and cross-lingual sentence similarity.

2022-12-19

ArXiv (preprint)

doi.org

arxiv.org

The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources

Akshatha Arodi

Martin Pömsl

Kaheer Suleman

Adam Trischler

Alexandra Olteanu

Jackie Cheung

Many state-of-the-art natural language understanding (NLU) models are based on pretrained neural language models. These models often make inferences using information from multiple sources. An important class of such inferences are those that require both background knowledge, presumably contained in a model’s pretrained parameters, and instance-specific information that is supplied at inference time. However, the integration and reasoning abilities of NLU models in the presence of multiple knowledge sources have been largely understudied. In this work, we propose a test suite of coreference resolution subtasks that require reasoning over multiple facts. These subtasks differ in terms of which knowledge sources contain the relevant facts. We also introduce subtasks where knowledge is present only at inference time using fictional knowledge. We evaluate state-of-the-art coreference resolution models on our dataset. Our results indicate that several models struggle to reason on-the-fly over knowledge observed both at pretrain time and at inference time. However, with task-specific training, a subset of models demonstrates the ability to integrate certain knowledge types from multiple sources. Still, even the best performing models seem to have difficulties with reliably integrating knowledge presented only at inference time.

2022-12-15

ArXiv (preprint)

doi.org

arxiv.org

Dynamic Consolidation for Continual Learning

Hang Li

Chen Ma

Xi Chen

Xue (Steve) Liu

Training deep learning models from a stream of nonstationary data is a critical problem to be solved to achieve general artificial intelligence. As a promising solution, the continual learning (CL) technique aims to build intelligent systems that have the plasticity to learn from new information without forgetting the previously obtained knowledge. Unfortunately, existing CL methods face two nontrivial limitations. First, when updating a model with new data, existing CL methods usually constrain the model parameters within the vicinity of the parameters optimized for old data, limiting the exploration ability of the model; second, the important strength of each parameter (used to consolidate the previously learned knowledge) is fixed and thus is suboptimal for the dynamic parameter updates. To address these limitations, we first relax the vicinity constraints with a global definition of the important strength, which allows us to explore the full parameter space. Specifically, we define the important strength as the sensitivity of the global loss function to the model parameters. Moreover, we propose adjusting the important strength adaptively to align it with the dynamic parameter updates. Through extensive experiments on popular data sets, we demonstrate that our proposed method outperforms the strong baselines by up to 24% in terms of average accuracy.

2022-12-14

Neural Computation (published)

doi.org

A Tweedie Compound Poisson Model in Reproducing Kernel Hilbert Space

Yi Lian

Archer Yang

Boxiang Wang

Peng Shi

Robert William Platt

Tweedie models can be used to analyze nonnegative continuous data with a probability mass at zero. There have been wide applications in natural science, healthcare research, actuarial science, and other fields. The performance of existing Tweedie models can be limited on today’s complex data problems with challenging characteristics such as nonlinear effects, high-order interactions, high-dimensionality and sparsity. In this article, we propose a kernel Tweedie model, Ktweedie, and its sparse variant, SKtweedie, that can simultaneously address the above challenges. Specifically, nonlinear effects and high-order interactions can be flexibly represented through a wide range of kernel functions, which is fully learned from the data. In addition, while the Ktweedie can handle high-dimensional data, the SKtweedie with integrated variable selection can further improve the interpretability. We perform extensive simulation studies to justify the prediction and variable selection accuracy of our method, and demonstrate the applications in ratemaking and loss-reserving in general insurance. Overall, the Ktweedie and SKtweedie outperform existing Tweedie models when there exist nonlinear effects and high-order interactions, particularly when the dimensionality is high relative to the sample size. The model is implemented in an efficient and user-friendly R package ktweedie (https://cran.r-project.org/package=ktweedie).

2022-12-13

Technometrics (published)

doi.org

Detection and genomic analysis of BRAF fusions in Juvenile Pilocytic Astrocytoma through the combination and integration of multi-omic data

Melissa Zwaig

Audrey Baguette

Bo Hu

Michael Johnston

Hussein Lakkis

Emily M. Nakada

Damien Faury

Nikoleta Juretic

Benjamin Ellezam

Alexandre G. Weil

Jason Karamchandani

Jacek Majewski

Mathieu Blanchette

Michael D. Taylor

Marco Gallo

Claudia Kleinman

Nada Jabado

Jiannis Ragoussis

2022-12-12

BMC Cancer (published)

doi.org

Neural Bandits for Data Mining: Searching for Dangerous Polypharmacy

Alexandre Larouche

Audrey Durand

Richard Khoury

Caroline Sirois

2022-12-10

ArXiv (preprint)

doi.org

arxiv.org

Energy efficiency as a normative account for predictive coding

Shahab Bakhtiari

2022-12-09

Patterns (published)

doi.org

Implicit Offline Reinforcement Learning via Supervised Learning

Alexandre Piché

Rafael Pardinas

David Vazquez

Igor Mordatch

Chris Pal

Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset of varied behaviors. It is as simple as supervised learning and Behavior Cloning (BC) but takes advantage of the return information. On BC tasks, implicit models have been shown to match or outperform explicit ones. Despite the benefits of using implicit models to learn robotic skills via BC, Offline RL via Supervised Learning algorithms have been limited to explicit models. We show how implicit models leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets. Furthermore, we show how closely related our implicit methods are to other popular RL via Supervised Learning algorithms.

2022-12-09

NeurIPS.cc/2022/Workshop/DeepRL (unknown)

doi.org

openreview.net

Informing the development of an outcome set and banks of items to measure mobility among individuals with acquired brain injury using natural language processing

Rehab Alhasani

Mathieu Godbout

Audrey Durand

Claudine Auger

Anouk Lamontagne

Sara Ahmed

2022-12-09

BMC Neurology (published)

doi.org

PyNM: a Lightweight Python implementation of Normative Modeling

Annabelle Harvey

Guillaume Dumas

The majority of studies in neuroimaging and psychiatry are focussed on case-control analysis (Marquand et al., 2019). However, case-control relies on well-defined groups which is more the exception than the rule in biology. Psychiatric conditions are diagnosed based on symptoms alone, which makes for heterogeneity at the biological level (Marquand et al., 2016). Relying on mean differences obscures this heterogeneity and the resulting loss of information can produce unreliable results or misleading conclusions (Loth et al., 2021).

2022-12-08

Journal of Open Source Software (published)

doi.org

Consistency and Rate of Convergence of Switched Least Squares System Identification for Autonomous Markov Jump Linear Systems

Borna Sayedana

Mohammad Afshari

Peter E. Caines

Aditya Mahajan

In this paper, we investigate the problem of system identification for autonomous Markov jump linear systems (MJS) with complete state observations. We propose a switched least squares method for identification of MJS, show that this method is strongly consistent, and derive data-dependent and data-independent rates of convergence. In particular, our data-dependent rate of convergence shows that, almost surely, the system identification error is

2022-12-06

IEEE Conference on Decision and Control (published)

doi.org

A modified Thompson sampling-based learning algorithm for unknown linear systems

Mukul Gagrani

Sagar Sudhakara

Aditya Mahajan

Rahul Jain

Ashutosh Nayyar

Yi Ouyang

We revisit the Thompson sampling-based learning algorithm for controlling an unknown linear system with quadratic cost proposed in [1]. This algorithm operates in episodes of dynamic length and it is shown to have a regret bound of

2022-12-06

2022 IEEE 61st Conference on Decision and Control (CDC) (published)

doi.org

arxiv.org
