Publications

NEURAL NETWORK-BASED SOLVERS FOR PDES

M. Cameron

Ian G Goodfellow

(1) $\mathcal{N}(x;\theta) = L_{l+1} \circ \sigma_l \circ L_l \circ \sigma_{l-1} \circ \cdots \circ \sigma_1 \circ L_1$. The symbol $L_k$ denotes the $k$-th affine operator, of the form $L_k(x) = A_k x + b_k$, while $\sigma_k$ denotes a nonlinear function called an activation function. The activation functions are chosen by the user. The matrices $A_k$ and shift vectors (or bias vectors) $b_k$ are encoded into the argument $\theta$: $\theta = \{A_k, b_k\}_{k=1}^{l+1}$. The term "training a neural network" means finding $\{A_k, b_k\}_{k=1}^{l+1}$ such that $\mathcal{N}(x;\theta)$ satisfies certain conditions. These conditions are described by the loss function chosen by the user. For example, one might want the neural network to assume certain values $f_j$ at certain points $x_j$, $j = 1, \ldots, N$. These points $x_j$ are called the training data. In this case, a common choice of the loss function is the least squares error:
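The excerpt above is cut off before the loss formula, so the following is only a minimal sketch of the construction it describes, assuming a mean of squared errors over the training points and a single activation (tanh) shared by all layers. The names `affine`, `network`, `least_squares_loss` and the sample parameters are illustrative and do not come from the paper.

```python
import math

def affine(A, b, x):
    """Apply L(x) = A x + b, with A given as a list of rows and b as a list."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]

def network(theta, x, sigma=math.tanh):
    """Evaluate N(x; theta) = L_{l+1} o sigma o L_l o ... o sigma o L_1.

    theta is a list of (A_k, b_k) pairs; assumption: every hidden layer
    uses the same activation sigma.
    """
    *hidden, last = theta
    for A, b in hidden:
        x = [sigma(v) for v in affine(A, b, x)]
    # The final affine operator L_{l+1} is applied without an activation.
    return affine(*last, x)

def least_squares_loss(theta, xs, fs):
    """Mean of |N(x_j; theta) - f_j|^2 over the training data (assumed form)."""
    total = 0.0
    for x, f in zip(xs, fs):
        out = network(theta, x)
        total += sum((o - t) ** 2 for o, t in zip(out, f))
    return total / len(xs)

# Hypothetical parameters: one hidden layer (l = 1), scalar input and output.
theta = [([[1.0], [-1.0]], [0.0, 0.0]),   # A_1 (2x1), b_1
         ([[0.5, -0.5]], [0.0])]          # A_2 (1x2), b_2
xs = [[0.0], [1.0]]                       # training points x_j
fs = [[0.0], [math.tanh(1.0)]]            # target values f_j
print(least_squares_loss(theta, xs, fs))
```

With these particular parameters the network reduces to tanh(x), so the loss on the sample data is zero; training in general means searching for the (A_k, b_k) that make this quantity small.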

2022-12-31

(published)

www.semanticscholar.org

Noisy Pairing and Partial Supervision for Stylized Opinion Summarization

Reinald Kim

Mirella Lapata. 2020

Un-611

David M. Krueger

Maxinder S. Kan-620

Somnath Basu

Roy Chowdhury

Chao Zhao

Tanya Goyal

Junyi Jiacheng Xu

Jessy Li

Ivor W. Tsang

James T. Kwok

Neil Houlsby

Andrei Giurgiu

Stanisław Jastrzębski

Bruna Morrone

Quentin de Laroussilhe

Mona Gesmundo

Attariyan Sylvain

Gelly

Thomas Wolf

Lysandre Debut

Julien Victor Sanh

Clement Chaumond

Anthony Delangue

Pier-339 Moi

Tim ric Cistac

Rémi Rault

Morgan Louf

Funtow-900 Joe

Sam Davison

Patrick Shleifer

Von Platen

Clara Ma

Yacine Jernite

Julien Plu

Canwen Xu

Opinion summarization research has primarily focused on generating summaries reflecting important opinions from customer reviews without paying much attention to the writing style. In this paper, we propose the stylized opinion summarization task, which aims to generate a summary of customer reviews in the desired (e.g., professional) writing style. To tackle the difficulty in collecting customer and professional review pairs, we develop a non-parallel training framework, Noisy Pairing and Partial Supervision (NAPA), which trains a stylized opinion summarization system from non-parallel customer and professional review sets. We create a benchmark ProSum by collecting customer and professional reviews from Yelp and Michelin. Experimental results on ProSum and FewSum demonstrate that our non-parallel training framework consistently improves both automatic and human evaluations, successfully building a stylized opinion summarization model that can generate professionally-written summaries from customer reviews.

2022-12-31

(published)

www.semanticscholar.org

Normalization Layers Are All That Sharpness-Aware Minimization Needs

Maximilian Müller

Tiffany Vlaar

David Rolnick

Matthias Hein

Sharpness-aware minimization (SAM) was proposed to reduce sharpness of minima and has been shown to enhance generalization performance in various settings. In this work we show that perturbing only the affine normalization parameters (typically comprising 0.1% of the total parameters) in the adversarial step of SAM can outperform perturbing all of the parameters. This finding generalizes to different SAM variants and both ResNet (Batch Normalization) and Vision Transformer (Layer Normalization) architectures. We consider alternative sparse perturbation approaches and find that these do not achieve similar performance enhancement at such extreme sparsity levels, showing that this behaviour is unique to the normalization layers. Although our findings reaffirm the effectiveness of SAM in improving generalization performance, they cast doubt on whether this is solely caused by reduced sharpness.
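To make the idea of a restricted adversarial step concrete, here is a minimal sketch of one SAM-style update in which only a chosen subset of parameters is perturbed. It assumes the usual two-step form of SAM (ascend by rho along the normalized gradient, then descend using the gradient at the perturbed point) and a norm taken over the perturbed subset only. The function `sam_step`, the boolean mask, and the toy quadratic loss are illustrative assumptions, not the authors' implementation.

```python
import math

def sam_step(params, grad_fn, perturb_mask, lr=0.1, rho=0.05):
    """One SAM-style update where only masked parameters are perturbed.

    params: list of floats; grad_fn: maps params to a list of gradients;
    perturb_mask: list of bools, True for entries (e.g. normalization
    scale/shift parameters) that take part in the adversarial ascent step.
    """
    g = grad_fn(params)
    # Ascent direction restricted to the masked coordinates.
    g_masked = [gi if m else 0.0 for gi, m in zip(g, perturb_mask)]
    norm = math.sqrt(sum(gi * gi for gi in g_masked)) or 1.0
    perturbed = [p + rho * gi / norm for p, gi in zip(params, g_masked)]
    # The descent step uses the gradient at the perturbed point
    # and updates ALL parameters, not just the masked ones.
    g_adv = grad_fn(perturbed)
    return [p - lr * gi for p, gi in zip(params, g_adv)]

def quadratic_grad(params):
    """Gradient of the toy loss sum(p_i^2)."""
    return [2.0 * p for p in params]

# Hypothetical setup: the second entry plays the role of a normalization parameter.
new_params = sam_step([1.0, 1.0], quadratic_grad, [False, True])
print(new_params)
```

Setting every mask entry to False removes the perturbation, so the update falls back to plain gradient descent; setting every entry to True gives the standard all-parameter SAM step.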

2022-12-31

Advances in Neural Information Processing Systems 36 (NeurIPS 2023) (published)

doi.org

openreview.net

A Novel Deep Multi-head Attentive Vulnerable Line Detector

Miles Q. Li

Benjamin C. M. Fung

Ashita Diwan

2022-12-31

Procedia Computer Science (published)

doi.org

An Online Newton’s Method for Time-Varying Linear Equality Constraints

Jean-Luc Lupien

Antoine Lesage-Landry

We consider online optimization problems with time-varying linear equality constraints. In this framework, an agent makes sequential decisions using only prior information. At every round, the agent suffers an environment-determined loss and must satisfy time-varying constraints. Both the loss functions and the constraints can be chosen adversarially. We propose the Online Projected Equality-constrained Newton Method (OPEN-M) to tackle this family of problems. We obtain sublinear dynamic regret and constraint violation bounds for OPEN-M under mild conditions. Namely, smoothness of the loss function and boundedness of the inverse Hessian at the optimum are required, but not convexity. Finally, we show OPEN-M outperforms state-of-the-art online constrained optimization algorithms in a numerical network flow application.

2022-12-31

IEEE Control Systems Letters (published)

doi.org

arxiv.org

Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation

Nuno M. Guerreiro

Pierre Colombo

Pablo Piantanida

André Martins

Neural machine translation (NMT) has become the de-facto standard in real-world machine translation applications. However, NMT models can unpredictably produce severely pathological translations, known as hallucinations, that seriously undermine user trust. It becomes thus crucial to implement effective preventive strategies to guarantee their proper functioning. In this paper, we address the problem of hallucination detection in NMT by following a simple intuition: as hallucinations are detached from the source content, they exhibit encoder-decoder attention patterns that are statistically different from those of good quality translations. We frame this problem with an optimal transport formulation and propose a fully unsupervised, plug-in detector that can be used with any attention-based NMT model. Experimental results show that our detector not only outperforms all previous model-based detectors, but is also competitive with detectors that employ external models trained on millions of samples for related tasks such as quality estimation and cross-lingual sentence similarity.

2022-12-31

ACL (1) (published)

doi.org

arxiv.org

Optimising Electric Vehicle Charging Station Placement using Advanced Discrete Choice Models

Steven Lamontagne

Margarida Carvalho

Emma Frejinger

Bernard Gendron

Miguel F. Anjos

Ribal Atallah

Département d'informatique et de recherche opérationnelle

U. Montréal

S. O. Mathematics

U. Edinburgh

Institut de Recherche d'Hydro-Québec

We present a new model for finding the optimal placement of electric vehicle charging stations across a multi-period time frame so as to maximise electric vehicle adoption. Via the use of advanced discrete choice models and user classes, this work allows for a granular modelling of user attributes and their preferences in regard to charging station characteristics. Instead of embedding an analytical probability model in the formulation, we adopt a simulation approach and pre-compute error terms for each option available to users for a given number of scenarios. This results in a bilevel optimisation model that is, however, intractable for all but the simplest instances. Using the pre-computed error terms to calculate the users covered by each charging station allows for a maximum covering model, for which solutions can be found more efficiently than for the bilevel formulation. The maximum covering formulation remains intractable in some instances, so we propose rolling horizon, greedy, and GRASP heuristics to obtain good quality solutions more efficiently. Extensive computational results are provided, which compare the maximum covering formulation with the current state-of-the-art, both for exact solutions and the heuristic methods. Keywords: Electric vehicle charging stations, facility location, integer programming, discrete choice models, maximum covering

2022-12-31

INFORMS J. Comput. (published)

doi.org

arxiv.org

Optimism and Adaptivity in Policy Optimization

Veronica Chelu

Tom Zahavy

Arthur Guez

Doina Precup

Sebastian Flennerhag

2022-12-31

arXiv.org (preprint)

doi.org

Optimizing Fairness over Time with Homogeneous Workers (Short Paper).

Bart-jan Van Rossum

Ying Chen

Andrea Lodi

2022-12-31

ATMOS (published)

doi.org

Party Prediction for Twitter

Sacha Lévy

Gabrielle Desrosiers-Brisebois

Aarash Feizi

Cécile Amadoro

Andre Blais

Jean-François Godbout

Reihaneh Rabbany

A large number of studies on social media compare the behaviour of users from different political parties. As a basic step, they employ a predictive model for inferring their political affiliation. The accuracy of this model can change the conclusions of a downstream analysis significantly, yet the choice between different models seems to be made arbitrarily. In this paper, we provide a comprehensive survey and an empirical comparison of the current party prediction practices and propose several new approaches which are competitive with or outperform state-of-the-art methods, yet require less computational resources. Party prediction models rely on the content generated by the users (e.g., tweet texts), the relations they have (e.g., who they follow), or their activities and interactions (e.g., which tweets they like). We examine all of these and compare their signal strength for the party prediction task. This paper lets the practitioner select from a wide range of data types that all give strong performance. Finally, we conduct extensive experiments on different aspects of these methods, such as data collection speed and transfer capabilities, which can provide further insights for both applied and methodological research.

2022-12-31

arXiv (preprint)

doi.org

arxiv.org

Patient experience or patient satisfaction? A systematic review of child- and family-reported experience measures in pediatric surgery.

Julia Ferreira

Prachikumari Patel

Elena Guadagno

Nikki Ow

Jo Wray

Sherif Emil

Dan Poenaru

2022-12-31

Journal of Pediatric Surgery (published)

doi.org

Performative Prediction with Neural Networks

Mehrnaz Mofakhami

Ioannis Mitliagkas

Gauthier Gidel

Performative prediction is a framework for learning models that influence the data they intend to predict. We focus on finding classifiers that are performatively stable, i.e. optimal for the data distribution they induce. Standard convergence results for finding a performatively stable classifier with the method of repeated risk minimization assume that the data distribution is Lipschitz continuous to the model's parameters. Under this assumption, the loss must be strongly convex and smooth in these parameters; otherwise, the method will diverge for some problems. In this work, we instead assume that the data distribution is Lipschitz continuous with respect to the model's predictions, a more natural assumption for performative systems. As a result, we are able to significantly relax the assumptions on the loss function. In particular, we do not need to assume convexity with respect to the model's parameters. As an illustration, we introduce a resampling procedure that models realistic distribution shifts and show that it satisfies our assumptions. We support our theory by showing that one can learn performatively stable classifiers with neural networks making predictions about real data that shift according to our proposed procedure.
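Repeated risk minimization and performative stability can be illustrated on a one-dimensional toy problem. The sketch below assumes, purely for illustration, that deploying a model theta shifts the data mean to mu + eps * theta and that the loss is the squared error, so each retraining step has a closed-form minimizer. The function name and the parameters `mu` and `eps` are hypothetical and are not taken from the paper.

```python
def repeated_risk_minimization(mu, eps, theta0=0.0, steps=50):
    """Iterate theta_{t+1} = argmin_theta E_{z ~ D(theta_t)} (z - theta)^2.

    Toy assumption: D(theta) has mean mu + eps * theta, so the minimizer of
    the expected squared loss on D(theta_t) is exactly that mean.
    """
    theta = theta0
    for _ in range(steps):
        theta = mu + eps * theta  # retrain on the distribution induced by theta
    return theta

mu, eps = 1.0, 0.5
theta_stable = repeated_risk_minimization(mu, eps)
# A performatively stable point is a fixed point: theta = mu + eps * theta.
print(theta_stable, mu / (1 - eps))
```

In this toy setting the iteration contracts toward the fixed point mu / (1 - eps) when |eps| < 1 and moves away from it when |eps| > 1, which gives a feel for why convergence results need a bound on how strongly the distribution reacts to the model.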

2022-12-31

AISTATS (published)

doi.org

proceedings.mlr.press

