Publications

Fairness in Kidney Exchange Programs through Optimal Solutions Enumeration

Behrouz Babaki

Not all patients who need kidney transplant can find a donor with compatible characteristics. Kidney exchange programs (KEPs) seek to match … (see more)such incompatible patient-donor pairs together, usually with the objective of maximizing the total number of transplants. We propose a randomized policy for selecting an optimal solution in which patients’ equity of opportunity to receive a transplant is promoted. Our approach gives rise to the problem of enumerating all optimal solutions, which we tackle using a hybrid of constraint programming and linear programming. We empirically demonstrate the advantages of our proposed method over the common practice of using the first optimal solution obtained by a solver.

2020-01-01

(published)

www.semanticscholar.org

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

Si Yi Meng

Sharan Vaswani

Issam Hadj Laradji

Mark Schmidt

Simon Lacoste-Julien

We consider stochastic second-order methods for minimizing smooth and strongly-convex functions under an interpolation condition satisfied b… (see more)y over-parameterized models. Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size. By growing the batch size for both the subsampled gradient and Hessian, we show that R-SSN can converge at a quadratic rate in a local neighbourhood of the solution. We also show that R-SSN attains local linear convergence for the family of self-concordant functions. Furthermore, we analyze stochastic BFGS algorithms in the interpolation setting and prove their global linear convergence. We empirically evaluate stochastic L-BFGS and a "Hessian-free" implementation of R-SSN for binary classification on synthetic, linearly-separable datasets and real datasets under a kernel mapping. Our experimental results demonstrate the fast convergence of these methods, both in terms of the number of iterations and wall-clock time.

2020-01-01

AISTATS (published)

proceedings.mlr.press

arxiv.org

GAIT: A Geometric Approach to Information Theory

Jose Gallego-Posada

Ankit Vani

Max Schwarzer

Simon Lacoste-Julien

We advocate the use of a notion of entropy that reflects the relative abundances of the symbols in an alphabet, as well as the similarities … (see more)between them. This concept was originally introduced in theoretical ecology to study the diversity of ecosystems. Based on this notion of entropy, we introduce geometry-aware counterparts for several concepts and theorems in information theory. Notably, our proposed divergence exhibits performance on par with state-of-the-art methods based on the Wasserstein distance, but enjoys a closed-form expression that can be computed efficiently. We demonstrate the versatility of our method via experiments on a broad range of domains: training generative models, computing image barycenters, approximating empirical measures and counting modes.

2020-01-01

AISTATS (published)

proceedings.mlr.press

arxiv.org

How to make your optimizer generalize better

Sharan Vaswani

Reza Babenzhad

Sait AI Lab

Montreal.

Jose Gallego

Aaron Mishkin

Simon Lacoste-Julien

Nicolas Le Roux

We study the implicit regularization of optimization methods for linear models interpolating the training data in the under-parametrized and… (see more) over-parametrized regimes. For over-parameterized linear regression, where there are inﬁnitely many interpolating solutions, different optimization methods can converge to solutions with varying generalization performance. In this setting, we show that projections onto linear spans can be used to move between solutions. Furthermore, via a simple reparameterization, we can ensure that an arbitrary optimizer converges to the minimum (cid:96) 2 -norm solution with favourable generalization properties. For under-parameterized linear clas-siﬁcation, optimizers can converge to different decision boundaries separating the data. We prove that for any such classiﬁer, there exists a family of quadratic norms (cid:107)·(cid:107) P such that the classiﬁer’s direction is the same as that of the maximum P -margin solution. We argue that analyzing convergence to the standard maximum (cid:96) 2 -margin is arbitrary and show that minimizing the norm induced by the data can result in better generalization. We validate our theoretical results via experiments on synthetic and real datasets.

2020-01-01

(published)

www.semanticscholar.org

Intelligent Tools for Precision Public Health.

Anya Okhmatovskaia

David Buckeridge

2020-01-01

MIE (published)

doi.org

Investigating the Barriers to Physician Adoption of an Artificial Intelligence- Based Decision Support System in Emergency Care: An Interpretative Qualitative Study.

Cécile Petitgand

Aude Motulsky

Jean-Louis Denis

Catherine Régis

2020-01-01

MIE (published)

doi.org

Investigating the Influence of Selected Linguistic Features on Authorship Attribution using German News Articles

Manuel Sage

Pietro Cruciata

Raed Abdo

Jackie Cheung

Yaoyao Fiona Zhao

In this work, we perform authorship attri-bution on a new dataset of German news articles. We seek to classify over 3,700 articles to their … (see more)ﬁve corresponding authors, using four conventional machine learning approaches (na¨ıve Bayes, logistic regression, SVM and kNN) and a convolutional neural network. We analyze the effect of character and word n-grams on the prediction accuracy, as well as the inﬂuence of stop words, punctuation, numbers, and lowercasing when preprocessing raw text. The experiments show that higher order character n-grams (n = 5,6) perform better than lower orders and word n-grams slightly outperform those with characters. Combining both in fusion models further improves results up to 92% for SVM. A multilayer convolutional structure allows the CNN to achieve 90.5% accuracy. We found stop words and punctuation to be important features for author identiﬁcation; removing them leads to a measurable decrease in performance. Finally, we evaluate the topic dependency of the algorithms by gradually replacing named entities, nouns, verbs and eventually all to-kens in the dataset according to their POS-tags.

2020-01-01

SwissText/KONVENS (published)

dblp.uni-trier.de

Investigating the interconnections between human, technology and context in the implementation of a AI-based health information technology: a dynamic technological frame perspective

Catherine Régis

2020-01-01

(published)

www.semanticscholar.org

Language GANs Falling Short

Massimo Caccia

Lucas Caccia

William Fedus

Hugo Larochelle

Joelle Pineau

Laurent Charlin

Generating high-quality text with sufficient diversity is essential for a wide range of Natural Language Generation (NLG) tasks. Maximum-Lik… (see more)elihood (MLE) models trained with teacher forcing have consistently been reported as weak baselines, where poor performance is attributed to exposure bias (Bengio et al., 2015; Ranzato et al., 2015); at inference time, the model is fed its own prediction instead of a ground-truth token, which can lead to accumulating errors and poor samples. This line of reasoning has led to an outbreak of adversarial based approaches for NLG, on the account that GANs do not suffer from exposure bias. In this work, we make several surprising observations which contradict common beliefs. First, we revisit the canonical evaluation framework for NLG, and point out fundamental flaws with quality-only evaluation: we show that one can outperform such metrics using a simple, well-known temperature parameter to artificially reduce the entropy of the model's conditional distributions. Second, we leverage the control over the quality / diversity trade-off given by this parameter to evaluate models over the whole quality-diversity spectrum and find MLE models constantly outperform the proposed GAN variants over the whole quality-diversity space. Our results have several implications: 1) The impact of exposure bias on sample quality is less severe than previously thought, 2) temperature tuning provides a better quality / diversity trade-off than adversarial training while being easier to train, easier to cross-validate, and less computationally expensive. Code to reproduce the experiments is available at github.com/pclucas14/GansFallingShort

2020-01-01

ICLR (published)

openreview.net

Learning Graph Structure With A Finite-State Automaton Layer

Daniel D. Johnson

Hugo Larochelle

Danny Tarlow

arxiv.org

Measuring Systematic Generalization in Neural Proof Generation with Transformers

Nicolas Gontier

Koustuv Sinha

Siva Reddy

Chris Pal

We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded… (see more) in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when evaluated on longer-than-trained sequences. However, we observe TLMs improve their generalization performance after being exposed to longer, exhaustive proofs. In addition, we discover that TLMs are able to generalize better using backward-chaining proofs compared to their forward-chaining counterparts, while they find it easier to generate forward chaining proofs. We observe that models that are not trained to generate proofs are better at generalizing to problems based on longer proofs. This suggests that Transformers have efficient internal reasoning strategies that are harder to interpret. These results highlight the systematic generalization behavior of TLMs in the context of logical reasoning, and we believe this work motivates deeper inspection of their underlying reasoning strategies.

arxiv.org

Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL

Baihan Lin

Guillermo Cecchi

Djallel Bouneffouf

Jenna Reinen

Irina Rish

2020-01-01

HBAI@IJCAI (published)

doi.org

AI Research Driven by Real-World Problems

AI Policy Compass

Student Life and Resources

Publications

AI Research Driven by Real-World Problems

AI Policy Compass

Student Life and Resources

Popular keywords:

Publications