Publications

Empirical Analysis of Model Selection for Heterogenous Causal Effect Estimation

Brady Neal

Vasilis Syrgkanis

We study the problem of model selection in causal inference, specifically for the case of conditional average treatment effect (CATE) estima… (voir plus)tion under binary treatments. Unlike model selection in machine learning, there is no perfect analogue of cross-validation as we do not observe the counterfactual potential outcome for any data point. Towards this, there have been a variety of proxy metrics proposed in the literature, that depend on auxiliary nuisance models estimated from the observed data (propensity score model, outcome regression model). However, the effectiveness of these metrics has only been studied on synthetic datasets as we can access the counterfactual data for them. We conduct an extensive empirical analysis to judge the performance of these metrics introduced in the literature, and novel ones introduced in this work, where we utilize the latest advances in generative modeling to incorporate multiple realistic datasets. Our analysis suggests novel model selection strategies based on careful hyperparameter tuning of CATE estimators and causal ensembling.

2023-12-31

ICLR (publié)

doi.org

arxiv.org

Enhancing Click-through Rate Prediction in Recommendation Domain with Search Query Representation

Yuening Wang

Man Chen

Yaochen Hu

Wei Guo

Yingxue Zhang

Huifeng Guo

Yang Liu

Mark J. Coates

2023-12-31

CIKM (publié)

doi.org

arxiv.org

Enhancing Security and Energy Efficiency of Cyber-Physical Systems using Deep Reinforcement Learning

Saeid Jamshidi

Ashkan Amirnia

Amin Nikanjam

Foutse Khomh

2023-12-31

Procedia Computer Science (publié)

doi.org

Enhancing Supervised Visualization Through Autoencoder and Random Forest Proximities for Out-of-Sample Extension

Shuang Ni

Adrien Aumon

Guy Wolf

Kevin R. Moon

Jake S. Rhodes

The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Com… (voir plus)mon dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding extension problem. Furthermore, by leveraging proximity-based prototypes, we achieve a 40% reduction in training time without compromising extension quality. Our method does not require label information for out-of-sample points, thus serving as a semi-supervised method, and can achieve consistent quality using only 10% of the training data.

2023-12-31

MLSP (publié)

doi.org

arxiv.org

Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering

Vaibhav Adlakha

Parishad BehnamGhader

Xing Han Lu

Nicholas Meade

Siva Reddy

Retriever-augmented instruction-following models are attractive alternatives to fine-tuned approaches for information-seeking tasks such as … (voir plus)question answering (QA). By simply prepending retrieved documents in its input along with an instruction, these models can be adapted to various information domains and tasks without additional fine-tuning. While the model responses tend to be natural and fluent, the additional verbosity makes traditional QA evaluation metrics such as exact match (EM) and F1 unreliable for accurately quantifying model performance. In this work, we investigate the performance of instruction-following models across three information-seeking QA tasks. We use both automatic and human evaluation to evaluate these models along two dimensions: 1) how well they satisfy the user's information need (correctness), and 2) whether they produce a response based on the provided knowledge (faithfulness). Guided by human evaluation and analysis, we highlight the shortcomings of traditional metrics for both correctness and faithfulness. We then propose simple token-overlap based and model-based metrics that reflect the true performance of these models. Our analysis reveals that instruction-following models are competitive, and sometimes even outperform fine-tuned models for correctness. However, these models struggle to stick to the provided knowledge and often hallucinate in their responses. We hope our work encourages a more holistic evaluation of instruction-following models for QA. Our code and data is available at https://github.com/McGill-NLP/instruct-qa

2023-12-31

Transactions of the Association for Computational Linguistics (publié)

doi.org

arxiv.org

Evaluating Supervision Levels Trade-Offs for Infrared-Based People Counting

David Latortue

Moetez Kdayem

Fidel A. Guerrero Peña

Eric Granger

Marco Pedersoli

Object detection models are commonly used for people counting (and localization) in many applications but require a dataset with costly boun… (voir plus)ding box annotations for training. Given the importance of privacy in people counting, these models rely more and more on infrared images, making the task even harder. In this paper, we explore how weaker levels of supervision affect the performance of deep person counting architectures for image classification and point-level localization. Our experiments indicate that counting people using a convolutional neural network with image-level annotation achieves a level of accuracy that is competitive with YOLO detectors and point-level localization models yet provides a higher frame rate and a simi-lar amount of model parameters. Our code is available at: https://github.com/tortueTortue/IRPeopleCounting.

2023-12-31

2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) (publié)

doi.org

arxiv.org

Evaluating WMT 2024 Metrics Shared Task Submissions on AfriMTE (the African Challenge Set)

Jiayi Wang

David Ifeoluwa Adelani

Pontus Stenetorp

2023-12-31

Conference on Machine Translation (publié)

doi.org

Evaluation algorithmique inclusive de la qualité des espaces publics

Shin Koseki

Toumadher Ammar

Rashid Ahmad Mushkani

Hugo Berard

Sarah Tannir

2023-12-31

SHS Web of Conferences (publié)

doi.org

An Evaluation of Language Models for Hyperpartisan Ideology Detection in Persian Twitter

Sahar Omidi Shayegan

Isar Nejadgholi

Kellin Pelrine

Hao Yu

Sacha Lévy

Zachary Yang

Jean-François Godbout

Reihaneh Rabbany

Large Language Models (LLMs) have shown significant promise in various tasks, including identifying the political beliefs of English-speakin… (voir plus)g social media users from their posts. However, assessing LLMs for this task in non-English languages remains unexplored. In this work, we ask to what extent LLMs can predict the political ideologies of users in Persian social media. To answer this question, we first acknowledge that political parties are not well-defined among Persian users, and therefore, we simplify the task to a much simpler task of hyperpartisan ideology detection. We create a new benchmark and show the potential and limitations of both open-source and commercial LLMs in classifying the hyper-partisan ideologies of users. We compare these models with smaller fine-tuned models, both on the Persian language (ParsBERT) and translated data (RoBERTa), showing that they considerably outperform generative LLMs in this task. We further demonstrate that the performance of the generative LLMs degrades when classifying users based on their tweets instead of their bios and even when tweets are added as additional information, whereas the smaller fine-tuned models are robust and achieve similar performance for all classes. This study is a first step toward political ideology detection in Persian Twitter, with implications for future research to understand the dynamics of ideologies in Persian social media.

2023-12-31

EURALI (publié)

www.semanticscholar.org

An Exact Method for (Constrained) Assortment Optimization Problems with Product Costs

Markus Leitner

Andrea Lodi

Roberto Roberti

Claudio Sole

2023-12-31

INFORMS J. Comput. (publié)

doi.org

arxiv.org