Publications

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

Gaurav Sahu

Abhay Puri

Juan A. Rodriguez

Alexandre Drouin

Perouz Taslakian

Valentina Zantedeschi

Alexandre Lacoste

David Vazquez

Nicolas Chapados

Chris Pal

Sai Rajeswar

Issam Hadj Laradji

Data analytics is essential for extracting valuable insights from data that can assist organizations in making effective decisions. We introduce InsightBench, a benchmark dataset with three key features. First, it consists of 100 datasets representing diverse business use cases such as finance and incident management, each accompanied by a carefully curated set of insights planted in the datasets. Second, unlike existing benchmarks focusing on answering single queries, InsightBench evaluates agents based on their ability to perform end-to-end data analytics, including formulating questions, interpreting answers, and generating a summary of insights and actionable steps. Third, we conducted comprehensive quality assurance to ensure that each dataset in the benchmark had clear goals and included relevant and meaningful questions and analysis. Furthermore, we implement a two-way evaluation mechanism using LLaMA-3 as an effective, open-source evaluator to assess agents' ability to extract insights. We also propose AgentPoirot, our baseline data analysis agent capable of performing end-to-end data analytics. Our evaluation on InsightBench shows that AgentPoirot outperforms existing approaches (such as Pandas Agent) that focus on resolving single queries. We also compare the performance of open- and closed-source LLMs and various evaluation strategies. Overall, this benchmark serves as a testbed to motivate further development in comprehensive automated data analytics.

2024-07-08

ArXiv (preprint)

doi.org

arxiv.org

Interacting Diffusion Processes for Event Sequence Forecasting

Mai Zeng

Florence Regol

Mark Coates

Neural Temporal Point Processes (TPPs) have emerged as the primary framework for predicting sequences of events that occur at irregular time intervals, but their sequential nature can hamper performance for long-horizon forecasts. To address this, we introduce a novel approach that incorporates a diffusion generative model. The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences. In contrast to previous approaches, our model directly learns the joint probability distribution of types and inter-arrival times for multiple events. The model is composed of two diffusion processes, one for the time intervals and one for the event types. These processes interact through their respective denoising functions, which can take as input intermediate representations from both processes, allowing the model to learn complex interactions. We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of TPPs.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning

Jinsoo Yoo

Yunpeng Liu

Frank N. Wood

Geoff Pleiss

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

Leveraging Transformers for Weakly Supervised Object Localization in Unconstrained Videos

Shakeeb Murtaza

Marco Pedersoli

Aydin Sarraf

Eric Granger

Weakly-Supervised Video Object Localization (WSVOL) involves localizing an object in videos using only video-level labels, also referred to as tags. State-of-the-art WSVOL methods like Temporal CAM (TCAM) rely on class activation mapping (CAM) and typically require a pre-trained CNN classifier. However, their localization accuracy is affected by their tendency to minimize the mutual information between different instances of a class and exploit temporal information during training for downstream tasks, e.g., detection and tracking. In the absence of bounding box annotation, it is challenging to exploit precise information about objects from temporal cues because the model struggles to locate objects over time. To address these issues, a novel method called transformer-based CAM for videos (TrCAM-V) is proposed for WSVOL. It consists of a DeiT backbone with two heads for classification and localization. The classification head is trained using standard classification loss (CL), while the localization head is trained using pseudo-labels that are extracted using a pre-trained CLIP model. From these pseudo-labels, the high and low activation values are considered to be foreground and background regions, respectively. Our TrCAM-V method allows training a localization network by sampling pseudo-pixels on the fly from these regions. Additionally, a conditional random field (CRF) loss is employed to align the object boundaries with the foreground map. During inference, the model can process individual frames for real-time localization applications. Extensive experiments on challenging YouTube-Objects unconstrained video datasets show that our TrCAM-V method achieves new state-of-the-art performance in terms of classification and localization accuracy.

2024-07-08

ArXiv (preprint)

doi.org

arxiv.org

Listenable Maps for Audio Classifiers

Francesco Paissan

Mirco Ravanelli

Cem Subakan

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

Lookbehind-SAM: k steps back, 1 step forward

Goncalo Mordido

Pranshu Malviya

Aristide Baratin

Sarath Chandar

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

Memory Efficient Neural Processes via Constant Memory Attention Block

Leo Feng

Frederick Tung

Hossein Hajimirsadeghi

Yoshua Bengio

Mohamed Osama Ahmed

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

Modeling Caption Diversity in Contrastive Vision-Language Pretraining

Samuel Lavoie

Polina Kirichenko

Mark Ibrahim

Mahmoud Assran

Andrew Gordon Wilson

Aaron Courville

Nicolas Ballas

There are a thousand ways to caption an image. Contrastive Language Pretraining (CLIP), on the other hand, works by mapping an image and its caption to a single vector, limiting how well CLIP-like models can represent the diverse ways to describe an image. In this work, we introduce Llip, Latent Language Image Pretraining, which models the diversity of captions that could match an image. Llip's vision encoder outputs a set of visual features that are mixed into a final representation by conditioning on information derived from the text. We show that Llip outperforms non-contextualized baselines like CLIP and SigLIP on a variety of tasks even with large-scale encoders. Llip improves zero-shot classification by an average of 2.9% across zero-shot classification benchmarks with a ViT-G/14 encoder. Specifically, Llip attains a zero-shot top-1 accuracy of 83.5% on ImageNet, outperforming a similarly sized CLIP by 1.4%. We also demonstrate improvement on zero-shot retrieval on MS-COCO by 6.0%. We provide a comprehensive analysis of the components introduced by the method and demonstrate that Llip leads to richer visual representations.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

Nearest Neighbour Score Estimators for Diffusion Generative Models

Matthew Niedoba

Dylan Green

Saeid Naderiparizi

Vasileios Lioutas

Jonathan Wilder Lavington

Xiaoxuan Liang

Yunpeng Liu

Ke Zhang

Setareh Dabiri

Adam Ścibior

Berend Zwartsenberg

Frank N. Wood

Score function estimation is the cornerstone of both training and sampling from diffusion generative models. Despite this fact, the most commonly used estimators are either biased neural network approximations or high variance Monte Carlo estimators based on the conditional score. We introduce a novel nearest neighbour score function estimator which utilizes multiple samples from the training set to dramatically decrease estimator variance. We leverage our low variance estimator in two compelling applications. Training consistency models with our estimator, we report a significant increase in both convergence speed and sample quality. In diffusion models, we show that our estimator can replace a learned network for probability-flow ODE integration, opening promising new avenues of future research. Code will be released upon paper acceptance.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

A Persuasive Approach to Combating Misinformation

Safwan Hossain

Andjela Mladenovic

Yiling Chen

Gauthier Gidel

Bayesian Persuasion is proposed as a tool for social media platforms to combat the spread of misinformation. Since platforms can use machine learning to predict the popularity and misinformation features of to-be-shared posts, and users are largely motivated to share popular content, platforms can strategically signal this informational advantage to change user beliefs and persuade them not to share misinformation. We characterize the optimal signaling scheme with imperfect predictions as a linear program and give sufficient and necessary conditions on the classifier to ensure optimal platform utility is non-decreasing and continuous. Next, this interaction is considered under a performative model, wherein platform intervention affects the user's future behaviour. The convergence and stability of optimal signaling under this performative process are fully characterized. Lastly, we experimentally validate that our approach significantly reduces misinformation in both the single round and performative setting.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities

Golnoosh Farnadi

Mohammad Havaei

Negar Rostamzadeh

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net