Mark Coates

Bidirectional Learning for Offline Model-based Optimization

PhD - McGill

PhD - McGill

regol Regol

PhD

PhD - McGill

Co-supervisor:

PhD - McGill

PhD

Blog posts

Bidirectional Learning for Offline Model-based Optimization

September 20, 2023

by

Can Chen

Yingxue Zhang

Xue Liu

Mark Coates

Read the article

Publications

Hint Marginalization for Improved Reasoning in Large Language Models

Soumyasundar Pal

Didier Chételat

Yingxue Zhang

Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to generate a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strategies for combination, such as self-consistency and progressive-hint-prompting, make inefficient use of the LLM responses. We present Hint Marginalization, a novel and principled algorithmic framework to enhance the reasoning capabilities of LLMs. Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers, with the goal of identifying the mode, that is, the most likely answer. Empirical evaluation on several benchmark datasets for arithmetic reasoning demonstrates the superiority of the proposed approach.

2024-12-17

arXiv (preprint)
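
The abstract above describes forming a Monte Carlo approximation of a distribution over answers and then picking its mode. The following is a minimal sketch of that general idea only, not the Hint Marginalization algorithm itself: it draws independent samples and takes the most frequent answer, which is closer to plain self-consistency voting. The names `estimate_answer_distribution`, `mode_of`, and `toy_sampler` are hypothetical, and `toy_sampler` is a stand-in for an LLM call.

```python
import random
from collections import Counter


def estimate_answer_distribution(sample_answer, num_samples, seed=0):
    """Monte Carlo estimate of the answer distribution from repeated samples."""
    rng = random.Random(seed)
    counts = Counter(sample_answer(rng) for _ in range(num_samples))
    return {answer: count / num_samples for answer, count in counts.items()}


def mode_of(distribution):
    """Return the answer with the highest estimated probability."""
    return max(distribution, key=distribution.get)


def toy_sampler(rng):
    # Hypothetical stand-in for an LLM: returns "42" most of the time.
    return rng.choices(["42", "41", "24"], weights=[0.6, 0.3, 0.1])[0]


dist = estimate_answer_distribution(toy_sampler, num_samples=2000)
best = mode_of(dist)
print(best, dist)
```

With enough samples, the empirical mode matches the most probable answer of the sampler; the paper's contribution concerns using the responses more efficiently than this naive scheme.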

Graph Knowledge Distillation to Mixture of Experts

Pavel Rumiantsev

2024-10-22

TMLR (accepted)

HardCore Generation: Generating Hard UNSAT Problems for Data Augmentation

Joseph Cotnareanu

Zhanguang Zhang

Hui-Ling Zhen

Yingxue Zhang

Efficiently determining the satisfiability of a boolean equation (known as the SAT problem for brevity) is crucial in various industrial problems. Recently, the advent of deep learning methods has introduced significant potential for enhancing SAT solving. However, a major barrier to the advancement of this field has been the scarcity of large, realistic datasets. The majority of current public datasets are either randomly generated or extremely limited, containing only a few examples from unrelated problem families. These datasets are inadequate for meaningful training of deep learning methods. In light of this, researchers have started exploring generative techniques to create data that more accurately reflect SAT problems encountered in practical situations. These methods have so far suffered from either the inability to produce challenging SAT problems or time-scalability obstacles. In this paper we address both by identifying and manipulating the key contributors to a problem's "hardness", known as cores. Although some previous work has addressed cores, the time costs are unacceptably high due to the expense of traditional heuristic core detection techniques. We introduce a fast core detection procedure that uses a graph neural network. Our empirical results demonstrate that we can efficiently generate problems that remain hard to solve and retain key attributes of the original example problems. We show via experiment that the generated synthetic SAT problems can be used in a data augmentation setting to provide improved prediction of solver runtimes.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data

Jiaming Zhou

Abbas Ghaddar

Ge Zhang

Yaochen Hu

Soumyasundar Pal

Bin Wang

Yingxue Zhang

Jianye HAO

Despite recent advances in training and prompting strategies for Large Language Models (LLMs), these models continue to face challenges with complex logical reasoning tasks that involve long reasoning chains. In this work, we explore the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance LLMs' reasoning capabilities. Our extensive experiments, conducted on two established natural language reasoning tasks (inductive reasoning and spatial reasoning), demonstrate that supervised fine-tuning (SFT) with synthetic graph-based reasoning data effectively enhances LLMs' reasoning performance without compromising their effectiveness on other standard evaluation benchmarks.

2024-09-19

arXiv (preprint)

CKGConv: General Graph Convolution with Continuous Kernels

Soumyasundar Pal

Yitian Zhang

Jiaming Zhou

Yingxue Zhang

The existing definitions of graph convolution, either from spatial or spectral perspectives, are inflexible and not unified. Defining a general convolution operator in the graph domain is challenging due to the lack of canonical coordinates, the presence of irregular structures, and the properties of graph symmetries. In this work, we propose a novel and general graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding. We name this Continuous Kernel Graph Convolution (CKGConv). Theoretically, we demonstrate that CKGConv is flexible and expressive. CKGConv encompasses many existing graph convolutions, and exhibits a stronger expressiveness, as powerful as graph transformers in terms of distinguishing non-isomorphic graphs. Empirically, we show that CKGConv-based Networks outperform existing graph convolutional networks and perform comparably to the best graph transformers across a variety of graph datasets. The code and models are publicly available at https://github.com/networkslab/CKGConv.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

Interacting Diffusion Processes for Event Sequence Forecasting

Mai Zeng

Florence Regol

Neural Temporal Point Processes (TPPs) have emerged as the primary framework for predicting sequences of events that occur at irregular time intervals, but their sequential nature can hamper performance for long-horizon forecasts. To address this, we introduce a novel approach that incorporates a diffusion generative model. The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences. In contrast to previous approaches, our model directly learns the joint probability distribution of types and inter-arrival times for multiple events. The model is composed of two diffusion processes, one for the time intervals and one for the event types. These processes interact through their respective denoising functions, which can take as input intermediate representations from both processes, allowing the model to learn complex interactions. We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of TPPs.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

CKGConv: General Graph Convolution with Continuous Kernels

Soumyasundar Pal

Yitian Zhang

Jiaming Zhou

Yingxue Zhang

The existing definitions of graph convolution, either from spatial or spectral perspectives, are inflexible and not unified. Defining a general convolution operator in the graph domain is challenging due to the lack of canonical coordinates, the presence of irregular structures, and the properties of graph symmetries. In this work, we propose a novel graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding. We name this Continuous Kernel Graph Convolution (CKGConv). Theoretically, we demonstrate that CKGConv is flexible and expressive. CKGConv encompasses many existing graph convolutions, and exhibits the same expressiveness as graph transformers in terms of distinguishing non-isomorphic graphs. Empirically, we show that CKGConv-based Networks outperform existing graph convolutional networks and perform comparably to the best graph transformers across a variety of graph datasets.

2024-05-01

ICML.cc/2024/Conference (poster)

Categorical Generative Model Evaluation via Synthetic Distribution Coarsening

Florence Regol

As we expect to see a rapid integration of generative models in our day-to-day lives, the development of rigorous methods of evaluation and analysis for generative models has never been more pressing. Multiple works have highlighted the shortcomings of widely used metrics and exposed how they fail to behave as expected in some settings. So far, the response has been to use a variety of metrics that target different desirable and interpretable properties such as fidelity, diversity, and authenticity, to obtain a clearer picture of a generative model’s capabilities. These methods mainly focus on ordinal data and they all suffer from the same unavoidable issues stemming from estimating quantities of high-dimensional data from a limited number of samples. We propose to take an alternative approach and to return to the synthetic data setting where the ground truth is explicit and known. We focus on nominal categorical data and introduce an evaluation method that can scale to the high-dimensional settings often encountered in practice. Our method involves successively binning the large space to obtain smaller probability spaces and coarser distributions where meaningful statistical estimates can be obtained. This allows us to provide probabilistic guarantees and sample complexities, and we illustrate how our method can be applied to distinguish between the capabilities of several state-of-the-art categorical models.

2024-04-18

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (published)

proceedings.mlr.press
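
The binning idea in the abstract above can be illustrated with a small sketch. This is a single level of coarsening under stated assumptions, not the paper's procedure: the modulo binning rule `bin_of`, the total variation distance as the comparison, and the choice to sample directly from the ground truth (a "perfect model") are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def coarsen(probabilities, bin_of, num_bins):
    """Sum the ground-truth probability mass that falls in each bin."""
    coarse = [0.0] * num_bins
    for category, p in enumerate(probabilities):
        coarse[bin_of(category)] += p
    return coarse


def empirical_coarse(samples, bin_of, num_bins):
    """Estimate the coarse distribution from samples of the original space."""
    counts = Counter(bin_of(s) for s in samples)
    return [counts.get(b, 0) / len(samples) for b in range(num_bins)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


num_categories, num_bins = 10_000, 8
rng = random.Random(0)

# Synthetic ground truth over a large nominal space (known by construction).
weights = [rng.random() for _ in range(num_categories)]
total = sum(weights)
truth = [w / total for w in weights]


def bin_of(category):
    # Hypothetical binning rule: assign categories to bins by remainder.
    return category % num_bins


# Stand-in for a generative model: here it samples from the truth itself.
samples = rng.choices(range(num_categories), weights=truth, k=5000)

coarse_truth = coarsen(truth, bin_of, num_bins)
coarse_model = empirical_coarse(samples, bin_of, num_bins)
tv = total_variation(coarse_truth, coarse_model)
print(f"TV distance on the coarse space: {tv:.4f}")
```

With 5,000 samples the 10,000-category distribution cannot be estimated reliably, but the 8-bin coarse distribution can, which is the motivation for comparing models on the coarser space.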

Multi-resolution Time-Series Transformer for Long-term Forecasting

Yitian Zhang

Soumyasundar Pal

Yingxue Zhang

2024-04-18

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (published)

Jointly-Learned Exit and Inference for a Dynamic Neural Network

Florence Regol

Joud Chataoui

2024-01-16

ICLR.cc/2024/Conference (poster)

DyG2Vec: Efficient Representation Learning for Dynamic Graphs

Mohammad Alomrani

Mahdi Biparva

Yingxue Zhang

Temporal graph neural networks have shown promising results in learning inductive representations by automatically extracting temporal patterns. However, previous works often rely on complex memory modules or inefficient random walk methods to construct temporal representations. To address these limitations, we present an efficient yet effective attention-based encoder that leverages temporal edge encodings and window-based subgraph sampling to generate task-agnostic embeddings. Moreover, we propose a joint-embedding architecture using non-contrastive SSL to learn rich temporal embeddings without labels. Experimental results on 7 benchmark datasets indicate that on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% for the transductive setting and 3.30% for the inductive setting while only requiring 5-10x less training/inference time. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies. The code is publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas.

2024-01-08

TMLR (accepted)

Enhancing Click-through Rate Prediction in Recommendation Domain with Search Query Representation

Yuening Wang

Man Chen

Yaochen Hu

Wei Guo

Yingxue Zhang

Huifeng Guo

Yang Liu

2024-01-01

CIKM (published)