Portrait of Tianyu Li is unavailable

Tianyu Li

Alumni

Publications

Hallucination Detection and Hallucination Mitigation: An Investigation
Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning
In this paper, we present connections between three models used in different research fields: weighted finite automata~(WFA) from formal lan… (see more)guages and linguistics, recurrent neural networks used in machine learning, and tensor networks which encompasses a set of optimization techniques for high-order tensors used in quantum physics and numerical analysis. We first present an intrinsic relation between WFA and the tensor train decomposition, a particular form of tensor network. This relation allows us to exhibit a novel low rank structure of the Hankel matrix of a function computed by a WFA and to design an efficient spectral learning algorithm leveraging this structure to scale the algorithm up to very large Hankel matrices.We then unravel a fundamental connection between WFA and second-orderrecurrent neural networks~(2-RNN): in the case of sequences of discrete symbols, WFA and 2-RNN with linear activationfunctions are expressively equivalent. Leveraging this equivalence result combined with the classical spectral learning algorithm for weighted automata, we introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors.This algorithm relies on estimating low rank sub-blocks of the Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performances of the proposed learning algorithm are assessed in a simulation study on both synthetic and real-world data.
Sequential Density Estimation via NCWFAs Sequential Density Estimation via Nonlinear Continuous Weighted Finite Automata
Weighted finite automata (WFAs) have been widely applied in many fields. One of the classic problems for WFAs is probability distribution es… (see more)timation over sequences of discrete symbols. Although WFAs have been extended to deal with continuous input data, namely continuous WFAs (CWFAs), it is still unclear how to approximate density functions over sequences of continuous random variables using WFA-based models, due to the limitation on the expressiveness of the model as well as the tractability of approximating density functions via CWFAs. In this paper, we propose a nonlinear extension to the CWFA model to first improve its expressiveness, we refer to it as the nonlinear continuous WFAs (NCWFAs). Then we leverage the so-called RNADE method, which is a well-known density estimator based on neural networks, and propose the RNADE-NCWFA model. The RNADE-NCWFA model computes a density function by design. We show that this model is strictly more expressive than the Gaussian HMM model, which CWFA cannot approximate. Empirically, we conduct a synthetic experiment using Gaussian HMM generated data. We focus on evaluating the model's ability to estimate densities for sequences of varying lengths (longer length than the training data). We observe that our model performs the best among the compared baseline methods.
Low-Rank Representation of Reinforcement Learning Policies
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional… (see more) embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.
MBAIL: Multi-Batch Best Action Imitation Learning utilizing Sample Transfer and Policy Distillation
Dingwei Wu
M. Jenkin
Steve Liu
Batch reinforcement learning (RL) aims to learn a good control policy from a previously collected dataset without requiring additional inter… (see more)actions with the environment. Unfortunately, in the real world, we may only have a limited amount of training data for tasks we are interested in. Most batch RL methods are intended to learn a policy over one fixed dataset, and are not intended to learn a policy that can perform well over other tasks. How can we leverage the advantages of batch RL while dealing with limited training data is another challenge in real world. In this work, we propose to add sample transfer and policy distillation to a leading Batch RL approach. The proposed methods are evaluated on multiple control tasks to showcase their effectiveness.
Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional… (see more) embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly embedded in a low-dimensional space while the embedded policy incurs almost no decrease in return.
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
In this paper, we unravel a fundamental connection between weighted finite automata~(WFAs) and second-order recurrent neural networks~(2-RNN… (see more)s): in the case of sequences of discrete symbols, WFAs and 2-RNNs with linear activation functions are expressively equivalent. Motivated by this result, we build upon a recent extension of the spectral learning algorithm to vector-valued WFAs and propose the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors. This algorithm relies on estimating low rank sub-blocks of the so-called Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performances of the proposed method are assessed in a simulation study.
Nonlinear Weighted Finite Automata
Neural Network Based Nonlinear Weighted Finite Automata
Weighted finite automata (WFA) can expressively model functions defined over strings but are inherently linear models. Given the recent succ… (see more)esses of nonlinear models in machine learning, it is natural to wonder whether ex-tending WFA to the nonlinear setting would be beneficial. In this paper, we propose a novel model of neural network based nonlinearWFA model (NL-WFA) along with a learning algorithm. Our learning algorithm is inspired by the spectral learning algorithm for WFAand relies on a nonlinear decomposition of the so-called Hankel matrix, by means of an auto-encoder network. The expressive power of NL-WFA and the proposed learning algorithm are assessed on both synthetic and real-world data, showing that NL-WFA can lead to smaller model sizes and infer complex grammatical structures from data.