Portrait of Berk Bozkurt is unavailable

Berk Bozkurt

Collaborating Alumni - McGill University
Supervisor
Research Topics
Reinforcement Learning
Stochastic Control

Publications

Sub-optimality bounds for certainty equivalent policies in partially observed systems
Ashutosh Nayyar
Yi Ouyang
In this paper, we present a generalization of the certainty equivalence principle of stochastic control. One interpretation of the classical… (see more) certainty equivalence principle for linear systems with output feedback and quadratic costs is as follows: the optimal action at each time is obtained by evaluating the optimal state-feedback policy of the stochastic linear system at the minimum mean square error (MMSE) estimate of the state. Motivated by this interpretation, we consider certainty equivalent policies for general (non-linear) partially observed stochastic systems that allow for any state estimate rather than restricting to MMSE estimates. In such settings, the certainty equivalent policy is not optimal. For models where the cost and the dynamics are smooth in an appropriate sense, we derive upper bounds on the sub-optimality of certainty equivalent policies. We present several examples to illustrate the results.
Generalized certainty equivalence based policies in partially observable systems
Ashutosh Nayyar
Yi Ouyang
In this paper, we present a generalization of the certainty equivalence principle of stochastic control. One interpretation of the classical… (see more) certainty equivalence principle for linear systems with output feedback and quadratic costs is as follows: the optimal action at each time is obtained by evaluating the optimal state-feedback policy of the stochastic linear system at the minimum mean square error (MMSE) estimate of the state. Motivated by this interpretation, we consider certainty equivalent policies for general (non-linear) partially observed stochastic systems and allow for any state estimate rather than restricting to MMSE estimates. In such settings, the certainty equivalent policy is not optimal. For models with Lipschitz cost and dynamics, we derive upper bounds on the sub-optimality of certainty equivalent policies in terms of expected error of the proposed estimator. We present several examples to illustrate the results.
Model approximation in MDPs with unbounded per-step cost
Ashutosh Nayyar
Yi Ouyang
We consider the problem of designing a control policy for an infinite-horizon discounted cost Markov decision process …
Weighted-Norm Bounds on Model Approximation in MDPs with Unbounded Per-Step Cost
Ashutosh Nayyar
Yi Ouyang
We consider the problem of designing a control policy for an infinite-horizon discounted cost Markov Decision Process (MDP) …