Discrete Time Stochastic Root-finding with Forced Stopping Time

Stochastic Root-finding with Markov Decision Process

Authors

  • Ali Nasir Department of Electrical Engineering, University of Central Punjab, Lahore, Pakistan
  • Huma Rehman Baig Directorate of Research, University of Central Punjab, Lahore, Pakistan

Keywords:

Markov decision processes, Stochastic root finding, Stochastic approximation

Abstract

In this paper, we present a Markov Decision Process (MDP) based formulation for solving the stochastic root-finding problem with predefined stopping time. In the problem that we pose, we need to find only one root of a given finite valued function f (p,u,h). Here, p is a known Markov chain, u is the adjustable variable, and h is the unknown random variable with known distribution. Hence we cannot measure the true value of the function because h is unknown. We assume that we have a way of measuring whether or not f is within some bound ε from zero. We also present a formulation for the problem where f (p,u) is measurable but the adjustment of u is stochastic with known distribution. Another challenge in our problem is the introduction of finite stopping time. This means that the MDP policy has only a predefined finite number of actions available for adjusting u to find the root (or bring | f | < ε). We have included a price control example in the paper to demonstrate the behavior of the resulting MDP policy in response to the available time steps and the variable values of u and p. The results show reasonable trajectories for our simulated environment.

References

H. Robbins, and S. Monro. A stochastic approximation method. Annals of Mathematics and Statistics 29, 373-405 (1951).

C.F.J. Wu. Efficient sequential designs with binary data. Journal of American Statistical Association 80 974-984 (1985).

A. Yazidi, and B.J. Oommen. A novel technique for stochastic root-finding: Enhancing the search with the adaptive d-ary search. Information Sciences,393: 108-129 (2017).

S.M.Vahidipour., M.R. Meybodi, and M. Esnaashari. Finding the shortest path in stochastic graphs using learning automata and adaptive stochastic petri nets. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 25(03) 427-455 (2017).

S. Glimsdal, and O.C. Granmo, Thompson Sampling Guided Stochastic Searching on the Line for Deceptive Environments with Applications to Root-Finding Problems. arXiv preprint arXiv:1708.01791 (2017).

J. Zhang., Y. Wang., C. Wang, and M. Zhou. Symmetrical hierarchical stochastic searching on the line in informative and deceptive environments. IEEE transactions on cybernetics, 47(3) 626-635 (2017).

T. Pfeffer, and L. Pollet. A stochastic root finding approach: the homotopy analysis method applied to Dyson–Schwinger equations. New Journal of Physics. 19(4): (2017).

P.R. Kumar, and P. Varaiya, Stochastic Systems: Estimation, Identification, and Adaptive Control, Prentice Hall Inc., Englewood Cliffs, New Jersey 07632, 1986.

M.L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, © John Wiley and Sons Inc. (1994).

S. Russell, and P. Norvig. Artificial Intelligence: A Modern Approach, 2nd Edition, Prentice-Hall, Upper Saddle River, New Jersey 07458, (2005).

P. Raghu, and S. Kim. The stochastic root-finding problem: Overview, solutions, and open questions. ACM Transactions on Modeling and Computer Simulation (TOMACS) 21(3) 1-23 (2011).

Downloads

Published

2021-03-09

How to Cite

Nasir, A., & Baig, H. R. (2021). Discrete Time Stochastic Root-finding with Forced Stopping Time: Stochastic Root-finding with Markov Decision Process. Proceedings of the Pakistan Academy of Sciences: A. Physical and Computational Sciences, 57(2), 81–88. Retrieved from https://ppaspk.org/index.php/PPAS-A/article/view/24

Issue

Section

Articles