Adaptive dynamic programming for optimal liver regeneration control
Abstract
Every living organism interacts with its environment and uses that interaction to improve its own adaptability and, as a result, its survival and overall development. The process of evolution shows that, over time, species change the way they interact with the environment, which leads to natural selection and survival of the most adaptive ones. This action-based learning, known as reinforcement learning, captures the idea of optimal behavior emerging in environmental systems. We describe the mathematical formulation of reinforcement learning and a practical method for its implementation known as adaptive dynamic programming. Together they give the overall concept of controllers for artificial biological systems that both learn and exhibit optimal behavior.
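As a minimal illustration of the formulas involved (the notation below is generic and assumed for illustration, not the paper's liver regeneration model), consider a discrete-time system x_{k+1} = f(x_k, u_k) with stage cost r(x_k, u_k) and a feedback strategy u_k = \mu(x_k). The cost-to-go of the strategy and the Bellman optimality equation that dynamic programming solves can be written as

V^{\mu}(x_k) = \sum_{i=k}^{\infty} \gamma^{i-k}\, r\bigl(x_i, \mu(x_i)\bigr), \qquad 0 < \gamma \le 1,

V^{\mu}(x_k) = r\bigl(x_k, \mu(x_k)\bigr) + \gamma\, V^{\mu}(x_{k+1}), \qquad
V^{*}(x_k) = \min_{u_k}\bigl[\, r(x_k, u_k) + \gamma\, V^{*}\bigl(f(x_k, u_k)\bigr) \,\bigr].

Adaptive dynamic programming approximates such value functions on-line from measured data instead of solving the Bellman equation analytically.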
This paper reviews the formulation of the upper optimality problem, whose optimal regulation strategy is guaranteed to be at least as good as the objective regulation rules that can be observed in natural biological systems.
In optimal reinforcement learning algorithms, the learning process shifts from analyzing the details of the system dynamics to a much higher level. The object of interest is no longer the system dynamics themselves, but a quantitative performance index that expresses how close to optimal the control system's operation is. In this scheme, reinforcement learning is a technique for learning optimal behavior by monitoring the response to non-optimal control strategies.
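A minimal sketch of this idea, assuming a discrete (tabular) state space and an observable stage cost; the function names and parameter values are illustrative assumptions, not part of the paper's model. The performance index of a fixed, possibly non-optimal, control strategy is estimated from the observed responses alone, with no model of the system dynamics:

import numpy as np

def td_evaluate(policy, env_step, n_states, gamma=0.95, alpha=0.1,
                episodes=200, horizon=100):
    """Temporal-difference estimate of the performance index V of a fixed strategy.

    policy(s)      -> control action for state s (the strategy being monitored)
    env_step(s, u) -> (next_state, stage_cost)   (observed response; dynamics unknown)
    """
    V = np.zeros(n_states)                    # estimated cost-to-go of the strategy
    for _ in range(episodes):
        s = np.random.randint(n_states)       # start each episode in a random state
        for _ in range(horizon):
            u = policy(s)                     # apply the (possibly non-optimal) strategy
            s_next, cost = env_step(s, u)     # observe only the system's response
            # move V(s) toward the observed one-step cost plus discounted V(s_next)
            V[s] += alpha * (cost + gamma * V[s_next] - V[s])
            s = s_next
    return V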
The purpose of this article is to show that reinforcement learning methods, adaptive dynamic programming (ADP) in particular, can be used for feedback control of biological systems. The article presents on-line methods for solving the problem of finding the upper optimality estimate with adaptive dynamic programming.
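As a sketch of such an on-line procedure, a tabular Q-learning loop in the spirit of ADP is shown below; the state and action discretization, the cost signal, and all parameter values are assumptions made for illustration, not the regeneration model studied in the paper:

import numpy as np

def online_adp(env_step, n_states, n_actions, gamma=0.95, alpha=0.1,
               epsilon=0.1, episodes=500, horizon=100):
    """Model-free, on-line improvement of a feedback strategy via Q-learning.

    env_step(s, u) -> (next_state, stage_cost): only the measured response of the
    system is used, so no explicit model of the dynamics is required.
    """
    Q = np.zeros((n_states, n_actions))        # action-value (quality) table
    for _ in range(episodes):
        s = np.random.randint(n_states)
        for _ in range(horizon):
            # epsilon-greedy exploration around the current best-known strategy
            if np.random.rand() < epsilon:
                u = np.random.randint(n_actions)
            else:
                u = int(np.argmin(Q[s]))       # action with the lowest cost estimate
            s_next, cost = env_step(s, u)
            # Bellman-type update toward the observed cost plus the best discounted continuation
            Q[s, u] += alpha * (cost + gamma * Q[s_next].min() - Q[s, u])
            s = s_next
    return Q, Q.argmin(axis=1)                 # Q-table and the improved feedback law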
Copyright (c) 2024 Валерія Карієва, Сергій Львов
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The copyright holder is the author.