Adaptive dynamic programming for optimal liver regeneration control
Abstract
Every living organism interacts with its environment and uses that interaction to improve its own adaptability and, as a result, its survival and overall development. The process of evolution shows that, over time, species change the way they interact with the environment, which leads to natural selection and survival of the most adaptive ones. This action-based learning, known as reinforcement learning, captures the idea of optimal behavior emerging in environmental systems. We describe the mathematical formulation of reinforcement learning and a practical method for its implementation known as adaptive dynamic programming. Together they give the overall concept of controllers for artificial biological systems that both learn and exhibit optimal behavior.
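As a minimal illustration of the formulas involved (the notation below is generic and assumed for illustration, not the paper's liver regeneration model), consider a discrete-time system x_{k+1} = f(x_k, u_k) with stage cost r(x_k, u_k) and a feedback strategy u_k = \mu(x_k). The cost-to-go of the strategy and the Bellman optimality equation that dynamic programming solves can be written as

V^{\mu}(x_k) = \sum_{i=k}^{\infty} \gamma^{i-k}\, r\bigl(x_i, \mu(x_i)\bigr), \qquad 0 < \gamma \le 1,

V^{\mu}(x_k) = r\bigl(x_k, \mu(x_k)\bigr) + \gamma\, V^{\mu}(x_{k+1}), \qquad
V^{*}(x_k) = \min_{u_k}\bigl[\, r(x_k, u_k) + \gamma\, V^{*}\bigl(f(x_k, u_k)\bigr) \,\bigr].

Adaptive dynamic programming approximates such value functions on-line from measured data instead of solving the Bellman equation analytically.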
This paper reviews the formulation of the upper optimality problem, whose optimal regulation strategy is guaranteed to be at least as good as the objective regulation rules that can be observed in natural biological systems.
In optimal reinforcement learning algorithms, the learning process shifts from analyzing the details of the system dynamics to a much higher level. The object of interest is no longer the system dynamics themselves, but a quantitative performance index that expresses how close to optimal the control system's operation is. In this scheme, reinforcement learning is a technique for learning optimal behavior by monitoring the response to non-optimal control strategies.
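A minimal sketch of this idea, assuming a discrete (tabular) state space and an observable stage cost; the function names and parameter values are illustrative assumptions, not part of the paper's model. The performance index of a fixed, possibly non-optimal, control strategy is estimated from the observed responses alone, with no model of the system dynamics:

import numpy as np

def td_evaluate(policy, env_step, n_states, gamma=0.95, alpha=0.1,
                episodes=200, horizon=100):
    """Temporal-difference estimate of the performance index V of a fixed strategy.

    policy(s)      -> control action for state s (the strategy being monitored)
    env_step(s, u) -> (next_state, stage_cost)   (observed response; dynamics unknown)
    """
    V = np.zeros(n_states)                    # estimated cost-to-go of the strategy
    for _ in range(episodes):
        s = np.random.randint(n_states)       # start each episode in a random state
        for _ in range(horizon):
            u = policy(s)                     # apply the (possibly non-optimal) strategy
            s_next, cost = env_step(s, u)     # observe only the system's response
            # move V(s) toward the observed one-step cost plus discounted V(s_next)
            V[s] += alpha * (cost + gamma * V[s_next] - V[s])
            s = s_next
    return V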
The purpose of this article is to show that reinforcement learning methods, adaptive dynamic programming (ADP) in particular, can be used for feedback control of biological systems. The article presents on-line methods for solving the problem of finding the upper optimality estimate with adaptive dynamic programming.
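As a sketch of such an on-line procedure, a tabular Q-learning loop in the spirit of ADP is shown below; the state and action discretization, the cost signal, and all parameter values are assumptions made for illustration, not the regeneration model studied in the paper:

import numpy as np

def online_adp(env_step, n_states, n_actions, gamma=0.95, alpha=0.1,
               epsilon=0.1, episodes=500, horizon=100):
    """Model-free, on-line improvement of a feedback strategy via Q-learning.

    env_step(s, u) -> (next_state, stage_cost): only the measured response of the
    system is used, so no explicit model of the dynamics is required.
    """
    Q = np.zeros((n_states, n_actions))        # action-value (quality) table
    for _ in range(episodes):
        s = np.random.randint(n_states)
        for _ in range(horizon):
            # epsilon-greedy exploration around the current best-known strategy
            if np.random.rand() < epsilon:
                u = np.random.randint(n_actions)
            else:
                u = int(np.argmin(Q[s]))       # action with the lowest cost estimate
            s_next, cost = env_step(s, u)
            # Bellman-type update toward the observed cost plus the best discounted continuation
            Q[s, u] += alpha * (cost + gamma * Q[s_next].min() - Q[s, u])
            s = s_next
    return Q, Q.argmin(axis=1)                 # Q-table and the improved feedback law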
Copyright (c) 2024 Валерія Карієва, Сергій Львов
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The copyright holder is the author.