Volume 17, Issue 4 (Journal of Control, V.17, N.4 Winter 2024)                   JoC 2024, 17(4): 75-87



1- Department of Mathematics, Payame Noor University (PNU), P.O. Box 19395-4697
2- Professor, Ferdowsi University of Mashhad
3- Department of Electrical Engineering, Quchan University of Technology
Abstract:
Designing an optimal controller for a continuous-time bilinear system from Bellman's optimality principle is computationally demanding even when the system dynamics are known. As a result, controller design typically relies on approximation techniques that require knowledge of the system dynamics. The problem becomes more challenging when the dynamics are unknown. A common first step toward overcoming this is to identify the bilinear system dynamics from data. It is well known that identification methods give the designer a linear model, built from the input and output data of the system, to use in the controller design. This paper proposes a new iterative method, based on online adaptive policy iteration, for designing an optimal controller for a bilinear system whose dynamics are unknown. Instead of requiring the dynamics of the bilinear system, the proposed method designs the optimal controller from online input data and state measurements. In addition, applying noise as the system input over a certain time interval eliminates the need to measure the states in subsequent iterations. The convergence of the adaptive iterative process to the optimal controller is stated and proved in a theorem.
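To make the evaluate-then-improve structure of policy iteration concrete, the following is a minimal sketch for a scalar linear system with known dynamics. It is not the method of the paper, which handles bilinear dynamics and replaces the model with measured input and state data. The system dx/dt = a*x + b*u, the quadratic cost weights q and r, and the function names policy_iteration_scalar and riccati_scalar are all assumptions introduced for illustration.

```python
import math


def policy_iteration_scalar(a, b, q, r, k0, iters=20):
    """Model-based policy iteration for dx/dt = a*x + b*u with u = -k*x.

    The cost is the integral of q*x**2 + r*u**2 over time.
    All symbols here are illustrative assumptions, not taken from the paper.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 must stabilize the closed loop")
    k = k0
    history = []
    for _ in range(iters):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k**2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: update the feedback gain from the new p.
        k = b * p / r
        history.append((p, k))
    return p, k, history


def riccati_scalar(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b**2/r)*p**2 + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k, _ = policy_iteration_scalar(1.0, 1.0, 1.0, 1.0, k0=3.0)
    print(p, k, riccati_scalar(1.0, 1.0, 1.0, 1.0))
```

Starting from any stabilizing gain, the iterates approach the Riccati solution, which is the kind of convergence property the abstract refers to. In a data-driven variant the evaluation step would be solved from measured trajectories rather than from a and b.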
Type of Article: Research paper | Subject: Special
Received: 2023/10/8 | Accepted: 2024/02/1 | ePublished ahead of print: 2024/02/14 | Published: 2024/02/20

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.