DATA-EFFICIENT MODEL-BASED REINFORCEMENT LEARNING FOR ROBOT CONTROL, 211-218.

doi:10.2316/J.2021.206-0528

Ming Sun,∗ Yue Gao,∗∗ Wei Liu,∗ and Shaoyuan Li∗

References

[1] H. Hasselt, A. Guez, and D. Silver, Deep reinforcement learning with double Q-learning, Thirtieth AAAI Conf. on Artificial Intelligence, Phoenix, AZ, 2016.
[2] S. Levine, C. Finn, T. Darrell, et al., End-to-end training of deep visuomotor policies, Journal of Machine Learning Research, 17(1), 2016, 1334–1373.
[3] T. Yan, W. Zhang, S.X. Yang, et al., Soft actor-critic reinforcement learning for robotic manipulator with hindsight experience replay, International Journal of Robotics and Automation, 34(5), 2019, 536–543.
[4] M. Sadeghzadeh, D. Calvert, and H.A. Abdullah, Autonomous visual servoing of a robot manipulator using reinforcement learning, International Journal of Robotics and Automation, 31(1), 2016, 26–38.
[5] X.B. Peng, G. Berseth, K.K. Yin, et al., Deeploco: Dynamic locomotion skills using hierarchical deep reinforcement learning, ACM Transactions on Graphics, 36(4), 2017, 1–13.
[6] Y. Liu, M. Cong, H. Dong, et al., Reinforcement learning and EGA-based trajectory planning for dual robots, International Journal of Robotics and Automation, 33(4), 2018, 367–378.
[7] T.P. Lillicrap, J.J. Hunt, A. Pritzel, et al., Continuous control with deep reinforcement learning, Int. Conf. on Learning Representations, San Juan, Puerto Rico, 2016.
[8] J. Schulman, F. Wolski, P. Dhariwal, et al., Proximal policy optimization algorithms, arXiv:1707.06347, 2017.
[9] L. Pinto, J. Davidson, R. Sukthankar, et al., Robust adversarial reinforcement learning, Proceedings of the 34th Int. Conf. on Machine Learning, Sydney, Australia, 2017.
[10] A. Nagabandi, G. Kahn, R.S. Fearing, et al., Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning, Int. Conf. on Robotics and Automation, Brisbane, Australia, 2018.
[11] S. Ross and J.A. Bagnell, Agnostic system identification for model-based reinforcement learning, Int. Conf. on Machine Learning, Edinburgh, Scotland, 2012.
[12] S. Levine and V. Koltun, Guided policy search, Int. Conf. on Machine Learning, Bari, Italy, 2013.
[13] S. Levine and P. Abbeel, Learning neural network policies with guided policy search under unknown dynamics, Advances in Neural Information Processing Systems, Montreal, Quebec, 2014.
[14] M.P. Deisenroth and C.E. Rasmussen, PILCO: A model-based and data-efficient approach to policy search, Proc. of the 28th Int. Conf. on Machine Learning, Washington, DC, 2011.
[15] E. Kaiser, J.N. Kutz, and S.L. Brunton, Sparse identification of nonlinear dynamics for model predictive control in the low-data limit, Proceedings of the Royal Society A, 474(2219), 2018, 1–25.
[16] H. Schaeffer, Learning partial differential equations via data discovery and sparse optimization, Proceedings of the Royal Society A, 473(2197), 2017, 1–20.
[17] J.C. Loiseau, B.R. Noack, and S.L. Brunton, Sparse reduced-order modelling: sensor-based dynamics to full-state estimation, Journal of Fluid Mechanics, 844, 2018, 459–490.
[18] R.S. Sutton, Dyna, an integrated architecture for learning, planning, and reacting, ACM Sigart Bulletin, 2(4), 1991, 160–163.
[19] B. Bischoff, D. Nguyen-Tuong, H. Hoof, et al., Policy search for learning robot control using sparse data, Int. Conf. on Robotics and Automation, Hong Kong, China, 2014.
[20] D. Bruder, C.D. Remy, and R. Vasudevan, Nonlinear system identification of soft robot dynamics using Koopman operator theory, Int. Conf. on Robotics and Automation, Montreal, QC, 2019.
[21] Z. Gai, D. Liu, F. Chang, et al., Abnormal crowd behaviour detection based on deep learning and sparse representation, International Journal of Robotics and Automation, 35(4), 2020, 322–331.
[22] R. Chinniah and S.S. Rani, A sparse based rain removal algorithm for image sequences, International Journal of Robotics and Automation, 29(4), 2014, 441–447.
[23] N. Kalouptsidis, G. Mileounis, B. Babadi, et al., Adaptive algorithms for sparse system identification, Signal Processing, 91(8), 2011, 1910–1919.
[24] Y. Wang, X. Yan, M. Jiang, et al., 3D non-rigid structure from motion based on sparse approximation in trajectory space, International Journal of Robotics and Automation, 33(2), 2018, 111–117.
[25] S.L. Brunton, J.L. Proctor, and J.N. Kutz, Discovering governing equations from data: Sparse identification of nonlinear dynamical systems, Proceedings of the National Academy of Sciences of the United States of America, 113(15), 2016, 3932–3937.
[26] S.L. Brunton, J.L. Proctor, and J.N. Kutz, Sparse identification of nonlinear dynamics with control (SINDYc), IFAC Symp. on Nonlinear Control Systems, CA, USA, 2016.
[27] C. Devin, A. Gupta, T. Darrell, et al., Learning modular neural network policies for multi-task and multi-robot transfer, IEEE Int. Conf. on Robotics and Automation, Singapore, 2017.

From Journal (206) International Journal of Robotics and Automation - 2021
