RSMDP-BASED ROBUST Q-LEARNING FOR OPTIMAL PATH PLANNING IN A DYNAMIC ENVIRONMENT

doi:10.2316/Journal.206.2016.4.206-4255

Yunfei Zhang, Weilin Li, and Clarence W. de Silva

References

[1] H.M. Choset, (ed.), Principles of robot motion: Theory, algorithms, and implementation (Cambridge, Massachusetts: MIT Press, June 2005).
[2] S.M. LaValle, Planning algorithms (Cambridge: Cambridge University Press, 2006).
[3] J. Pineau, M. Montemerlo, M. Pollack, N. Roy, and S. Thrun, Probabilistic control of human robot interaction: Experiments with a robotic assistant for nursing homes, Proceedings of the 2nd IARP/IEEE-RAS Joint Workshop on Technical Challenge for Dependable Robots in Human Environments, pp. 11–19, 2002.
[4] R. Luna, I.A. Sucan, M. Moll, and L.E. Kavraki, Anytime solution optimization for sampling-based motion planning, Proceedings of the IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, May 6–10, 2013.
[5] L.E. Kavraki, M.N. Kolountzakis, and J.C. Latombe, Analysis of probabilistic roadmaps for path planning, IEEE Transactions on Robotics and Automation, 14, 166–171, 1998.
[6] L.E. Kavraki, P. Svestka, J.C. Latombe, and M.H. Overmars, Probabilistic roadmaps for path planning in high-dimensional configuration spaces, IEEE Transactions on Robotics and Automation, 12(4), 566–580, 1996.
[7] S. Karaman, M.R. Walter, A. Perez, E. Frazzoli, and S. Teller, Anytime motion planning using the RRT, IEEE International Conference on Robotics and Automation, Shanghai, China, May 9–13, 2011.
[8] J. Van den Berg, D. Ferguson, and J. Kuffner, Anytime path planning and replanning in dynamic environments, Proceedings of IEEE International Conference on Robotics and Automation, pp. 2366–2371, Orlando, FL, May 15–19, 2006.
[9] J. Van den Berg and M. Overmars, Planning time-minimal safe paths amidst unpredictably moving obstacles, International Journal of Robotics Research, 27(11–12), 1274–1294, 2008.
[10] R.S. Sutton and A.G. Barto, Reinforcement learning: An introduction (Cambridge: MIT Press, 1998).
[11] R. Alterovitz, T. Siméon, and K.Y. Goldberg, The stochastic motion roadmap: A sampling framework for planning with Markov motion uncertainty, Proceedings of Robotics: Science and Systems, June 2007.
[12] V.A. Huynh, S. Karaman, and E. Frazzoli, An incremental sampling based algorithm for stochastic optimal control, IEEE International Conference on Robotics and Automation, Saint Paul, MN, May 14–18, 2012.
[13] G. Yin, V. Krishnamurthy, and C. Ion, Regime switching stochastic approximation algorithms with application to adaptive discrete stochastic optimization, SIAM Journal on Optimization, 14(4), 1187–1215, 2004.
[14] A. Costa and F.J. Vázquez-Abad, Adaptive stepsize selection for tracking in a regime-switching environment, Automatica, 43(11), 1896–1908, 2007.
[15] R. Brooks and T. Lozano-Perez, A subdivision algorithm in configuration space for find path with rotation, Proceedings of International Joint Conference on Artificial Intelligence, pp. 799–806, August 1983.
[16] O. Khatib, Real-time obstacle avoidance for manipulators and mobile robots, The International Journal of Robotics Research, 5, 90–98, 1986.
[17] Y. Koren and J. Borenstein, Potential field methods and their inherent limitations for mobile robot navigation, IEEE International Conference on Robotics and Automation, Sacramento, CA, April 9–11, 1991.
[18] L. Zeng and G.M. Bone, Mobile robot navigation for moving obstacles with unpredictable direction changes, including humans, Advanced Robotics, 26(16), 1841–1862, 2012.
[19] L. Kavraki and J.C. Latombe, Randomized preprocessing of configuration space for fast path planning, IEEE International Conference on Robotics and Automation, San Diego, CA, May 8–13, 1994.
[20] F. Von Hundelshausen, M. Himmelsbach, F. Hecker, A. Mueller, and H.J. Wuensche, Driving with tentacles – Integral structures of sensing and motion, Journal of Field Robotics, 25, 640–673, 2008.
[21] S. Quinlan and O. Khatib, Elastic bands: connecting path planning and control, IEEE International Conference on Robotics and Automation, Atlanta, GA, May 2–6, 1993.
[22] J. Minguez, F. Lamiraux, and J.P. Laumond, Motion planning and obstacle avoidance, in B. Siciliano and O. Khatib (Eds.), Springer handbook of robotics (Berlin, Heidelberg: Springer, 2008), 827–852.
[23] T. Wada and S. Hiraoka, A deceleration control method of automobile for collision avoidance based on driver’s perceptual risk, IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, MO, Oct. 10–15, 2009.
[24] A. Ohya, A. Kosaka, and A. Kak, Vision-based navigation by a mobile robot with obstacle avoidance using single-camera vision and ultrasonic sensing, IEEE Transactions on Robotics and Automation, 14(6), 969–978, 1998.
[25] F. Lamiraux, D. Bonnafous, and O. Lefebvre, Reactive path deformation for nonholonomic mobile robots, IEEE Transactions on Robotics, 20, 967–977, 2004.
[26] L. Lapierre, R. Zapata, and B. Jouvencel, Concurrent path following and obstacle avoidance control of a unicycle-type robot, Advances in Vehicle Control and Safety, Buenos Aires, Argentina, 2007.
[27] M.W. Spong, S. Hutchinson, and M. Vidyasagar, Robot modeling and control (New York, NY: Wiley Press, 2006).
[28] D.P. Bertsekas, Dynamic programming and optimal control, Vol. II, 4th ed. Approximate dynamic programming (Belmont: Athena Scientific, 2012).
[29] J. Park, J. Kim, and J. Song, Path planning for a robot manipulator based on probabilistic roadmap and reinforcement learning, International Journal of Control Automation and Systems, 5(6), 674–680, 2007.
[30] L. Busoniu, R. Babuska, B. De Schutter, and D. Ernst, Reinforcement learning and dynamic programming using function approximators (Boca Raton, Florida: CRC press, 2010).
[31] M. van Otterlo and M. Wiering, Reinforcement learning and Markov decision processes, in M. van Otterlo and M. Wiering (Eds.), Reinforcement Learning (Berlin, Heidelberg: Springer, 2012), 3–42.
[32] Y. Zhang, N. Fattahi, and W. Li, Probabilistic roadmap with self-learning for path planning of a mobile robot in a dynamic and unstructured environment, IEEE International Conference on Mechatronics and Automation, pp. 1074–1079, Takamatsu, Japan, August 4–7, 2013.

From Journal (206) International Journal of Robotics and Automation - 2016