Improving Discrete-valued Q-learning with Fuzzy Sets to Obtain a Continuous Goal-reaching Behavior

G. Cicirelli, T. D'Orazio, and A. Distante (Italy)


Reinforcement Learning, State and Action Definition, Learning Time, Fuzzy Variables.


In this paper we describe a learning algorithm that realizes a goal-reaching behavior for an autonomous vehicle. The robot has to reach a door from any position in the environment. The state of the system is based on visual information received from a TV camera mounted on the mobile robot. The vision algorithm determines the position of the vehicle relative to the door from the shape information of the door. A Q-learning algorithm has been used to generate the optimal state-action associations. The problem of defining the state and action sets has been addressed with the aim of producing smooth paths, reducing the effects of visual errors during real navigation, and keeping the computational cost low during the learning phase. A novel way to obtain a continuous action set has been introduced: it uses a fuzzy model to evaluate the system state. Experiments in simulation have been carried out to show how advantageous the introduction of fuzzy control for state and action evaluation is compared with policies that use more states and actions. Finally, the learned policy has been transferred to a real robot. Experiments in a real environment show the generality of the learned knowledge.
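The idea of combining a discrete Q-table with a fuzzy evaluation of the state can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the state variables, membership functions, or action values, so the single state variable (bearing to the door), the triangular fuzzy sets, the names `STATE_CENTERS`, `ACTIONS`, `q_update`, `memberships`, and `continuous_action`, and all numeric values are assumptions chosen for illustration. The sketch learns with the standard tabular Q-learning update and then blends the greedy discrete actions of neighbouring states by their membership degrees, which yields a continuous action.

```python
import numpy as np

# Hypothetical setup (not from the paper): one continuous state variable,
# the bearing to the door in degrees, covered by triangular fuzzy sets
# centred on the discrete states used during learning.
STATE_CENTERS = np.array([-40.0, -20.0, 0.0, 20.0, 40.0])  # assumed values
ACTIONS = np.array([-30.0, -15.0, 0.0, 15.0, 30.0])        # assumed steering angles


def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update, applied in place."""
    target = reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])


def memberships(x, centers=STATE_CENTERS):
    """Triangular membership degrees of x in each fuzzy state (sum to 1)."""
    x = np.clip(x, centers[0], centers[-1])
    width = centers[1] - centers[0]  # assumes evenly spaced centres
    mu = np.maximum(0.0, 1.0 - np.abs(x - centers) / width)
    return mu / mu.sum()


def continuous_action(Q, x):
    """Blend each fuzzy state's greedy discrete action by its membership."""
    greedy = ACTIONS[Q.argmax(axis=1)]
    return float(memberships(x) @ greedy)
```

With a learned table `Q` of shape `(5, 5)`, a call such as `continuous_action(Q, 10.0)` weights the greedy actions of the two states centred at 0 and 20 degrees equally, so the command varies smoothly as the bearing changes instead of jumping at state boundaries.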
