Influence of the Context of a Reinforcement Learning Technique on Learning Performances - A Case Study

F. Davesne and C. Barret (France)


Machine Learning, Context Quality, State Design Testing, Shannon Entropy.


Statistical learning methods select the model that statistically best fits the data, given a cost function. In this setting, learning means finding a set of internal model parameters that minimizes (or maximizes) the cost function. As an example of such a procedure, reinforcement learning techniques (RLT) may be used in robotics to find the best mapping between sensors and effectors to achieve a goal. Many practical issues in applying RLT to real robots have already been pointed out, and some solutions have been investigated. However, an underlying issue, which is critical for the reliability of the task accomplished by the robot, is how well the a priori knowledge used by the RLT (design of the states, value of the temperature parameter) matches the physical properties of the robot, so that the goal defined by the experimenter can be achieved. We call this Context Quality (CQ). Some work has pointed out that bad CQ may lead to poor learning results, but CQ itself has not really been quantified. In this paper, we suggest that the entropy measure taken from information theory is well suited to quantify CQ and to predict the quality of the results obtained by the learning process. Using the Cart Pole Balancing benchmark, we show that there is a strong relation between our CQ measure and the performance of the RLT, that is to say, the viability duration of the cart/pole. In particular, we investigate the influence of the noisiness of the inputs and of the design of the states. In the first case, we show that CQ is linked to how well the system recognizes the input states. Moreover, we propose a statistical explanatory model of the influence of CQ on RLT performance.
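The abstract does not state how the CQ measure is computed, so the following is only a minimal sketch of the standard Shannon entropy on which such a measure could be built, not the authors' method. The function name `shannon_entropy` and the idea of applying it to a sequence of recognized state labels are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`.

    `labels` is any sequence of hashable discrete values, for example the
    index of the state recognized at each time step (hypothetical usage,
    not taken from the paper).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    # H = -sum_i p_i * log2(p_i), with p_i the relative frequency of label i.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # A sequence that always yields the same state carries no uncertainty,
    # while a sequence spread evenly over four states is maximally uncertain.
    print(shannon_entropy([0, 0, 0, 0]))
    print(shannon_entropy([0, 1, 2, 3]))
```

Under this reading, a lower entropy would correspond to less ambiguity in the observed states; whether the paper's CQ measure is defined this way is not stated in the abstract.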
