Sample and time efficient policy learning with CMA-ES and Bayesian Optimisation

Research Output

In evolutionary robot systems where the morphologies and controllers of real robots are evolved simultaneously, the inherited controller of a 'newborn' robot is likely to need refinement to better align it with its newly generated morphology. This can be accomplished via a learning mechanism applied to each individual robot; for practical reasons, such a mechanism should be both sample- and time-efficient. In this paper, we investigate two ways to improve the sample and time efficiency of the well-known learner CMA-ES on navigation tasks. The first approach combines CMA-ES with Novelty Search and includes an adaptive restart mechanism with increasing population size. The second bootstraps CMA-ES using Bayesian Optimisation, which is known for its sample efficiency. Results using two robots built with the ARE project's modules and four environments show that novelty reduces the number of samples needed to converge, as does the custom restart mechanism; the latter also has better sample and time efficiency than the hybridised Bayesian/Evolutionary method.
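The idea of restarting a learner with an increasing population size can be illustrated with a minimal sketch. This is not the authors' implementation: a simplified (mu, lambda) evolution strategy stands in for CMA-ES, Novelty Search is omitted, and the toy `sphere` objective, the stall-based restart trigger, the doubling factor, and all function and parameter names are assumptions made for illustration only.

```python
import random


def sphere(x):
    """Toy objective (assumption): squared distance to the origin, lower is better."""
    return sum(v * v for v in x)


def simple_es(objective, dim, pop_size, rng, budget,
              sigma=0.5, tol=1e-8, stall_limit=10):
    """Simplified (mu, lambda) evolution strategy standing in for CMA-ES.

    Stops when the evaluation budget is spent, the target fitness is
    reached, or the best fitness has not improved for `stall_limit`
    generations. Returns (best_x, best_f, evaluations_used).
    """
    mean = [rng.uniform(-2, 2) for _ in range(dim)]
    best_x, best_f = mean, objective(mean)
    evals, stall = 1, 0
    mu = max(1, pop_size // 2)  # number of parents kept each generation
    while evals + pop_size <= budget and best_f > tol and stall < stall_limit:
        # Sample offspring around the current mean and rank them.
        pop = [[m + sigma * rng.gauss(0, 1) for m in mean]
               for _ in range(pop_size)]
        scored = sorted((objective(x), x) for x in pop)
        evals += pop_size
        # Recombine the best mu offspring into the new mean.
        parents = [x for _, x in scored[:mu]]
        mean = [sum(p[i] for p in parents) / mu for i in range(dim)]
        sigma *= 0.9  # fixed step-size decay (assumption, not CMA-ES adaptation)
        if scored[0][0] < best_f - 1e-12:
            best_f, best_x = scored[0]
            stall = 0
        else:
            stall += 1
    return best_x, best_f, evals


def restart_es(objective, dim, total_budget=2000, init_pop=4, seed=0):
    """Restart the inner learner, doubling the population size each time."""
    rng = random.Random(seed)
    pop_size, used = init_pop, 0
    best_x, best_f = None, float("inf")
    history = []  # population size used in each run
    while used < total_budget and best_f > 1e-8:
        x, f, evals = simple_es(objective, dim, pop_size, rng,
                                total_budget - used)
        used += evals
        history.append(pop_size)
        if f < best_f:
            best_x, best_f = x, f
        pop_size *= 2  # increasing population size on restart
    return best_x, best_f, used, history
```

Calling `restart_es(sphere, dim=3)` returns the best point found, its fitness, the number of evaluations used, and the population size of each run. Counting evaluations against a shared budget reflects the sample-efficiency concern raised in the abstract, since each evaluation on a real robot is costly.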

Date: 14 July 2020

Publication Status: Published

DOI: 10.1162/isal_a_00299

Funders: Engineering and Physical Sciences Research Council

http://researchrepository.napier.ac.uk/output/2675888

Citation

Le Goff, L. K., Buchanan, E., Hart, E., Eiben, A. E., Li, W., De Carlo, M., …Tyrrell, A. M. (2020). Sample and time efficient policy learning with CMA-ES and Bayesian Optimisation. In ALIFE 2020: The 2020 Conference on Artificial Life (pp. 432-440). https://doi.org/10.1162/isal_a_00299