
Evolutionary Reinforcement Learning of Neural Network Controller for Pendulum Task by Evolution Strategy

Hidehiko Okada

Section: Research Paper, Product Type: Journal-Paper
Vol.10, Issue.3, pp.13-18, Jun-2022


Published online on Jun 30, 2022


Copyright © Hidehiko Okada. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
 


How to Cite this Paper

IEEE Style Citation: Hidehiko Okada, “Evolutionary Reinforcement Learning of Neural Network Controller for Pendulum Task by Evolution Strategy,” International Journal of Scientific Research in Computer Science and Engineering, Vol.10, Issue.3, pp.13-18, 2022.

MLA Style Citation: Hidehiko Okada. "Evolutionary Reinforcement Learning of Neural Network Controller for Pendulum Task by Evolution Strategy." International Journal of Scientific Research in Computer Science and Engineering 10.3 (2022): 13-18.

APA Style Citation: Hidehiko Okada (2022). Evolutionary Reinforcement Learning of Neural Network Controller for Pendulum Task by Evolution Strategy. International Journal of Scientific Research in Computer Science and Engineering, 10(3), 13-18.

BibTex Style Citation:
@article{Okada_2022,
author = {Hidehiko Okada},
title = {Evolutionary Reinforcement Learning of Neural Network Controller for Pendulum Task by Evolution Strategy},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {June 2022},
volume = {10},
number = {3},
month = jun,
year = {2022},
issn = {2347-2693},
pages = {13-18},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2816},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2816
TI - Evolutionary Reinforcement Learning of Neural Network Controller for Pendulum Task by Evolution Strategy
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Hidehiko Okada
PY - 2022
DA - 2022/06/30
PB - IJCSE, Indore, INDIA
SP - 13-18
IS - 3
VL - 10
SN - 2347-2693
ER -


Abstract:
Reinforcement learning of neural networks requires gradient-free algorithms because labeled training data are not available. Evolutionary algorithms are applicable to reinforcement learning because they do not rely on gradients. To train neural networks successfully with evolutionary algorithms, an appropriate algorithm must be chosen carefully, because many algorithm variations are available. The author experimentally evaluates Evolution Strategy (ES), an instance of evolutionary algorithms, for the reinforcement learning of neural networks, using a pendulum control task. Experimental results revealed that ES could successfully train a multilayer perceptron (MLP) so that the trained MLP could swing the pendulum upright quickly and keep it there, provided the MLP was equipped with sufficiently many hidden units. For the task adopted in this work, 16 hidden units performed best among 8, 16, and 32 in terms of task performance and computational efficiency. Moreover, the results revealed that exploration contributes more than exploitation to the ES search for better solutions.
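
The full text is not shown on this page, but the method the abstract describes, evolving the weight vector of an MLP controller with an Evolution Strategy scored by the reward accumulated over a pendulum episode, can be sketched compactly. The Python sketch below is illustrative only, not the author's implementation: the (mu + lambda) selection scheme, the fixed mutation step size, the pendulum dynamics and reward, and all function names and hyperparameters are assumptions.

# Illustrative sketch only: evolving a one-hidden-layer MLP controller
# for a pendulum swing-up task with a simple (mu + lambda) Evolution
# Strategy. Dynamics, reward, and hyperparameters are assumed for
# illustration; they are not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID, N_OUT = 3, 16, 1   # state (cos th, sin th, th_dot) -> torque
N_W = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT  # flat genome length

def mlp_forward(w, x):
    # Decode the flat genome w into one hidden tanh layer plus output.
    i = N_IN * N_HID
    W1 = w[:i].reshape(N_IN, N_HID)
    b1 = w[i:i + N_HID]
    W2 = w[i + N_HID:i + N_HID + N_HID * N_OUT].reshape(N_HID, N_OUT)
    b2 = w[-N_OUT:]
    h = np.tanh(x @ W1 + b1)
    return 2.0 * np.tanh(h @ W2 + b2)[0]   # torque limited to [-2, 2]

def fitness(w, steps=200, dt=0.05, g=10.0, m=1.0, l=1.0):
    # Roll out one episode from the hanging-down position; the return
    # penalizes angle error, velocity, and control effort (higher is better).
    th, th_dot, total = np.pi, 0.0, 0.0
    for _ in range(steps):
        u = mlp_forward(w, np.array([np.cos(th), np.sin(th), th_dot]))
        th_dot += (3 * g / (2 * l) * np.sin(th) + 3.0 / (m * l**2) * u) * dt
        th_dot = np.clip(th_dot, -8.0, 8.0)
        th += th_dot * dt
        ang = ((th + np.pi) % (2 * np.pi)) - np.pi  # wrap to [-pi, pi]
        total -= ang**2 + 0.1 * th_dot**2 + 0.001 * u**2
    return total

MU, LAMBDA, SIGMA, GENERATIONS = 10, 40, 0.1, 100
population = [rng.normal(0.0, 0.5, N_W) for _ in range(MU)]

for gen in range(GENERATIONS):
    # Each offspring is a Gaussian mutation of a random parent; a classic
    # ES would also self-adapt SIGMA, which is kept fixed here for brevity.
    offspring = [population[rng.integers(MU)] + rng.normal(0.0, SIGMA, N_W)
                 for _ in range(LAMBDA)]
    ranked = sorted(population + offspring, key=fitness, reverse=True)
    population = ranked[:MU]   # (mu + lambda) truncation selection
    if gen % 10 == 0:
        print(f"gen {gen:3d}  best fitness {fitness(population[0]):.1f}")

Changing N_HID to 8 or 32 reproduces the kind of hidden-layer-size comparison the abstract reports, and varying SIGMA and LAMBDA shifts the exploration/exploitation balance it discusses.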

Key-Words / Index Terms:
Evolutionary algorithm; Evolution strategy; Neural network; Neuroevolution; Reinforcement learning.

