Evolving neural structure for snake control
Experiments confirm that an agent can reach a minimal level of intelligence when evolutionary strategies are used to optimize the weights of its neural structure. The resulting behavior shows that the infrastructure written for the project works, which means it can be used to test the effectiveness of future neuroevolution methods.
I am impressed by Google Research's work on Weight Agnostic Neural Networks and on ways to optimize them. To understand it better, I want to apply what I learn in a chosen environment: a snake game. The main milestones of the project are listed below. When the project is close to completion, I will present visual demonstrations of the agents' behavior and describe the project in more detail.
- Create an environment for simulations;
  - Find ways to encode information about the game state;
  - Implement them;
- Write an optimization procedure based on evolutionary strategies for tests;
- Test the optimizer on a simple agent;
- Conduct experiments and find out how much intelligent behavior an agent with a small number of neurons can achieve;
- This point was not originally planned, but it turned out that surviving in the environment can be roughly divided into two smaller tasks: perceiving the environment and making decisions. Evolution solves both too slowly, and the project is mainly focused on the second one. I therefore decided to develop an analogue of the visual cortex to speed up optimization. The researchers in the original paper also paid attention to this and opted for a Variational Autoencoder; I will follow their example;
- Before engaging in neuroevolution, consider reducing the impact of the sparse reward problem. Solved(?) by using a curiosity mechanism;
  - Another problem arose: the relative scale of the rewards. In short, the reward for a newly explored state should not be as large as the reward for eating a block of food. But what is the right ratio? 5 new states to 1 block of food? 10? For now I will create a convenient table for future tests and focus on higher priorities;
- Write basic neuroevolution functionality;
- Implement a way to visualize the neural architectures created within the neuroevolution system;
- Create animations of agent behavior and relate them to their neural architectures;
- Devise ways to present the work accomplished in an attractive manner.
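To illustrate the weight-optimization milestone in the list above, here is a minimal sketch of a simple evolution strategy with score normalization. It is not the project's optimizer: the names `evolve`, `toy_fitness` and `TARGET` and all hyperparameters are hypothetical, and the toy fitness stands in for what would presumably be the score an agent earns in a snake episode.

```python
import random

# Hypothetical optimum, used only for this toy example.
TARGET = [0.5, -0.3, 0.8]


def toy_fitness(params):
    """Stand-in for an episode score: higher is better, maximum at TARGET."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def evolve(fitness, n_params, pop_size=50, sigma=0.1, lr=0.05, iters=200, seed=0):
    """Perturb the parameters, score each perturbation, and move the
    parameters along the noise directions weighted by normalized score."""
    rng = random.Random(seed)
    theta = [0.0] * n_params
    for _ in range(iters):
        # One Gaussian noise vector per population member.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(n_params)]
                 for _ in range(pop_size)]
        scores = [fitness([t + sigma * e for t, e in zip(theta, eps)])
                  for eps in noise]
        mean = sum(scores) / pop_size
        std = (sum((s - mean) ** 2 for s in scores) / pop_size) ** 0.5
        if std == 0.0:
            continue  # all candidates scored the same, so there is no signal
        for i in range(n_params):
            step = sum((s - mean) / std * eps[i]
                       for s, eps in zip(scores, noise)) / pop_size
            theta[i] += lr * step
    return theta


best = evolve(toy_fitness, n_params=3)
```

Normalizing the scores keeps the step size independent of the reward scale, which may be convenient while the ratio between curiosity and food rewards is still undecided.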
Agents are presented in order of increasing loss function value, from left to right and from top to bottom.
iter_584 | iter_148 | iter_392 | iter_36 |
---|---|---|---|
iter_384 | iter_404 | iter_935 | iter_967 |
iter_910 | iter_943 | iter_562 | iter_561 |
The number of neurons is indicated in parentheses.
(8) iter_40 | (8) iter_101 | (16) iter_403 | (16) iter_426 |
---|---|---|---|
(32) iter_301 | (32) iter_457 | (32) iter_902 | (32) iter_1011 |
(32) iter_1084 | (32) iter_1513 | (64) iter_1395 | (64) iter_1940 |
Special thanks to Adam Gaier and David Ha.