Code-base for the paper Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective. Code developed together with @tudor-berariu.
Installing OpenAI Gym will probably require some system dependencies:
apt install -y libglu1-mesa-dev libgl1-mesa-dev libosmesa6-dev xvfb ffmpeg curl patchelf libglfw3 libglfw3-dev cmake zlib1g zlib1g-dev swig
MinAtar requires manual installation; check its installation instructions.
Use `conda env create -f environment.yml` to create a conda env with all the packages required to run the experiments. This will install PyTorch and some other dependencies.
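Once the environment is active, a quick import check can confirm that the main packages are visible to Python. This is a minimal sketch: the module names `torch`, `gym` and `minatar` are assumptions based on the packages mentioned above, so adjust the list to match `environment.yml`.

```python
import importlib.util

# Module names are assumptions; adjust them to match environment.yml.
REQUIRED_MODULES = ["torch", "gym", "minatar"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks modules up and does not import them, so it stays fast even for large packages.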
The project uses liftoff to run experiments. Running a single experiment with one of the available config files is done with the command:
liftoff online.py configs/dqn_minatar.yaml
Running multiple experiments at once is also possible with `liftoff`. The `./configs` folder contains several preconfigured experiments. For example, `./configs/minatar_dqn_mega_redux` can be used to generate all the combinations of learning rates, epsilon values and normalizations. The following command simulates the generation of more than 12,000 experiments (~6000 experiments x 2 seeds):
liftoff-prepare configs/minatar_dqn_mega_redux --runs-no 2
To actually generate the config files on disk, simply append `--do` to the above command:
liftoff-prepare configs/minatar_dqn_mega_redux --runs-no 2 --do
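The experiment count quoted above (~6000 combinations x 2 seeds) is what a Cartesian product over the swept hyperparameters produces. The sketch below illustrates the counting only; the sweep values and the `expand` helper are hypothetical and are not taken from the actual config files or from liftoff.

```python
from itertools import product

# Hypothetical sweep values, chosen only to illustrate the counting.
sweep = {
    "lr": [1e-4, 3e-4, 1e-3],
    "epsilon": [0.01, 0.1],
    "normalization": ["none", "spectral"],
}


def expand(sweep, runs_no):
    """Yield one config dict per (hyperparameter combination, seed) pair."""
    keys = list(sweep)
    for values in product(*(sweep[k] for k in keys)):
        for run_id in range(runs_no):
            yield {**dict(zip(keys, values)), "run_id": run_id}


configs = list(expand(sweep, runs_no=2))
print(len(configs))  # 3 * 2 * 2 combinations x 2 seeds = 24
```

Because the count is a product, adding a single value to any one hyperparameter multiplies the total, which is how a few swept options can reach thousands of runs.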
You can now use liftoff to launch the generated experiments. For example, the following command randomly picks 8 experiments from the config files generated in the previous step and launches them. Once one of the experiments finishes, another one is launched, and so on until all of the configured experiments have been executed.
liftoff online.py ./results/TIMESTAMP_minatar_dqn_mega_redux/ --procs-no 8
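The scheduling behaviour described above (a fixed number of runs in flight, with a new one started whenever one finishes) can be pictured as a bounded worker pool. This is only an illustration under stated assumptions: `run_experiment` and `run_all` are hypothetical names, threads stand in for the launched processes, and liftoff's own implementation may differ.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_experiment(exp_id):
    """Stand-in for one experiment run; sleeps briefly instead of training."""
    time.sleep(0.01)
    return exp_id


def run_all(exp_ids, procs_no):
    """Run every experiment with at most `procs_no` in flight at any time."""
    pending = list(exp_ids)
    random.shuffle(pending)  # experiments are picked in random order
    finished = []
    with ThreadPoolExecutor(max_workers=procs_no) as pool:
        futures = [pool.submit(run_experiment, e) for e in pending]
        # As soon as a worker frees up, the pool starts the next queued run.
        for future in as_completed(futures):
            finished.append(future.result())
    return finished


done = run_all(range(20), procs_no=8)
print(len(done))  # 20
```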
Depending on the configuration file, some experiments will run on GPU. In this case the `--per-gpu` argument controls how many experiment runs can be executed on a single GPU. The command below instructs liftoff to use the first two GPUs and execute at most 4 experiment runs per GPU, for a total of 8 concurrent experiment runs.
liftoff online.py ./results/TIMESTAMP_minatar_dqn_mega_redux/ --gpus 0 1 --per-gpu 4 --procs-no 8
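The arithmetic behind the flags above can be sketched as follows. The helpers `gpu_slots` and `max_concurrent` are hypothetical, and the sketch assumes that the effective concurrency is the smaller of the GPU slot count and the `--procs-no` limit; it is not liftoff's actual allocation code.

```python
def gpu_slots(gpus, per_gpu):
    """List one slot per allowed concurrent run, labelled with its GPU index."""
    return [gpu for gpu in gpus for _ in range(per_gpu)]


def max_concurrent(gpus, per_gpu, procs_no):
    """Assumed cap: the smaller of the GPU slot count and the process limit."""
    return min(len(gpu_slots(gpus, per_gpu)), procs_no)


print(gpu_slots([0, 1], 4))          # [0, 0, 0, 0, 1, 1, 1, 1]
print(max_concurrent([0, 1], 4, 8))  # 8
```

Under this assumption, choosing `--procs-no` equal to the number of GPUs times `--per-gpu` keeps every slot busy without oversubscribing any device.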
You can use `liftoff status` or `liftoff status --all` to monitor currently running experiments or to get a summary of all past experiments.