async
- a plain actor-critic implementation

dqn
- a plain Deep Q-Network (DQN)

fun
- implementation of FeUdal Networks for Hierarchical Reinforcement Learning (https://arxiv.org/abs/1703.01161)

ga3c
- implementation of GA3C: Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU (not finished)

meta_bandits
- meta-learning experiments using bandit environments (2 dependent arms, 2 independent arms, and 11 arms)

meta_mdp
- meta-learning experiments using a simple MDP environment
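
To give a rough idea of what the two-armed bandit settings in meta_bandits could look like, here is a minimal sketch. It is not the code from this repository: the class name `TwoArmedBandit`, its methods, and the reading of "dependent" as reward probabilities `(p, 1 - p)` are all assumptions made for illustration.

```python
import random


class TwoArmedBandit:
    """Hypothetical two-armed Bernoulli bandit, resampled every episode."""

    def __init__(self, dependent=True, seed=None):
        self.dependent = dependent
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Draw new arm probabilities for the next episode (task).
        p0 = self.rng.random()
        # Assumption: "dependent" ties the arms together as (p, 1 - p);
        # "independent" draws each arm's probability separately.
        p1 = 1.0 - p0 if self.dependent else self.rng.random()
        self.probs = [p0, p1]
        return self.probs

    def pull(self, arm):
        # Reward is 1 with the chosen arm's probability, else 0.
        return 1 if self.rng.random() < self.probs[arm] else 0
```

A meta-learning agent would typically call `reset()` at the start of each episode and has to work out which arm is better from the rewards returned by `pull()` alone.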