Constrained Policy Optimization with JAX

Constrained Policy Optimization (CPO) is a safe reinforcement learning algorithm that enforces safety by solving constrained Markov decision processes. Our implementation is a port of the original OpenAI implementation to JAX.

Install

First, make sure you have Python 3.10.12 installed.

Using Poetry

poetry install

Check out the optional installation groups in pyproject.toml for additional functionality.
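For instance, a group would be installed like this (the group name dev here is an assumption; check pyproject.toml for the actual group names):

poetry install --with dev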

Without Poetry

You have two options: clone the repository (for example, for local development and hacking), or install it directly from GitHub.

  1. Clone: git clone https://github.com/lasgroup/jax-cpo.git, then cd jax-cpo and pip install -e .; or
  2. pip install git+https://git@github.com/lasgroup/jax-cpo

Usage

Via Trainer class

This is the easiest entry point for running experiments; see the usage example in the repository.

With your own training loop

If you just want to use our implementation with a different training/evaluation setup, you can use the CPO class directly. The only required interface is the __call__(observation: np.ndarray, train: bool) -> np.ndarray function, which does the following (see the sketch after this list):

  • Observes the state (provided by the environment) and puts it in an episodic buffer for the next policy update.
  • At each time step, uses the current policy to return an action.
  • Whenever the train flag is true and the buffer is full, triggers a policy update.
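
As a minimal sketch of such a custom loop (the import path, environment choice, and CPO constructor arguments are assumptions; only the __call__ interface above is documented):

import gymnasium as gym  # any environment with the standard reset/step API works
import numpy as np
from jax_cpo import CPO  # import path assumed from the package name

env = gym.make("Pendulum-v1")  # hypothetical environment choice
agent = CPO(...)  # constructor arguments depend on configs.yaml

for episode in range(10):
    observation, _ = env.reset()
    done = False
    while not done:
        # The agent stores the observation in its episodic buffer and
        # returns an action from the current policy; with train=True it
        # triggers a policy update internally once the buffer is full.
        action = agent(np.asarray(observation), train=True)
        observation, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated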

Consult configs.yaml for hyper-parameters.
