Python Implementation of the Fourier Basis

This software contains:

A Python implementation of the Fourier Basis, in fourier_fa.py.
An example Sarsa(λ) with linear function approximation implementation, in sarsa_lambda_linear.py.
Two example RLGym agents that use Sarsa(λ) with the Fourier basis, in mc_sarsa_lambda.py (Mountain Car) and acro_sarsa_lambda.py (Acrobot).

The first two modules require NumPy, and the two example agents require both Matplotlib and RLGym (tested on RLGym 0.18.0).

Author: George Konidaris, Brown University

License: GPL 3

Overview

The Fourier basis is a linear value function approximation scheme for reinforcement learning tasks with continuous state spaces, where the value function V(s) is approximated as a weighted sum of Fourier basis functions φ:

V(s) = Σ_i w_i φ_i(s).

A similar scheme is used to approximate the Q-function Q(s, a).

The above equation is linear in the weights w_i, which are set via learning. The Fourier basis uses Fourier terms for the basis functions φ_i:

φ_i(s) = cos(π c_i . s),

where each c_i is a coefficient vector (of the same size as the state vector s), each element of which is an integer between 0 and some fixed upper bound, called the order. The basis functions are constructed by generating all such coefficient vectors. For state dimension d and order n, there are (n+1)^d such coefficient vectors, which makes the Fourier basis (like all linear bases) suitable only for small problems (typically d <= 7).

Note that s must be scaled so that each of its elements lies in [0, 1].

Paper

This code implements the Fourier basis as originally described in:

G.D. Konidaris, S. Osentoski, and P.S. Thomas. Value Function Approximation in Reinforcement Learning using the Fourier Basis. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, pages 380-385, August 2011.

If you use the Fourier basis in a research article, please cite the original paper:

@inproceedings{FourierAAAI2011,
    author = "G.D. Konidaris and S. Osentoski and P.S. Thomas",
    title = "Value Function Approximation in Reinforcement Learning 
             using the {F}ourier Basis",
    booktitle = "Proceedings of the Twenty-Fifth Conference on  
                 Artificial Intelligence",
    year = 2011,
    pages = "380-385"
}

FourierBasis class

The FourierBasis class provides methods for constructing and using a Fourier basis.

Construction

To construct a Fourier basis, you can supply an order and a dimension. For example:

fb = FourierBasis(order=3, d=2)

... creates a basis of order 3, for a 2-dimensional state space. Therefore:

print(fb.get_coefficients())

outputs:

[[0 0]
 [0 1]
 [0 2]
 [0 3]
 [1 0]
 [1 1]
 [1 2]
 [1 3]
 [2 0]
 [2 1]
 [2 2]
 [2 3]
 [3 0]
 [3 1]
 [3 2]
 [3 3]]

You can also ask for a specific coefficient:

print(fb.get_coefficient(3))

returns

[0 3]

Alternatively, for finer grained control over the basis you can supply a list of orders, and the dimension will be set to the length of the list:

fb = fourier_fa.FourierBasis(order=[3, 1])

... creates a Fourier basis up to order 3 on the first state variable, and up to order 1 on the second state variable. Now

print(fb.get_coefficients())

returns

[[0 0]
 [0 1]  
 [1 0]
 [1 1]
 [2 0]
 [2 1]
 [3 0]
 [3 1]]

Additionally, in either case you can use the independent boolean parameter (which defaults to False) to construct a basis with independent Fourier coefficients (i.e., where only one coefficient is non-zero at a time):

fb = FourierBasis(order=3, d=2, independent=True)
print(fb.coefficients)

... outputs:

[[0. 0.]
 [1. 0.]
 [2. 0.]
 [3. 0.]
 [0. 1.]
 [0. 2.]
 [0. 3.]]

An independent basis is much smaller that a dependent basis, but is a much more restricted function approximation class.

Evaluation

A Fourier basis instance can be evaluated by passing a state into the 'evaluate' function as follows:

fb = FourierBasis(order=1, d=2)
fb.evaluate([0.2, 0.7])

... returns a numpy array with a real value for each basis function (i.e., coefficient). In this case, the array contains 4 values:

array([ 1.        , -0.58778525,  0.80901699, -0.95105652])

These outputs are intended to serve as the feature values for a linear function approximator (i.e., multiplied by weights and then summed).

Gradient Factors

When using the Fourier basis in a gradient descent framework, the original paper suggests scaling the gradients by one over the Euclidean norm of the coefficient vector (setting the all-zero coefficient factor to 1). These factors can be accessed using get_gradient_factors and get_gradient_factor.

SarsaLambdaLinearFA class and Examples

The SarsaLambdaLinearFA class is a simple linear online Sarsa(λ) implementation; example domain implementations for Acrobot (acro_sarsa_lambda.py) and Mountain Car (mc_sarsa_lambda.py) for reference.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
LICENSE		LICENSE
README.md		README.md
acro_sarsa_lambda.py		acro_sarsa_lambda.py
fourier_fa.py		fourier_fa.py
gpl-3.0.txt		gpl-3.0.txt
mc_sarsa_lambda.py		mc_sarsa_lambda.py
requirements.txt		requirements.txt
sarsa_lambda_linear.py		sarsa_lambda_linear.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Python Implementation of the Fourier Basis

Overview

Paper

FourierBasis class

Construction

Evaluation

Gradient Factors

SarsaLambdaLinearFA class and Examples

About

Releases

Packages

Languages

License

brownirl/fourier_vfa

Folders and files

Latest commit

History

Repository files navigation

Python Implementation of the Fourier Basis

Overview

Paper

FourierBasis class

Construction

Evaluation

Gradient Factors

SarsaLambdaLinearFA class and Examples

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages