This is a React-based simulator for the k-armed bandit problem, designed to test and visualize reinforcement learning strategies with an interactive UI styled using Tailwind CSS.
- React - A JavaScript library for building user interfaces.
- Tailwind CSS - A utility-first CSS framework for rapid UI development.
First, make sure Node.js is installed. It ships with npm, which is used to install the project's dependencies.
Clone the project and install dependencies:
git clone https://github.com/mweglowski/bandit_demonstration.git
cd bandit_demonstration
npm install
To run the application in development mode:
npm start
This will open the simulator in your default web browser. For production builds, you can use:
npm run build
- Multiple bandits with unique probabilistic reward distributions.
- Interactive interface for 'pulling' bandit arms, built with React.
- Responsive and modern UI using Tailwind CSS.
- Visualization of action counts and estimated values.
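To make the idea of arms with their own reward distributions concrete, here is a minimal sketch. It is not taken from the simulator's source: the helper name `createArm` is hypothetical, and the Bernoulli reward (1 with probability `p`, otherwise 0) is an assumption, since the distributions the simulator actually uses are not described here.

```javascript
// Hypothetical helper: an arm that pays 1 with probability `p`, otherwise 0.
// The Bernoulli reward is an assumption; the simulator's real distributions may differ.
// `rng` is injectable so the arm can be checked deterministically.
function createArm(p, rng = Math.random) {
  return {
    pull: () => (rng() < p ? 1 : 0),
  };
}

// A 3-armed bandit whose arms have different success probabilities.
const arms = [0.2, 0.5, 0.8].map((p) => createArm(p));

// Pull each arm 1000 times and report the observed average reward per arm.
const averages = arms.map((arm) => {
  let total = 0;
  for (let i = 0; i < 1000; i++) total += arm.pull();
  return total / 1000;
});
console.log(averages);
```

The printed averages vary from run to run, but they should land near each arm's success probability.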
The simulator focuses on the ε-greedy strategy, which balances exploration and exploitation: with probability 1−ε it selects the action with the highest estimated value, and with probability ε it explores by selecting an action at random.
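The selection rule can be sketched in a few lines of JavaScript. This is an illustration under assumptions rather than the simulator's actual code: the function name `selectAction`, its parameters, and the choice to break ties by taking the first best arm are all hypothetical.

```javascript
// Hypothetical sketch of epsilon-greedy selection over an array of estimates.
// `rng` is injectable so the behaviour can be checked deterministically.
function selectAction(estimates, epsilon, rng = Math.random) {
  if (rng() < epsilon) {
    // Explore: pick any arm uniformly at random (may coincide with the best one).
    return Math.floor(rng() * estimates.length);
  }
  // Exploit: pick the arm with the highest current estimate (first one on ties).
  let best = 0;
  for (let a = 1; a < estimates.length; a++) {
    if (estimates[a] > estimates[best]) best = a;
  }
  return best;
}

const estimates = [0.1, 0.7, 0.4];
console.log(selectAction(estimates, 0)); // epsilon = 0 always exploits, so this prints 1
```

Setting ε to 0 gives a purely greedy agent, while ε set to 1 picks every action at random; values in between trade off the two.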
The simulator updates the estimated action value Q using the formula:
Q(n+1) = Q(n) + (1/n) * (Rn - Q(n))
Where:
- Q(n+1) is the new estimate,
- Q(n) is the current estimate,
- Rn is the reward received,
- n is the number of times the action has been chosen.
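As a small worked example of the update rule, the sketch below applies it to three rewards for a single action. The function name `updateEstimate`, the starting estimate of 0, and the reward sequence are assumptions chosen for illustration, not values from the simulator.

```javascript
// Hypothetical helper applying Q(n+1) = Q(n) + (1/n) * (Rn - Q(n)).
// `n` counts how often the action has been chosen, including the current pull.
function updateEstimate(q, reward, n) {
  return q + (reward - q) / n;
}

// Feed three rewards for one arm, starting from an initial estimate of 0.
const rewards = [1, 0, 1];
let q = 0;
rewards.forEach((reward, i) => {
  q = updateEstimate(q, reward, i + 1);
});
console.log(q); // the sample mean of the rewards, 2/3, up to floating-point rounding
```

With a step size of 1/n the estimate always equals the average of the rewards seen so far, so the reward history never has to be stored: the estimate moves 0 → 1 → 0.5 → 2/3 here.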
After launching the simulator, interact with the UI by selecting a bandit to 'pull'. Observe the algorithm's performance and how estimated values update based on the reward distributions.
Explore the simulator online at https://bandit-problem-simulator.vercel.app/.
I welcome contributions! If you have suggestions or are interested in improving the k-armed bandit simulator, please feel free to fork the repository, make changes, and submit a pull request.
Inspired by the foundational reinforcement learning work of Sutton and Barto.