This literature review delves into the world of multi-armed bandit problems, exploring their applications and solutions in sequential decision-making scenarios

shashankatthaluri/Multi-arm-bandit-model-project

🎰 Multi-Armed Bandit Models 🎰

A literature review exploring the world of multi-armed bandit problems! 🤓

🤔 What is a Multi-Armed Bandit?

A bandit is a simple slot machine: you insert a coin, pull a lever, and get an immediate reward. A multi-armed bandit generalizes this to several levers (arms), each with its own unknown reward distribution. The multi-armed bandit model seeks to balance exploration (gathering information about the arms) and exploitation (playing the arm that currently looks best) in order to solve sequential decision-making problems. It has applications in recommendation systems, clinical trials 👨‍⚕️, and more!
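
To make the exploration/exploitation trade-off concrete, here is a minimal sketch of an epsilon-greedy player on a simulated Bernoulli bandit. Epsilon-greedy is used here only because it is the simplest illustration; it is not claimed to be one of the methods covered in the review. The function name `epsilon_greedy`, the arm means, and the parameters `epsilon`, `n_rounds`, and `seed` are all assumptions made for this example.

```python
import random


def epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Play a simulated Bernoulli bandit with an epsilon-greedy rule.

    true_means: hypothetical success probability of each arm (unknown to the player).
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how many times each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward of each arm
    total_reward = 0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated pull: reward 1 with probability true_means[arm], else 0.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Incremental update of the running mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, total_reward


if __name__ == "__main__":
    counts, total = epsilon_greedy([0.1, 0.9])
    print("pulls per arm:", counts, "total reward:", total)
```

With a small `epsilon`, most pulls go to the arm that looks best while a fraction of rounds is still spent checking the others.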

📚 What This Project Covers

This literature review examines seminal papers on multi-armed bandits that advanced the field, including:

  • The stochastic multi-armed bandit problem and Gittins indices
  • Refined lower bounds in both the fixed-confidence and fixed-budget settings, along with matching algorithms for Gaussian and Bernoulli bandit models
  • Upper confidence bound 📈 algorithms
  • Best arm identification problems
  • Contextual/linear 🔢 bandits
  • Thompson sampling 🎯
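
Two of the topics above, upper confidence bound algorithms and Thompson sampling, can be sketched in a few lines each. What follows is a rough illustration on a simulated Bernoulli bandit, not the exact algorithms analyzed in the reviewed papers: it assumes the standard UCB1 index (empirical mean plus `sqrt(2 ln t / n)`) and Thompson sampling with a uniform Beta(1, 1) prior. The function names, arm means, and parameters are hypothetical.

```python
import math
import random


def pull(rng, mean):
    """Simulated Bernoulli reward: 1 with probability `mean`, else 0."""
    return 1 if rng.random() < mean else 0


def ucb1(true_means, n_rounds=5000, seed=0):
    """UCB1 sketch: play the arm with the highest optimistic estimate."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, n_rounds + 1):
        if t <= k:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        counts[arm] += 1
        sums[arm] += pull(rng, true_means[arm])
    return counts


def thompson_sampling(true_means, n_rounds=5000, seed=0):
    """Thompson sampling sketch with a Beta(1, 1) prior on each arm."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta posterior and play the best draw.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(k)
        ]
        arm = max(range(k), key=lambda a: samples[a])
        if pull(rng, true_means[arm]):
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [successes[a] + failures[a] for a in range(k)]


if __name__ == "__main__":
    means = [0.2, 0.5, 0.8]  # hypothetical arm means
    print("UCB1 pulls per arm:    ", ucb1(means))
    print("Thompson pulls per arm:", thompson_sampling(means))
```

Both functions return the number of pulls per arm, so the two strategies can be compared by how quickly they concentrate on the best arm.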

📖 Key Papers Reviewed

Papers reviewed and summarized include work by:

  • Kaufmann et al.
  • Victor Gabillon
  • Shivaram Kalyanakrishnan
  • Jean-Yves

📝 How to Use This Repo

  • Read the literature_review.pdf file for full summaries
  • Check the References.bib file for full citations
  • Let me know if you have any other bandit questions! 🙋

This is my first literature review project. I carried it out under the supervision of 👨‍💼 Prof. Manjesh K. Hanawal from IIT Bombay, India.
