Welcome to my digital playground where I explore the world of machine learning. Join me as I delve into my favorite areas like Generative Models and Reinforcement Learning. I will share some of my notes and resources that I found helpful. I hope you find them useful as well.
A great general resource on machine learning is the book by Kevin P. Murphy, "Probabilistic Machine Learning: An Introduction". The book is available online: Probabilistic Machine Learning: An Introduction. The accompanying GitHub repository is also worth checking out. For example, it lists some practical resources: ProbML Tutorials, ProbML Python Tutorials. The code used for making the figures is also available on GitHub. The second book, "Probabilistic Machine Learning: Advanced Topics", goes into more detail on advanced topics and is also available online. There are many other great resources on machine learning. I list some others I liked and think are worth noting below.
- Computer Vision: If you are interested in computer vision and convolutional neural networks, I can recommend the lecture notes of the Stanford Computer Vision Course CS231n.
- Transformers: Andrej Karpathy made a nice and very hands-on introduction to natural language processing, starting very generally with backpropagation and building up to transformers: Neural Networks: Zero to Hero, with the accompanying YouTube playlist.
- Hugging Face 🤗: The Hugging Face website is a great resource for machine learning in general. They provide many pretrained models and tutorials on how to use them, especially if you are interested in state-of-the-art models that are expensive to train, like transformer or diffusion models. They also provide a great 🤗 natural language processing course. Their interface is very easy to use, and with a few lines of code you can read and generate text or images (see the short example after this list).
- Visualization: If you are using Python, I can recommend the Scientific Visualization Book and the Matplotlib Cheatsheet. If you are interested in the visualization of machine learning models, I can recommend the Distill website. They have great articles on machine learning and on visualizing these models and concepts.
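To give an idea of how little code is needed with 🤗, here is a minimal sketch of text generation with the transformers pipeline API (assuming the transformers library and a backend such as PyTorch are installed; "gpt2" is just an example checkpoint):

```python
# Minimal sketch: text generation with the Hugging Face transformers pipeline.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # example checkpoint
outputs = generator("Machine learning is", max_new_tokens=30, do_sample=True)
print(outputs[0]["generated_text"])
```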
Ever wondered how Diffusion and Score-Based Generative Models work? Check out my beginner-friendly guide on Score-Based and Denoising Diffusion Models, where I break it down using Jupyter books. This introduction is inspired by amazing resources such as:
- Diffusion-Denoising Models: For a very good introduction to diffusion models, you can check out the blog post from Lilian Weng: Diffusion-Denoising Models. Her blog is also a treasure trove of understandable guides on machine learning, and I highly recommend visiting Lilian Weng's Blog.
- Score-Based Generative Models: Yang Song's introduction is an excellent starting point: Score-Based Generative Models. His YouTube talk on Score-Based Generative Models is also very insightful.
- Comprehensive Denoising Diffusion-based Generative Modeling Tutorial: At CVPR 2022 there was a great tutorial on Denoising Diffusion-based Generative Modeling - Foundations and Applications.
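As a small taste of what these resources cover, below is a self-contained sketch of the DDPM forward (noising) process, x_t = sqrt(ᾱ_t) x_0 + sqrt(1 - ᾱ_t) ε with ε ~ N(0, I). The linear beta schedule uses common DDPM defaults, and the snippet is only my own illustration, not code from the linked tutorials:

```python
# Toy sketch of the DDPM forward (noising) process q(x_t | x_0).
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (common DDPM defaults)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)         # \bar{alpha}_t = prod_{s<=t} alpha_s
rng = np.random.default_rng(0)

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) for timestep t (0-indexed)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

x0 = np.ones(4)                          # toy "data point"
print(q_sample(x0, t=10))                # still close to x0
print(q_sample(x0, t=999))               # almost pure Gaussian noise
```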
Reinforcement Learning (RL) considers sequential decision making problems in a Markov decision process (MDP). An agent interacts with an environment by choosing an action a_t based on the current state s_t; the environment then returns a reward r_t and transitions to the next state s_{t+1}.
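This interaction loop is easy to see in code. Here is a minimal sketch using Gymnasium (assumed to be installed), with a random "agent" standing in for a learned policy:

```python
# Minimal agent-environment loop; the "agent" just samples random actions.
import gymnasium as gym

env = gym.make("CartPole-v1")
state, info = env.reset(seed=0)

episode_return = 0.0
done = False
while not done:
    action = env.action_space.sample()                              # choose a_t
    state, reward, terminated, truncated, info = env.step(action)   # get r_t and s_{t+1}
    episode_return += reward
    done = terminated or truncated

print(f"Episode return: {episode_return}")
env.close()
```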
Reinforcement Learning - An Introduction by Richard S. Sutton and Andrew G. Barto is a classic and a great in-depth introduction to RL and MDPs. For a more practical introduction to RL, I can recommend the following websites, which also introduce modern and commonly used RL algorithms:
- Welcome to Spinning Up in Deep RL! — Spinning Up documentation
- Welcome to the 🤗 Deep Reinforcement Learning Course - Hugging Face Deep RL Course
Explore practical implementations of RL algorithms with the following repositories:
- CleanRL (Clean Implementation of RL Algorithms) does a great job of providing clean implementations of RL algorithms, where each algorithm is implemented in a single file.
- Stable-Baselines3 implements many RL algorithms in PyTorch that are proven to work well, and thereby provides a great framework for stable RL implementations (see the short training example after this list).
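As an illustration of how such a library is typically used, here is a minimal sketch of training PPO with Stable-Baselines3 on CartPole (assuming stable-baselines3 and gymnasium are installed; the environment and hyperparameters are just example choices):

```python
# Minimal Stable-Baselines3 training sketch: PPO on CartPole.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=50_000)

# Roll out the trained policy for one episode.
obs, info = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```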
Blog post on debugging RL: Debugging RL can be hard (and sometimes even frustrating). I found the following blog post very helpful (and actually also very entertaining): Deep Reinforcement Learning Doesn't Work Yet. Note that the article is from 2018, but I find many of its points are still valid today, and it is a great read. Another great article on debugging RL is: Debugging Reinforcement Learning Systems.
Lecture on RL: If you want an even more in-depth introduction to RL, I can recommend the following lecture: CS 285 at UC Berkeley.
Model-based reinforcement learning (MBRL) also considers sequential decision making problems in a Markov decision process. In addition to states, actions, and rewards, the agent has access to (or learns) a model of the environment's dynamics, which predicts the next state given the current state and action.
If the model is learnable, the agent can collect more data to improve the model. How the agent uses the model to decide which action to take depends on its approach to planning. The agent can also learn a policy to predict actions (similar to model-free RL approaches) to improve its planning based on experience. Similarly, other estimates like inverse dynamics models (mapping from state transitions to actions) or reward models (predicting rewards) can be useful in this framework. One example of planning is model-predictive control (MPC), where the method optimizes the expected reward by searching for the best actions; the candidate actions can be sampled, for example, from a uniform distribution (see the sketch below).
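To make the MPC idea above concrete, here is a toy sketch of random-shooting MPC. The dynamics_model and reward_model callables are hypothetical placeholders for learned models, and the function is only an illustration of the planning loop, not a reference implementation:

```python
# Toy random-shooting MPC: sample action sequences, evaluate them with a
# (learned) dynamics and reward model, and execute the first action of the best one.
import numpy as np

def plan_mpc(state, dynamics_model, reward_model, horizon=10, n_samples=256,
             action_low=-1.0, action_high=1.0, action_dim=1, rng=None):
    """Return the first action of the best uniformly sampled action sequence."""
    rng = np.random.default_rng() if rng is None else rng
    # Candidate action sequences of shape (n_samples, horizon, action_dim).
    actions = rng.uniform(action_low, action_high, size=(n_samples, horizon, action_dim))
    returns = np.zeros(n_samples)

    for i in range(n_samples):
        s = state
        for t in range(horizon):
            returns[i] += reward_model(s, actions[i, t])   # predicted reward
            s = dynamics_model(s, actions[i, t])           # predicted next state

    best = np.argmax(returns)       # sequence with the best return under the model
    return actions[best, 0]         # MPC: execute only the first action, then replan
```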
You can read more on model-based RL in a blog post on Debugging Deep Model-based Reinforcement Learning Systems and in a recent survey, Model-based Reinforcement Learning: A Survey.
Imitation learning (IL) describes methods that learn optimal behavior, represented by a collection of expert demonstrations. In IL, the agent also interacts with an environment in a sequential decision making process, so methods from RL can help to effectively solve IL problems. However, in contrast to the RL setting, the agent does not receive a reward from the environment. Instead, IL assumes that the experience comes from an expert policy (which is assumed to behave optimally with respect to the task).
Therefore, IL can alleviate the problem of designing effective reward functions. This is particularly useful for tasks where demonstrations are more accessible than designing a reward function. One example is to train traffic agents in a simulation to mimic real-world road users. In this case, it is easier to collect demonstrations of real-world road users than to design a reward function that captures all aspects of the task. A great overview of Imitation Learning is given in "An Algorithmic Perspective on Imitation Learning". A recent Imitation Learning method is IQ-Learn, which is based on soft Q-Learning and learns a Q-function using the demonstration data. The authors showed that there is a one-to-one mapping between the learned Q-function and the underlying reward function, and they can successfully estimate a reward based on the learned Q-function. We developed a method which does not require actions to be available in the expert data and can be used with state-only demonstrations. The method is called Imitation Learning by State-Only Distribution Matching and uses expert and environment models to capture the reward function. Another great resource is the repository of OPOLO: Off-policy Learning from Observations, which implements many IL methods.
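As a minimal illustration of learning from demonstrations without a reward signal, here is a behavioral cloning sketch in PyTorch (the simplest IL baseline, not the IQ-Learn or state-only method mentioned above); the expert data here is random placeholder data:

```python
# Behavioral cloning sketch: fit a policy to expert state-action pairs via supervised learning.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 2

# Placeholder demonstrations; in practice these come from an expert policy.
expert_states = torch.randn(1000, state_dim)
expert_actions = torch.randint(0, action_dim, (1000,))

policy = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):
    logits = policy(expert_states)            # predict action logits from states
    loss = loss_fn(logits, expert_actions)    # match the expert's actions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```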
Similar to IL, Offline Reinforcement Learning (Offline RL) is a type of RL in which the agent learns from a dataset of previously collected experiences. However, in Offline RL the agent learns without interacting with the environment during training. This makes it different from online RL, in which the agent learns while interacting with the environment. In IL both approaches are possible, although methods that rely on online interaction generally perform better. Both Offline RL and IL rely on a dataset of experiences to learn from. They are, however, not the same: while imitation learning assumes the demonstrations come from an expert and tries to imitate them directly, Offline RL uses the reward signal stored in the dataset and can, in principle, learn a better policy than the one that generated the data.
For example, imitation learning may be better suited for training traffic agents to behave like real-world road users. The reason is that every recorded road user is, by definition, an "expert on human driving". However, if the task is to learn to drive a car as safely as possible, then offline RL is probably better suited, because the dataset of human driving behavior is not necessarily optimal for driving as safely as possible. There is a good article discussing this topic: Should I Use Offline RL or Imitation Learning?. Using Offline RL can help to learn a policy for a task before it is deployed in the real world, where learning a policy online might be dangerous. A good overview of Offline RL is given in "Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems". Sergey Levine and Aviral Kumar gave a great tutorial at NeurIPS 2020 on Offline RL.
I held the exercises for the course "Numerical Methods for Engineers" at the University of Augsburg, Germany. I've made a Jupyter book out of the exercises, and the exercises themselves are available at: Numerical Methods for Engineers as Jupyter notebooks (partly written in German) with Julia code. Many of my exercises are based on the great book Fundamentals of Numerical Computation by Toby A. Driscoll and Richard J. Braun.
I am a PhD student at the University of Augsburg, Germany. I am part of the Chair of Mechatronics at the Faculty of Applied Computer Science. My research interests lie in the field of machine learning, specifically in the areas of generative models and reinforcement learning.
If you have any questions or comments, feel free to reach out to me via mail: damian.boborzi@uni-a.de.