For the Reinforcement Learning course at Edinburgh, I applied different RL techniques to a robot football scenario, specifically the Half Field Offense (HFO) problem. The algorithms covered included:
- SARSA, Q-Learning (Single-agent, discrete state space)
- Deep RL with Asynchronous Q-Learning (Single-agent, continuous state space)
- Independent Q-Learning, Joint Action Learning, WoLF-PHC (Multi-agent, discrete state space)
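
To give a flavour of the first item in the list, here is a minimal sketch of how the tabular SARSA and Q-Learning updates differ. This is not the course code: the state and action labels, the `alpha` and `gamma` defaults, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical discrete action labels, chosen only for illustration.
ACTIONS = ["dribble", "shoot"]


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the greedy action in the next state."""
    target = r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])


# Two separate Q-tables that start from the same values.
q_sarsa = defaultdict(float)
q_qlearn = defaultdict(float)
q_sarsa[("s1", "shoot")] = 1.0
q_qlearn[("s1", "shoot")] = 1.0

# Same transition for both; the agent's next action is "dribble".
sarsa_update(q_sarsa, "s0", "dribble", 0.0, "s1", "dribble")
q_learning_update(q_qlearn, "s0", "dribble", 0.0, "s1", ACTIONS)
```

On this single transition, SARSA bootstraps from the value of the action that was actually chosen next, while Q-Learning bootstraps from the best available next action, so the two tables end up with different estimates for the same state-action pair.
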
It was a really rewarding (ha!) course, and I hope to be able to revisit RL in some NLP setting at some point. Happy to share code upon request.