Hi everyone!
After spending hundreds of hours, we're excited to finally share our progress in developing a reinforcement learning system to beat Pokémon Red. Our system successfully completes the game using a policy under 10M parameters, PPO, and a few novel techniques. With the release of Claude Plays Pokémon, now feels like the perfect time to showcase our work.
We'd love to get feedback!