This looks really interesting! I tried exploring deep RL myself some time ago but could never get my agents to make any meaningful progress, and as someone with very little stats/ML background, I found it difficult to debug what was going wrong. Will try following this and see what happens!