... but the first video only shows the player character digging downwards without using any tools and eventually dying in lava. What?
The tools are admittedly really hard to see in the videos because of the timelapse and MP4 struggles a bit on the low resolution, but they are there :)
>> In Minecraft, the team used a protocol that gave Dreamer a ‘plus one’ reward every time it completed one of 12 progressive steps involved in diamond collection — including creating planks and a furnace, mining iron and forging an iron pickaxe.
And that is why it is never going to work in the real world: games have clear objectives with obvious rewards. The real world, not so much.
For a more general system, you can annotate videos with text descriptions of all the tasks that have been accomplished and when, then train a reward model on those to later RL against.