These tasks include sliding a disc across a table until it hits a given target and manipulating a pen until it achieves a desired position and rotation.
At first, the simulated robots are unsuccessful at completing their given tasks, but as the new algorithm kicks in, they begin to train themselves to become more effective by reframing each failure as a success.
Advertisement
“The key insight that HER formalizes is what humans do intuitively,” the researchers wrote in the blog. “Even though we have not succeeded at a specific goal, we have at least achieved a different one. So why not just pretend that we wanted to achieve this goal to begin with, instead of the one that we set out to achieve originally?”
Join the conversation as a VIP Member