Artificial Intelligence (AI) has been blazing a trail in recent years, beating skilled human players in strategy board games like chess and Go. It has also excelled at playing video games and navigating digital three-dimensional mazes. But AI's motor and spatial skills have seen limited success, except in a few scenarios such as drone racing.
A team of researchers from ETH Zurich in Switzerland has now demonstrated that AI's learning capabilities can surpass humans' in the physical realm too.
The research team, led by Professor Raffaello D'Andrea and PhD candidate Thomas Bi, has built an AI robot that plays the popular labyrinth marble game, making informed decisions about promising behaviour by planning into the future.
The labyrinth game is a marble game consisting of a maze with walls and holes. The goal is to transport a metal ball from start to finish by tilting the playing field using two knobs, which let the player keep the ball from falling into any of the holes.
While the game is straightforward, it demands fine motor and spatial skills to steer the ball from start to finish.
The researchers' approach to building this AI system is similar to how humans learn: through experience. While playing the game, the AI robot, called CyberRunner, observes its own moves and receives rewards based on its performance. This happens with the help of computer vision: a camera looks down at the labyrinth board and tracks the moves.
The system builds a memory of all the collected experience. Using this memory, a model-based reinforcement learning algorithm learns how the physical system behaves and, based on that understanding, recognizes which strategies and behaviours work best to complete the task.
This continuous learning helps the robot use the two knobs to improve its chances of success. In other words, it does not stop playing to learn: the algorithm runs concurrently with the robot while the game is being played, so the AI system keeps getting better after every run.
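The observe-remember-learn-plan cycle described above can be sketched in miniature. This toy is not the CyberRunner code: all names are invented, and a one-dimensional ball with a single knob stands in for the real labyrinth dynamics and planner. It only illustrates the idea of a model-based agent that learns a dynamics model from its own experience while it keeps playing.

```python
import random

random.seed(0)  # make the toy run reproducible

class ToyModel:
    """Learns a linear guess of how a knob action shifts the ball position."""
    def __init__(self):
        self.gain = 0.0   # estimated position change per unit of knob tilt
        self.n = 0

    def update(self, action, delta):
        # Running average of observed (delta / action) ratios.
        if action != 0:
            self.n += 1
            self.gain += (delta / action - self.gain) / self.n

    def predict(self, pos, action):
        return pos + self.gain * action

def plan(model, pos, goal, candidates):
    # Pick the candidate action the learned model predicts will move the
    # ball closest to the goal (a crude stand-in for trajectory planning).
    return min(candidates, key=lambda a: abs(model.predict(pos, a) - goal))

def true_step(pos, action):
    # Hidden "real" dynamics the model must discover: a gain of 0.5.
    return pos + 0.5 * action

model, pos, goal = ToyModel(), 0.0, 10.0
memory = []  # replay buffer of collected experience
for step in range(200):
    action = plan(model, pos, goal, [-1.0, -0.5, 0.5, 1.0])
    if random.random() < 0.2:            # occasional exploration
        action = random.choice([-1.0, -0.5, 0.5, 1.0])
    new_pos = true_step(pos, action)
    memory.append((pos, action, new_pos))
    model.update(action, new_pos - pos)  # learn while still playing
    pos = new_pos
```

After a handful of steps the model's gain estimate matches the hidden dynamics, and the planner steers the ball to hover near the goal; the replay buffer plays the role of the "memory of collected experiences" in the article.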
Learning on the real-world labyrinth took 6.06 hours, comprising 1.2 million time steps at a control rate of 55 samples per second. According to the researchers, the AI robot beat the previously fastest recorded time, achieved by an extremely skilled human player, by over 6%.
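As a quick back-of-the-envelope check (not from the paper itself), the reported figures are internally consistent: 1.2 million steps at 55 decisions per second works out to about 6.06 hours.

```python
# Sanity-check the reported training figures.
steps = 1_200_000                # reported number of time steps
rate_hz = 55                     # control decisions per second
hours = steps / rate_hz / 3600   # seconds -> hours
print(round(hours, 2))           # -> 6.06
```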
The researchers pointed out that CyberRunner naturally discovered shortcuts while playing: it found ways to skip certain parts of the maze, and the team had to step in and explicitly instruct it not to take any shortcuts.