
posted by martyb on Tuesday December 11 2018, @03:02AM   Printer-friendly
from the how-about-a-nice-game-of-global-thermonuclear-war? dept.

Move over AlphaGo: AlphaZero taught itself to play three different games

Google's DeepMind—the group that brought you the champion game-playing AIs AlphaGo and AlphaGo Zero—is back with a new, improved, and more generalized version. Dubbed AlphaZero, this program taught itself to play three different board games (chess, Go, and shogi, a Japanese form of chess) in just three days, with no human intervention.

A paper describing the achievement was just published in Science. "Starting from totally random play, AlphaZero gradually learns what good play looks like and forms its own evaluations about the game," said Demis Hassabis, CEO and co-founder of DeepMind. "In that sense, it is free from the constraints of the way humans think about the game."

[...] As [chess grandmaster Garry] Kasparov points out in an accompanying editorial in Science, these days your average smartphone chess-playing app is far more powerful than Deep Blue, the IBM machine that defeated Kasparov in 1997. So AI researchers turned their attention in recent years to creating programs that can master the game of Go, a hugely popular board game in East Asia that dates back more than 2,500 years. It's a surprisingly complicated game, much more difficult than chess, despite involving only two players and a fairly simple set of ground rules. That makes it an ideal testing ground for AI.

AlphaZero is a direct descendant of DeepMind's AlphaGo, which made headlines worldwide in 2016 by defeating Lee Sedol, the reigning (human) world champion in Go. Not content to rest on its laurels, AlphaGo got a major upgrade last year, becoming capable of teaching itself winning strategies with no need for human intervention. By playing itself over and over again, AlphaGo Zero (AGZ) trained itself to play Go from scratch in just three days and soundly defeated the original AlphaGo 100 games to 0. The only input it received was the basic rules of the game.
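To make "playing itself over and over" concrete, here is a minimal self-play sketch in Python. It is an illustration of the general idea only, not DeepMind's method: a toy game (Nim, where players take 1-3 stones and whoever takes the last stone wins) and a simple value table stand in for Go and the deep neural network. Starting from essentially random play, the program learns position values purely from the outcomes of games against itself, given nothing but the rules.

```python
# A toy illustration of learning from self-play alone, given only the rules.
# Nim and the tabular value function are hypothetical stand-ins, not
# DeepMind's method, which uses deep networks and Monte Carlo tree search.
import random

def legal_moves(stones):
    return [n for n in (1, 2, 3) if n <= stones]

def choose_move(stones, V, explore=0.1):
    """Mostly greedy on learned values; a little randomness keeps exploring."""
    moves = legal_moves(stones)
    if random.random() < explore:
        return random.choice(moves)
    # V[s] estimates the value of position s for the player about to move there,
    # so we hand the opponent the position that is worst for them.
    return min(moves, key=lambda m: V.get(stones - m, 0.0))

def self_play_game(V, start=10):
    """One game of the program against itself; returns visited positions and the winner."""
    stones, player, visited = start, 1, []
    while stones > 0:
        visited.append((stones, player))
        stones -= choose_move(stones, V)
        player = -player
    return visited, -player          # the player who took the last stone won

def train(games=20000, lr=0.05):
    V = {0: -1.0}                    # no stones left: the player to move has already lost
    for _ in range(games):
        visited, winner = self_play_game(V)
        for stones, player in visited:
            target = 1.0 if player == winner else -1.0
            old = V.get(stones, 0.0)
            V[stones] = old + lr * (target - old)   # nudge the estimate toward the outcome
    return V

if __name__ == "__main__":
    V = train()
    # After enough self-play, positions that are multiples of 4 (the known losing
    # positions in this Nim variant) should trend clearly negative.
    print({s: round(v, 2) for s, v in sorted(V.items()) if s > 0})
```

In AGZ the value table is replaced by a deep network and the greedy move choice by a full tree search, but the self-improvement loop has the same basic shape.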

[...] AGZ was designed specifically to play Go. AlphaZero generalizes this reinforcement-learning approach to three different games: Go, chess, and shogi, a Japanese version of chess. According to an accompanying perspective penned by Deep Blue team member Murray Campbell, this latest version combines deep reinforcement learning (many layers of neural networks) with a general-purpose Monte Carlo tree search method.
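Campbell's summary (deep reinforcement learning combined with a general-purpose Monte Carlo tree search) can also be sketched generically. In the snippet below, evaluate() is a hypothetical placeholder for the policy/value neural network; it returns uniform move priors and a neutral value, and the same toy Nim game stands in for Go, chess, or shogi. Only the overall search structure, with PUCT-style selection, expansion, and value backup, mirrors the broad shape of the published approach.

```python
# A generic Monte Carlo tree search guided by a (stand-in) policy/value
# evaluation.  evaluate() is a hypothetical placeholder for the neural
# network; Nim is a placeholder game so the example runs end to end.
import math

class Nim:
    """Toy game: players alternately take 1-3 stones; taking the last stone wins."""
    def __init__(self, stones=10, player=1):
        self.stones, self.player = stones, player

    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]

    def play(self, n):
        return Nim(self.stones - n, -self.player)

    def terminal_value(self):
        # No stones left: the previous player took the last one, so the player
        # to move here has lost.  Value is from the mover's point of view.
        return -1.0 if self.stones == 0 else None

def evaluate(state):
    """Stand-in for the network: uniform priors over legal moves, neutral value."""
    moves = state.legal_moves()
    return {m: 1.0 / len(moves) for m in moves}, 0.0

class Node:
    def __init__(self, state, prior):
        self.state, self.prior = state, prior
        self.children = {}                     # move -> child Node
        self.visits, self.value_sum = 0, 0.0

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """PUCT selection: trade off the prior against the values seen so far."""
    def score(item):
        move, child = item
        exploration = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return -child.value() + exploration    # child's value is from the opponent's view
    return max(node.children.items(), key=score)

def search(root_state, simulations=400):
    root = Node(root_state, prior=1.0)
    for _ in range(simulations):
        node, path = root, [root]
        # 1. Selection: descend until reaching an unexpanded or terminal node.
        while node.children:
            _, node = select_child(node)
            path.append(node)
        # 2. Expansion/evaluation: query the (stand-in) network at the leaf.
        value = node.state.terminal_value()
        if value is None:
            priors, value = evaluate(node.state)
            for move, p in priors.items():
                node.children[move] = Node(node.state.play(move), p)
        # 3. Backup: propagate the value, flipping sign at every ply.
        for n in reversed(path):
            n.visits += 1
            n.value_sum += value
            value = -value
    # As in AlphaZero-style play, choose the most-visited move at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

if __name__ == "__main__":
    print("Move chosen from a 10-stone pile:", search(Nim(10)))
```

During training, the visit counts from searches like this would serve as improved policy targets for the network; here the uniform evaluate() means the search relies entirely on whatever terminal outcomes it happens to reach.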

"AlphaZero learned to play each of the three board games very quickly by applying a large amount of processing power, 5,000 tensor processing units (TPUs), equivalent to a very large supercomputer," Campbell wrote.

[...] DOI: 10.1126/science.aar6404 (Science, 2018).


Original Submission

 