SoylentNews Comments | AlphaGo Zero Makes AlphaGo Obsolete

posted by martyb on Thursday October 19 2017, @02:39PM

from the Zeroing-in-on-AI dept.

Google DeepMind researchers have made their old AlphaGo program obsolete:

The old AlphaGo relied on a computationally intensive Monte Carlo tree search to play through Go scenarios. The nodes and branches created a much larger tree than AlphaGo practically needed to play. A combination of reinforcement learning and human-supervised learning was used to build "value" and "policy" neural networks that used the search tree to execute gameplay strategies. The software learned from 30 million moves played in human-on-human games, and benefited from various bodges and tricks to learn to win. For instance, it was trained from master-level human players, rather than picking it up from scratch.
AlphaGo Zero did start from scratch with no experts guiding it. And it is much more efficient: it only uses a single computer and four of Google's custom TPU1 chips to play matches, compared to AlphaGo's several machines and 48 TPUs. Since Zero didn't rely on human gameplay, and a smaller number of matches, its Monte Carlo tree search is smaller. The self-play algorithm also combined both the value and policy neural networks into one, and was trained on 64 GPUs and 19 CPUs over a few days by playing nearly five million games against itself. In comparison, AlphaGo needed months of training and used 1,920 CPUs and 280 GPUs to beat Lee Sedol.
Through self-play, AlphaGo Zero even discovered for itself, without human intervention, classic moves in the theory of Go, such as fuseki opening tactics, and what's called life and death. More details can be found in Nature, or from the paper directly here. Stanford computer science academic Bharath Ramsundar has a summary of the more technical points, here.
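
For readers wondering what "combined both the value and policy neural networks into one" might look like, here is a rough sketch of a single network with a shared trunk and two output heads. Everything in it is an assumption made for illustration: the class name `TwoHeadedNet`, the tiny dense layers, the random untrained weights, and the 9x9 board are all hypothetical and are not taken from the paper, which describes a much larger network.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values); a stand-in for trained weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights, x):
    """Plain matrix-vector product, no bias term."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TwoHeadedNet:
    """Hypothetical toy model: one shared trunk feeding a policy head and a value head."""

    def __init__(self, board_cells, hidden=16, seed=0):
        rng = random.Random(seed)
        self.trunk = make_layer(board_cells, hidden, rng)
        # One output per board point plus one for "pass" (an assumption of this sketch).
        self.policy_head = make_layer(hidden, board_cells + 1, rng)
        self.value_head = make_layer(hidden, 1, rng)

    def forward(self, board):
        # Shared features are computed once and reused by both heads.
        h = [max(0.0, v) for v in dense(self.trunk, board)]  # ReLU
        policy = softmax(dense(self.policy_head, h))      # move probabilities
        value = math.tanh(dense(self.value_head, h)[0])   # win estimate in [-1, 1]
        return policy, value


# Toy 9x9 position: +1.0 is our stone, -1.0 the opponent's, 0.0 empty.
board = [0.0] * 81
board[40] = 1.0
board[41] = -1.0

net = TwoHeadedNet(board_cells=81)
policy, value = net.forward(board)
```

The point of the design is only that both outputs come from one forward pass over shared features, instead of evaluating two separate networks per position.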

Go is an abstract strategy board game for two players, in which the aim is to surround more territory than the opponent.

Previously: Google's New TPUs are Now Much Faster -- will be Made Available to Researchers
Google's AlphaGo Wins Again and Retires From Competition


This discussion has been archived. No new comments can be posted.


Re:New and improved (Score: 2, Touché) by Anonymous Coward on Thursday October 19 2017, @03:47PM (#584610)

OK so it breaks their Tabula Rasa condition but it would still have been somewhat more impressive, and useful, if it had discovered things we didn't already know.
[...]
Novel strategies ... Sounds like they won't be very useful. Have they previously been discovered by man and dismissed due to their level of novelty?

First you say novel strategies would be useful, then that they "won't be very useful". Which is it?


Re:New and improved (Score: 2) by looorg (578) on Thursday October 19 2017, @04:11PM (#584631)

It clearly depends on the strategy. It would have been awesome if they had actually mentioned what it was. For all we know their new strategy is completely worthless for human players. Then what is it good for? When AlphaGo-1 tries to play AlphaGo-2 and they try to trick each other?

- Re:New and improved (Score: 2) by HiThere (866) on Thursday October 19 2017, @07:13PM (#584770)
  
  Why should you expect it to be useful for human players? Perhaps it's a strategy that's only useful when you're playing against something better than any human player. Not that I expect this to be true, but your basic criterion seems to need justification.
  
  --
  Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
  
  - Re:New and improved (Score: 2) by takyon (881) <takyonNO@SPAMsoylentnews.org> on Thursday October 19 2017, @08:13PM (#584834)
    
    You should note that aside from Human vs. AlphaGo, there have also been Human + AlphaGo vs. Human + AlphaGo [futurism.com] matches. Go players have been able to use computer software for assistance [wikipedia.org] for years; it's just going to get a lot more useful (to the point of being "cheaty").
    Not responding to any particular post, just throwing it out there.
    
    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    
