
SoylentNews is people

posted by martyb on Thursday October 19 2017, @02:39PM
from the Zeroing-in-on-AI dept.

Google DeepMind researchers have made their old AlphaGo program obsolete:

The old AlphaGo relied on a computationally intensive Monte Carlo tree search to play through Go scenarios. The nodes and branches created a much larger tree than AlphaGo practically needed to play. A combination of reinforcement learning and human-supervised learning was used to build "value" and "policy" neural networks that used the search tree to execute gameplay strategies. The software learned from 30 million moves played in human-on-human games, and benefited from various bodges and tricks to learn to win. For instance, it was trained from master-level human players, rather than picking it up from scratch.

AlphaGo Zero did start from scratch with no experts guiding it. And it is much more efficient: it only uses a single computer and four of Google's custom TPU1 chips to play matches, compared to AlphaGo's several machines and 48 TPUs. Since Zero didn't rely on human gameplay, and a smaller number of matches, its Monte Carlo tree search is smaller. The self-play algorithm also combined both the value and policy neural networks into one, and was trained on 64 GPUs and 19 CPUs over a few days by playing nearly five million games against itself. In comparison, AlphaGo needed months of training and used 1,920 CPUs and 280 GPUs to beat Lee Sedol.

Through self-play, AlphaGo Zero even discovered for itself, without human intervention, classic moves in the theory of Go, such as fuseki opening tactics, and what's called life and death. More details can be found in Nature, or from the paper directly here. Stanford computer science academic Bharath Ramsundar has a summary of the more technical points, here.
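
To make the "value and policy combined into one network" idea in the excerpt above more concrete, here is a rough Python sketch. It is not DeepMind's code. The names toy_network and puct_select and the constant c_puct=1.5 are hypothetical, and the toy network just returns uniform priors and a neutral value. The selection formula is the PUCT-style rule commonly associated with this kind of tree search, assumed here for illustration since the summary does not spell it out.

```python
import math


def toy_network(state, moves):
    """Hypothetical stand-in for a combined policy/value network.

    A single call returns both outputs: a prior probability for each
    legal move (the "policy") and a scalar estimate of the position
    (the "value"). A real network would learn these from self-play;
    this toy returns uniform priors and a neutral value.
    """
    priors = {move: 1.0 / len(moves) for move in moves}
    value = 0.0
    return priors, value


def puct_select(priors, visits, total_value, c_puct=1.5):
    """Pick the move maximising Q + U (assumed PUCT-style rule).

    Q is the mean value seen so far for a move; U is an exploration
    bonus that grows with the network's prior and shrinks as the move
    is visited more often. With zero total visits every U is 0, so the
    first move in iteration order is returned.
    """
    total_visits = sum(visits.values())
    best_move, best_score = None, -math.inf
    for move, prior in priors.items():
        n = visits.get(move, 0)
        q = total_value.get(move, 0.0) / n if n else 0.0
        u = c_puct * prior * math.sqrt(total_visits) / (1 + n)
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move


if __name__ == "__main__":
    priors, value = toy_network("empty board", ["a", "b"])
    # "a" has been tried ten times, "b" never: the bonus favours "b".
    print(puct_select(priors, {"a": 10, "b": 0}, {"a": 1.0}))
```

The point of the sketch is only that one network call feeds both halves of the search: the priors steer which branches get explored, and the value replaces the need to play each branch out to the end.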

Go is an abstract strategy board game for two players, in which the aim is to surround more territory than the opponent.

Previously: Google's New TPUs are Now Much Faster -- will be Made Available to Researchers
Google's AlphaGo Wins Again and Retires From Competition


This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 0) by Anonymous Coward on Thursday October 19 2017, @03:56PM (3 children)

    by Anonymous Coward on Thursday October 19 2017, @03:56PM (#584619)

    They can be partially simulated, but there's no definitive, easily applicable determination of "success" that can be applied by the software, because the environment isn't static (a Go board is).

    So maybe the next goal would be an AI that can win at Nomic? After all, the whole point of Nomic is changing the rules. With a majority of humans in the game, the machine will not even be able to predict all the rules that will be in effect at the next turn.

  • (Score: 2) by fyngyrz on Thursday October 19 2017, @09:15PM (2 children)

    by fyngyrz (6567) on Thursday October 19 2017, @09:15PM (#584890) Journal

    I suspect it'll go the other way around; AI will come from (somewhere), and then you'll have a system that will have a chance to win on/at Nomic.

    However, there will also be a question, at that point, of whether the AI cares to play Nomic in the first place. Once you have a system that can locally analyze the value of doing something, it'll use that to evaluate whether it should engage in the associated undertaking. Because... intelligent.

    Unless we implement manufactured intelligences as outright slaves. I hope we don't do that. I don't think it will go well for us if we do. If we want that kind of service, stacked LDNLS systems are the way to go, specifically because they are in no wise intelligent entities, they're just (very) elaborate mechanisms. They'll keep getting better, and perhaps the AIs will even help us with them, if and when AI arises.

    Slavery is bad, mmmm'kay?

    • (Score: 2, Disagree) by maxwell demon on Thursday October 19 2017, @10:18PM (1 child)

      by maxwell demon (1608) on Thursday October 19 2017, @10:18PM (#584942) Journal

      However, there will also be a question, at that point, of whether the AI cares to play Nomic in the first place. Once you have a system that can locally analyze the value of doing something, it'll use that to evaluate whether it should engage in the associated undertaking. Because... intelligent.

      You seem to be under the delusion that there is a set of values that you can derive from rational thought alone.

      It doesn't work that way. No matter how much you think, you'll always at some point arrive at some other value that you simply have to assume. You may end up at values that come straight out of evolution (an intelligent being that doesn't value its own life likely won't survive long), or at values that your parents (or any other people you accepted as moral authorities) taught you at a young age and which you never dared to question (or which just to question you already consider a morally bad thing to do, probably again because someone taught you so).

      --
      The Tao of math: The numbers you can count are not the real numbers.
      • (Score: 2) by rylyeh on Thursday October 19 2017, @10:22PM

        by rylyeh (6726) <{kadath} {at} {gmail.com}> on Thursday October 19 2017, @10:22PM (#584946)

        A point of view is a slippery thing.

        --
        "a vast crenulate shell wherein rode the grey and awful form of primal Nodens, Lord of the Great Abyss."