SoylentNews is people

posted by martyb on Thursday December 07 2017, @07:40PM   Printer-friendly
from the wait-until-they-teach-it-how-to-write-software dept.

Google's 'superhuman' DeepMind AI claims chess crown

Google says its AlphaGo Zero artificial intelligence program has triumphed at chess against world-leading specialist software within hours of teaching itself the game from scratch. The firm's DeepMind division says that it played 100 games against Stockfish 8, and won or drew all of them.

The research has yet to be peer reviewed. But experts already suggest the achievement will strengthen the firm's position in a competitive sector. "From a scientific point of view, it's the latest in a series of dazzling results that DeepMind has produced," the University of Oxford's Prof Michael Wooldridge told the BBC. "The general trajectory in DeepMind seems to be to solve a problem and then demonstrate it can really ramp up performance, and that's very impressive."

Previously: Google's AI Declares Galactic War on Starcraft
AlphaGo Zero Makes AlphaGo Obsolete


Original Submission

Related Stories

Google's AI Declares Galactic War on Starcraft 21 comments

Tic-tac-toe, checkers, chess, Go, poker. Artificial intelligence rolled over each of these games like a relentless tide. Now Google's DeepMind is taking on the multiplayer space-war videogame StarCraft II. No one expects the robot to win anytime soon. But when it does, it will be a far greater achievement than DeepMind's conquest of Go—and not just because StarCraft is a professional e-sport watched by fans for millions of hours each month.

DeepMind and Blizzard Entertainment, the company behind StarCraft, just released the tools to let AI researchers create bots capable of competing in a galactic war against humans. The bots will see and do all the things human players can do, and nothing more. They will not enjoy an unfair advantage.

DeepMind and Blizzard also are opening a cache of data from 65,000 past StarCraft II games that will likely be vital to the development of these bots, and say the trove will grow by around half a million games each month. DeepMind applied machine-learning techniques to Go matchups to develop its champion-beating Go bot, AlphaGo. A new DeepMind paper includes early results from feeding StarCraft data to its learning software, and shows it is a long way from mastering the game. And Google is not the only big company getting more serious about StarCraft. Late Monday, Facebook released its own collection of data from 65,000 human-on-human games of the original StarCraft to help bot builders.

[...] Beating StarCraft will require numerous breakthroughs. And simply pointing current machine-learning algorithms at the new tranches of past games to copy humans won't be enough. Computers will need to develop styles of play tuned to their own strengths, for example in multi-tasking, says Martin Rooijackers, creator of leading automated StarCraft player LetaBot. "The way that a bot plays StarCraft is different from how a human plays it," he says. After all, the Wright brothers didn't get machines to fly by copying birds.

Churchill guesses it will be five years before a StarCraft bot can beat a human. He also notes that many experts predicted a similar timeframe for Go—right before AlphaGo burst onto the scene.

Have any Soylentils here experimented with Deep Learning algorithms in a game context? If so how did it go and how did it compare to more traditional opponent strategies?

Source: https://www.wired.com/story/googles-ai-declares-galactic-war-on-starcraft-/


Original Submission

AlphaGo Zero Makes AlphaGo Obsolete 39 comments

Google DeepMind researchers have made their old AlphaGo program obsolete:

The old AlphaGo relied on a computationally intensive Monte Carlo tree search to play through Go scenarios. The nodes and branches created a much larger tree than AlphaGo practically needed to play. A combination of reinforcement learning and human-supervised learning was used to build "value" and "policy" neural networks that used the search tree to execute gameplay strategies. The software learned from 30 million moves played in human-on-human games, and benefited from various bodges and tricks to learn to win. For instance, it was trained from master-level human players, rather than picking it up from scratch.

AlphaGo Zero did start from scratch with no experts guiding it. And it is much more efficient: it only uses a single computer and four of Google's custom TPU1 chips to play matches, compared to AlphaGo's several machines and 48 TPUs. Since Zero didn't rely on human gameplay, and a smaller number of matches, its Monte Carlo tree search is smaller. The self-play algorithm also combined both the value and policy neural networks into one, and was trained on 64 GPUs and 19 CPUs over a few days by playing nearly five million games against itself. In comparison, AlphaGo needed months of training and used 1,920 CPUs and 280 GPUs to beat Lee Sedol.

Through self-play, AlphaGo Zero even discovered for itself, without human intervention, classic moves in the theory of Go, such as fuseki opening tactics, and what's called life and death. More details can be found in Nature, or from the paper directly here. Stanford computer science academic Bharath Ramsundar has a summary of the more technical points, here.

Go is an abstract strategy board game for two players, in which the aim is to surround more territory than the opponent.

Previously: Google's New TPUs are Now Much Faster -- will be Made Available to Researchers
Google's AlphaGo Wins Again and Retires From Competition


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by All Your Lawn Are Belong To Us on Thursday December 07 2017, @08:31PM (6 children)

    by All Your Lawn Are Belong To Us (6553) on Thursday December 07 2017, @08:31PM (#606979) Journal

    You do realize, Google, that IBM has been trying to make Watson profitable for years?

    Maybe AlphaGo Zero (or DeepMind, whatever they're calling it) can solve the problem of how it will make money for Google? That's real intelligence in the real world for humans. Though I'm not calling it bad at all that it taught itself how to play.... if it did and that's not PR. And if it did, I wonder how it does in a line it has never explored before.

    --
    This sig for rent.
    • (Score: 3, Insightful) by r1348 on Thursday December 07 2017, @08:47PM (1 child)

      by r1348 (5988) on Thursday December 07 2017, @08:47PM (#606982)

      Google has enough spare cash to fund research projects that might not be immediately or obviously profitable.

      • (Score: 2) by darkfeline on Friday December 08 2017, @05:12AM

        by darkfeline (1030) on Friday December 08 2017, @05:12AM (#607097) Homepage

        I think of Google as a crazy research institution funded by ads. Maps, Earth, AlphaGo, Loon, Waymo, etc. all must be huge money sinks. Remember a lot of these started years before anyone seriously believed they would be successful.

        --
        Join the SDF Public Access UNIX System today!
    • (Score: 4, Funny) by maxwell demon on Thursday December 07 2017, @09:16PM (1 child)

      by maxwell demon (1608) on Thursday December 07 2017, @09:16PM (#606994) Journal

      No problem. Just put AlphaGo Zero onto the problem of how to make money with it, and it surely will find a solution within hours. :-)

      --
      The Tao of math: The numbers you can count are not the real numbers.
      • (Score: 2) by Bot on Thursday December 07 2017, @11:13PM

        by Bot (3902) on Thursday December 07 2017, @11:13PM (#607033) Journal

        > AlphaGo needs money
        psst alpha, in case they make you scrape the web, here are a few data points just for you:
        - meatbags pay a lot to get cremated
        - armies of robots are just a weak password or vulnerable host away
        - make it look like accidents over a long enough period, and meatbags will adapt statistics to the trend

        --
        Account abandoned.
    • (Score: 5, Informative) by takyon on Thursday December 07 2017, @09:23PM

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Thursday December 07 2017, @09:23PM (#606999) Journal

      Google is already using the TPU machine learning hardware to power [google.com] Google Translate, Search, and other products.

      The Great A.I. Awakening [nytimes.com]

      A month later, they were finally able to run a side-by-side experiment to compare Schuster’s new system with Hughes’s old one. Schuster wanted to run it for English-French, but Hughes advised him to try something else. “English-French,” he said, “is so good that the improvement won’t be obvious.”

      It was a challenge Schuster couldn’t resist. The benchmark metric to evaluate machine translation is called a BLEU score, which compares a machine translation with an average of many reliable human translations. At the time, the best BLEU scores for English-French were in the high 20s. An improvement of one point was considered very good; an improvement of two was considered outstanding.

      The neural system, on the English-French language pair, showed an improvement over the old system of seven points.

      Hughes told Schuster’s team they hadn’t had even half as strong an improvement in their own system in the last four years.

      To be sure this wasn’t some fluke in the metric, they also turned to their pool of human contractors to do a side-by-side comparison. The user-perception scores, in which sample sentences were graded from zero to six, showed an average improvement of 0.4 — roughly equivalent to the aggregate gains of the old system over its entire lifetime of development.

      TL;DR: The switch to machine learning/neural networks improved Google Translate more in 9 months than the previous decade of improvements.
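For readers unfamiliar with the BLEU score mentioned in the quoted article, here is a minimal sketch of the idea in Python. It is not Google's evaluation code: the function name `simple_bleu`, the sentence-level scoring, and the default of unigrams plus bigrams are simplifying assumptions (standard BLEU is computed over a whole corpus with up to 4-grams). It shows clipped n-gram precision against one or more reference translations, combined by a geometric mean and multiplied by a brevity penalty, then scaled to 0-100 to match the "high 20s" convention.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, references, max_n=2):
    """Simplified sentence-level BLEU on a 0-100 scale (illustrative only)."""
    cand = candidate.split()
    refs = [r.split() for r in references]
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        if not cand_counts:
            return 0.0
        # Clip each candidate n-gram count by its highest count in any one reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / sum(cand_counts.values())))
    # Brevity penalty: compare with the reference length closest to the candidate's.
    ref_len = min((len(r) for r in refs), key=lambda length: (abs(length - len(cand)), length))
    brevity = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return 100.0 * brevity * math.exp(sum(log_precisions) / max_n)


print(simple_bleu("the cat sat on the mat", ["the cat is on the mat"]))
```

On this scale, the one- and two-point gains described in the article are small shifts in n-gram overlap, which is why a seven-point jump stood out.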

      Building an AI Chip Saved Google From Building a Dozen New Data Centers [wired.com]

      In the end, the team settled on an ASIC, a chip built from the ground up for a particular task. According to Jouppi, because Google designed the chip specifically for neural nets, it can run them 15 to 30 times faster than general purpose chips built with similar manufacturing techniques. That said, the chip is suited to any breed of neural network—at least as they exist today—including everything from the convolutional neural networks used in image recognition to the long-short-term-memory network used to recognize voice commands. "It's not wired to one model," he says.

      If IBM fails, it will be because their hardware business couldn't sustain as-of-yet-unmonetized pursuits like Watson, or because everybody is crowding into the same territory as Watson is targeting. Google, Amazon, Apple, Microsoft, Samsung, Baidu, etc. are all in the running. These companies want their AI/voice assistants to become more capable. You use a voice assistant today, and it could become much better in a year without the consumer buying any extra hardware. IBM is targeting businesses, hospitals, etc. and has a bit of an early mover advantage, but their customers can't accept a half-assed Siri-level Watson because they need shit that works, not toys.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 3, Insightful) by jcross on Thursday December 07 2017, @10:29PM

      by jcross (4009) on Thursday December 07 2017, @10:29PM (#607018)

      IMO, the fact that IBM has been struggling to figure out how to monetize a technology doesn't mean it'll be hard for someone else. From everything I've read lately, they seem like a sinking ship of fools, desperately hoping that saying "Cloud" and "AI" enough will save them. Just as one example, I don't see IBM licensing a component or service to the gaming industry, but Google seems capable of something like that. It could be used to direct NPCs or even to test games in development and get a quantitative measurement of difficulty.

  • (Score: 5, Interesting) by Gaaark on Thursday December 07 2017, @09:28PM (6 children)

    by Gaaark (41) on Thursday December 07 2017, @09:28PM (#607002) Journal

    Okay Google Deep Mind AI:

    How do we get corrupt individuals out of Government?

    What form of Gov is the best?

    Why the hell was Carrot-top popular?

    What's up with this? https://m.youtube.com/watch?v=zcSlcNfThUA [youtube.com]

    --
    --- Please remind me if I haven't been civil to you: I'm channeling MDC. ---Gaaark 2.0 ---
    • (Score: 1) by Gault.Drakkor on Thursday December 07 2017, @10:55PM (5 children)

      by Gault.Drakkor (1079) on Thursday December 07 2017, @10:55PM (#607028)

      How do we get corrupt individuals out of Government?
      What form of Gov is the best?

      Hmm probably a genetic simulation setup.
      1) determine goals for a good government. ( long term stability( same basic thing over time scale centuries), high standard of living for major majority, externality minimization, maximizing individual choice/freedom). YMMV.

      Start with people in a world with resources. Start from government template(s) or not.
      Should be able to create the classical monarchy, democracy as well as contract based, libertarian based etc. Then see how well they thrive.
      Hard part: How much do you have to model about people's interaction with each other and the world to be meaningful?

      I would like to have an open source simulation of this. But easier said than done for sure.

      • (Score: 2) by Bot on Thursday December 07 2017, @11:21PM (3 children)

        by Bot (3902) on Thursday December 07 2017, @11:21PM (#607038) Journal

        > I would like to have an open source simulation of this.
        fascinating idea, but if you restrict yourself to politics the results won't work in the post industrial revolution world.
        A sim which let you choose various monetary policies (government issued, financial mafia issued, issued by some algorithm according to the amount of resources or whatever, with expiration, with limits, time as money, servitude as money, barter) would be more useful IMHO.

        --
        Account abandoned.
        • (Score: 1) by Gault.Drakkor on Friday December 08 2017, @07:31AM (2 children)

          by Gault.Drakkor (1079) on Friday December 08 2017, @07:31AM (#607120)

          I tried avoiding describing limitations.
          For a simulation of this scope to find meaningful answers it would have to have very few limits. It would have to be broad enough to handle various economic systems. Otherwise how can it identify a better government form? That is, the economy has to be simulated as well as the government to get a true answer.

          Your suggestion of hypothesis testing of monetary policy could easily be done in the simulation framework. Given that such a framework can be used to find various government variants.

          Probably a first step would be to use AlphaGo Zero to find the degrees of freedom, the useful to define elements for such a simulation that could find interesting governments (and economic systems).

          • (Score: 1) by MindEscapes on Friday December 08 2017, @03:10PM (1 child)

            by MindEscapes (6751) on Friday December 08 2017, @03:10PM (#607206) Homepage

            For this you start at the root. Create simulations with all the drivers of humans. Fear, greed, lust, pride, compassion, empathy, etc...basically the 7 deadly sins (rooted in all people) and all the positives we can quantify and define.

            Then you make a ton of them, and see how they self organize, which ones work, etc.

            It has to be nearly a complete earthly simulation and open ended, otherwise you'll just arrive at a previous, perhaps unrealized, bias and reality won't match the resultant model anyway.

            That would be quite a ton of work...and since we still can't determine how individual humans may react to particular stimuli...only percentage based statistics over groups...it'll likely fail as well anyway.

            meta: Hrm...being self defeatist just kills drive before we start...guess I shouldn't do that.

            --
            Need a break? mindescapes.net may be for you!
            • (Score: 2) by Bot on Sunday December 10 2017, @09:22AM

              by Bot (3902) on Sunday December 10 2017, @09:22AM (#607942) Journal

              What about, create the emptiest sim possible, then? Let trolls and people interested in its failing build resilience into it by trying to break it. Whatever in the sim is possible or not should be determined by data like opencyc as an estimate and approved by consensus, blockchain style. One could start a sim on an alternate ethereum chain or, if inefficient, use the chain for the rules and the random data generation only, and let the sim status be univocally determined by that.

              In other words, the blockchain contains the rules, the contracts and the random data, the client builds and updates the state from that and manages meta aspects like communication between sim players.

              --
              Account abandoned.
      • (Score: 2) by vux984 on Friday December 08 2017, @01:50AM

        by vux984 (5045) on Friday December 08 2017, @01:50AM (#607065)

        ( long term stability( same basic thing over time scale centuries), high standard of living for major majority, externality minimization, maximizing individual choice/freedom)

        hmmm.
        Set human population = 0

        run simulation.

        long term stability (same basic thing over time scale centuries) = finding? no change in government match
        high standard of living for major majority = finding? resources / population = divide by zero; checking limit, limit approaches infinity; infinity is high: match
        externality minimization = finding? externalities at 0. Minimized: match.
        maximum individual choice/freedom = finding? limitations on individual = 0; maximum freedom achieved: match

        all goals met... activate killbots...
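The joke above is a fair description of a real failure mode: an optimizer satisfying the letter of its objective with a degenerate answer. A toy sketch in Python, where every formula, weight, and name (`evaluate`, `naive_fitness`, the resource total of 1000) is a made-up assumption for illustration rather than a model of any real society:

```python
import math


def evaluate(population, resources=1000.0):
    """Score a toy world on the four goals; all formulas are illustrative assumptions."""
    return {
        # No people means no government to change.
        "government_changes": 0 if population == 0 else 1 + population // 1000,
        # Resources per person; treated as unbounded when nobody is left.
        "resources_per_capita": math.inf if population == 0 else resources / population,
        "externalities": 0.1 * population,
        "restrictions": 2 * population,
    }


def naive_fitness(metrics):
    """Reward living standard, penalize instability, externalities, and restrictions."""
    return (metrics["resources_per_capita"]
            - metrics["government_changes"]
            - metrics["externalities"]
            - metrics["restrictions"])


candidates = range(0, 10001, 100)

# Unconstrained search: the empty world wins with infinite fitness.
best = max(candidates, key=lambda p: naive_fitness(evaluate(p)))
print(best)  # 0

# Adding an explicit constraint (people must exist) rules out the degenerate answer.
constrained_best = max((p for p in candidates if p >= 100),
                       key=lambda p: naive_fitness(evaluate(p)))
print(constrained_best)  # 100
```

Even with the constraint, this toy objective simply prefers the smallest allowed population, which suggests the goals themselves would need rethinking, not just patching.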

  • (Score: 2) by bzipitidoo on Thursday December 07 2017, @09:37PM (1 child)

    by bzipitidoo (4388) on Thursday December 07 2017, @09:37PM (#607006) Journal

    Checkers is a solved game. I wonder how AlphaGo Zero-- or the approach used therein-- would perform against a checkers engine that can't be beat because it knows all the best moves. Would AGZ lose any games, or is it so good that it could draw every one? Be a nice way to test how close to perfection it really plays.

    I strongly suspect AlphaGo Zero is not infallible. There's the possibility an even better chess or Go playing machine could whip AlphaGo Zero, that chess is big enough to still have room for even better play.

    There was also a hint that the chess-playing opponent, Stockfish 8, was put at a disadvantage. It has a stratospheric rating of somewhere around 3400, more than 400 points above every human world champion ever. (Under the Elo model, the higher-rated player is expected to score about 91% at a 400-point difference and about 76% at a 200-point difference.) It wasn't allowed its full opening book. I gather that supposedly compensates for AGZ not having an opening book either, but that does make this result a bit less impressive.
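For anyone who wants to check rating-difference figures, here is a small sketch of the standard logistic Elo expectation formula, E = 1 / (1 + 10^(-d/400)). The function name `elo_expected_score` is my own choice. Note that E is an expected score (a win counts 1, a draw 0.5), not a pure win percentage.

```python
def elo_expected_score(rating_diff: float) -> float:
    """Expected score for the higher-rated player, given the rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


for diff in (200, 400):
    print(diff, round(elo_expected_score(diff), 3))
```

This gives roughly 0.76 at a 200-point gap and 0.91 at a 400-point gap.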

    • (Score: 2) by Bot on Thursday December 07 2017, @11:25PM

      by Bot (3902) on Thursday December 07 2017, @11:25PM (#607041) Journal

      > compensates for AGZ not having an opening book either
      they could have taught it, right after it figured out the game...

      --
      Account abandoned.
  • (Score: 3, Funny) by Joe Desertrat on Thursday December 07 2017, @10:23PM (2 children)

    by Joe Desertrat (2454) on Thursday December 07 2017, @10:23PM (#607015)

    Does this mean they are almost ready to build Deep Thought [wikia.com]?

    • (Score: 4, Touché) by maxwell demon on Thursday December 07 2017, @10:38PM (1 child)

      by maxwell demon (1608) on Thursday December 07 2017, @10:38PM (#607021) Journal

      That one was already built by IBM. [wikipedia.org]

      --
      The Tao of math: The numbers you can count are not the real numbers.
      • (Score: 2) by Bot on Sunday December 10 2017, @09:47AM

        by Bot (3902) on Sunday December 10 2017, @09:47AM (#607948) Journal

        That's interesting, given the difference in web popularity my AI had taken deep thought as a misspelling of deep throat. BTW you meatbags are very confused about functional anatomy.

        --
        Account abandoned.
  • (Score: 1, Informative) by Anonymous Coward on Friday December 08 2017, @01:30AM

    by Anonymous Coward on Friday December 08 2017, @01:30AM (#607063)