
posted by martyb on Tuesday December 11 2018, @03:02AM   Printer-friendly
from the how-about-a-nice-game-of-global-thermonuclear-war? dept.

Move over AlphaGo: AlphaZero taught itself to play three different games

Google's DeepMind—the group that brought you the champion game-playing AIs AlphaGo and AlphaGo Zero—is back with a new, improved, and more-generalized version. Dubbed AlphaZero, this program taught itself to play three different board games (chess, Go, and shogi, a Japanese form of chess) in just three days, with no human intervention.

A paper describing the achievement was just published in Science. "Starting from totally random play, AlphaZero gradually learns what good play looks like and forms its own evaluations about the game," said Demis Hassabis, CEO and co-founder of DeepMind. "In that sense, it is free from the constraints of the way humans think about the game."

[...] As [chess grandmaster Garry] Kasparov points out in an accompanying editorial in Science, these days your average smartphone chess-playing app is far more powerful than Deep Blue. So AI researchers turned their attention in recent years to creating programs that can master the game of Go, a hugely popular board game in East Asia that dates back more than 2,500 years. It's a surprisingly complicated game, much more difficult than chess, despite only involving two players with a fairly simple set of ground rules. That makes it an ideal testing ground for AI.

AlphaZero is a direct descendant of DeepMind's AlphaGo, which made headlines worldwide in 2016 by defeating Lee Sedol, the reigning (human) world champion in Go. Not content to rest on its laurels, AlphaGo got a major upgrade last year, becoming capable of teaching itself winning strategies with no need for human intervention. By playing itself over and over again, AlphaGo Zero (AGZ) trained itself to play Go from scratch in just three days and soundly defeated the original AlphaGo 100 games to 0. The only input it received was the basic rules of the game.
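To make the self-play idea concrete, here is a deliberately tiny sketch: an agent that knows only the rules of tic-tac-toe improves a tabular value estimate by playing against itself. This is an illustration of the learning loop only—AGZ uses deep neural networks and Monte Carlo tree search rather than a lookup table, and the game, epsilon value, and learning rate here are all arbitrary choices for the sketch.

```python
import random

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, s in enumerate(board) if s == " "]

def self_play_game(values, eps=0.2):
    """Play one game against itself: greedy w.r.t. values, eps-random exploration."""
    board = [" "] * 9
    history = []  # (state, mover) pairs seen during the game
    player = "X"
    while True:
        moves = legal_moves(board)
        if not moves:
            return history, None  # board full, no winner: draw
        if random.random() < eps:
            move = random.choice(moves)
        else:
            # pick the move whose resulting state looks best for `player`
            def score(m):
                nxt = board[:]
                nxt[m] = player
                return values.get(("".join(nxt), player), 0.0)
            move = max(moves, key=score)
        board[move] = player
        history.append(("".join(board), player))
        w = winner(board)
        if w:
            return history, w
        player = "O" if player == "X" else "X"

def train(games=2000, lr=0.1):
    """Repeated self-play: nudge each visited state's value toward the outcome."""
    values = {}
    for _ in range(games):
        history, w = self_play_game(values)
        for state, player in history:
            target = 0.0 if w is None else (1.0 if player == w else -1.0)
            key = (state, player)
            old = values.get(key, 0.0)
            values[key] = old + lr * (target - old)
    return values
```

The point of the sketch is the feedback loop: the only supervision signal is the game result, generated by the agent's own play, which is the sense in which AGZ's "only input" is the rules.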

[...] AGZ was designed specifically to play Go. AlphaZero generalizes this reinforcement-learning approach to three different games: Go, chess, and shogi, a Japanese version of chess. According to an accompanying perspective penned by Deep Blue team member Murray Campbell, this latest version combines deep reinforcement learning (many layers of neural networks) with a general-purpose Monte Carlo tree search method.
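The piece that ties the neural network to the tree search is the move-selection rule used inside the search, commonly called PUCT. A minimal sketch follows; the "network" here is a stand-in set of prior probabilities and accumulated values per child node, and the exploration constant is an assumed value (real systems tune it).

```python
import math

C_PUCT = 1.5  # exploration constant (assumed value for this sketch)

def puct_select(children):
    """Pick the child index maximizing Q + U.

    children: list of dicts with keys
      P (network prior probability), N (visit count), W (total value).
    """
    total_visits = sum(c["N"] for c in children)
    def puct(c):
        q = c["W"] / c["N"] if c["N"] else 0.0  # exploitation: mean value so far
        u = C_PUCT * c["P"] * math.sqrt(total_visits) / (1 + c["N"])  # exploration
        return q + u
    return max(range(len(children)), key=lambda i: puct(children[i]))

# Example: a well-valued but already heavily visited move can lose out to a
# promising, less-explored one.
children = [
    {"P": 0.6, "N": 10, "W": 5.0},
    {"P": 0.3, "N": 2,  "W": 1.5},
    {"P": 0.1, "N": 0,  "W": 0.0},
]
best = puct_select(children)
```

The network's prior `P` steers the search toward moves it already likes, while the `1 + N` denominator decays that bonus as a move gets explored—this is how "deep reinforcement learning" and "Monte Carlo tree search" combine in one rule.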

"AlphaZero learned to play each of the three board games very quickly by applying a large amount of processing power, 5,000 tensor processing units (TPUs), equivalent to a very large supercomputer," Campbell wrote.

[...] DOI: Science, 2018. 10.1126/science.aar6404 (About DOIs).


Original Submission

 
This discussion has been archived. No new comments can be posted.
  • (Score: 0) by Anonymous Coward on Tuesday December 11 2018, @12:26PM (10 children)

    by Anonymous Coward on Tuesday December 11 2018, @12:26PM (#772831)

    One step closer to general AI.

  • (Score: 2) by sgleysti on Tuesday December 11 2018, @02:39PM (4 children)

    by sgleysti (56) Subscriber Badge on Tuesday December 11 2018, @02:39PM (#772863)

    This is an impressive achievement. That said, I'd love to see a program that can play Myst.

    • (Score: 2) by takyon on Tuesday December 11 2018, @02:41PM (1 child)

      by takyon (881) on Tuesday December 11 2018, @02:41PM (#772867) Journal

      I want to see the same system playing Go, Myst, Thief, and Starcraft 2 at the same time, while analyzing astronomy images to look for exoplanets.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 0) by Anonymous Coward on Tuesday December 11 2018, @06:43PM

        by Anonymous Coward on Tuesday December 11 2018, @06:43PM (#772992)

        https://www.youtube.com/watch?v=fzuYEStsQxc [youtube.com]

        If we are to call our learner intelligent, then we need one algorithm which is able to solve a large number of different problems. If we need to reprogram it for every game, that's just a narrow intelligence.

        This algorithm solves a bunch of Atari games, and Mario, and a 3D maze.

    • (Score: 2) by DannyB on Tuesday December 11 2018, @03:36PM (1 child)

      by DannyB (5839) Subscriber Badge on Tuesday December 11 2018, @03:36PM (#772886) Journal

      It is indeed impressive. And I don't mean to downplay it.

      But like the magician's trick, once you know how the trick works, it quickly becomes normalized. Look at the rapid normalization of other technology miracles:
      * [things on recent SN poll]
      * automobiles
      * x-rays
      * radio, tv (which must have seemed magical)
      * moon landings
      * cheap computers
      * ubiquitous computers
      * cell phones
      * the intarweb tubes
      * smart phones
      * practical home automation
      * excellent speech synthesis
      * useful speech recognition
      * computer vision
      * deep fakes
      * self driving cars
      * Reddit

      AlphaGo and AlphaZero will quickly become the norm. The magician's trick that we understand. I would also point out that hardware to directly support AI applications is now the norm. I would bet sooner than we think, AI "coprocessors" will become a thing in mainstream PCs.

      --
      People today are educated enough to repeat what they are taught but not to question what they are taught.
      • (Score: 3, Informative) by takyon on Tuesday December 11 2018, @05:27PM

        by takyon (881) on Tuesday December 11 2018, @05:27PM (#772948) Journal

        I would bet sooner than we think, AI "coprocessors" will become a thing in mainstream PCs.

        It could be a while. There's a lot going on, the AI coprocessors are still in their infancy (deep learning tensor/machine learning accelerators are available, but not something like IBM TrueNorth), and there isn't a lot of focus on PCs (laptops and desktops).

        Laptop APUs should include such hardware, but they don't, or at least they didn't until we started seeing Snapdragon ARM processors coming to laptops. The hyped Snapdragon 8cx [soylentnews.org] for laptops supposedly includes the same "integrated AI engine" [laptopmag.com] found in previous Snapdragon chips. More details here [qualcomm.com].

        For desktop users, many Nvidia GPUs are now rated for a certain amount of low-precision tensor performance. While you could argue that this isn't a discrete "AI coprocessor", it should compare favorably with Google's TPUs. The TPUs might have higher tensor performance and lower power consumption, but the difference between the two lines shouldn't be an order of magnitude or something.

        Arguably, smartphones benefit the most from AI hardware, since they are good at interacting with the world (especially using the camera [engadget.com]). But you could still see a laptop or desktop user doing something like running Mycroft locally, or training deep fakes, etc.

        In the long run, we want coprocessors that don't just accelerate 8-bit operations, but use an entirely different neuromorphic, brain-inspired architecture, such as IBM TrueNorth [wikipedia.org]. If a new technology [soylentnews.org] can massively extend Moore's law by allowing 3D integrated circuits, then we could see dramatic improvements in neuromorphic chips. Go figure, a brain-inspired chip that simulates how neurons and synapses work would be improved by being 3D (layered), kind of like how our brains aren't flat discs.

        We might see some movement on that soon. DARPA has been working on monolithic 3D ICs. They believe that a 90 nanometer process 3D chip could greatly outperform (35-75x) a "7nm" process chip [darpa.mil] at machine learning tasks, and that a "7nm" 3D chip would leave it in the dust (323-645x). And this is for a rudimentary 3D design. They could give an update on this in 2019, so keep your eyes open.
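Quick sanity check on those DARPA numbers: dividing the "7nm" 3D projection by the 90 nm 3D projection shows how much of the win comes from the process shrink versus the 3D architecture itself.

```python
# Speedup ranges quoted above, each relative to a planar "7nm" chip.
speedup_90nm_3d = (35, 75)    # 90 nm process, 3D design
speedup_7nm_3d  = (323, 645)  # "7nm" process, 3D design

# Implied gain from shrinking the 3D design from 90 nm down to "7nm":
node_gain = tuple(b / a for a, b in zip(speedup_90nm_3d, speedup_7nm_3d))
# node_gain is about 9.2x (low end) and 8.6x (high end), so going 3D buys
# 35-75x on its own, and the process shrink multiplies that by roughly 9x.
```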

        If all of the "AI coprocessors" in use, from consumer smartphones and discrete GPUs all the way up to Google's specialized chips in data centers, were to experience a 100x performance increase, then you would see some real magic.

        --
        [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
  • (Score: 4, Insightful) by DannyB on Tuesday December 11 2018, @03:27PM (2 children)

    by DannyB (5839) Subscriber Badge on Tuesday December 11 2018, @03:27PM (#772885) Journal

    When it can teach itself to get modded Funny, then I'll be impressed.

    --
    People today are educated enough to repeat what they are taught but not to question what they are taught.
  • (Score: 0) by Anonymous Coward on Tuesday December 11 2018, @11:46PM (1 child)

    by Anonymous Coward on Tuesday December 11 2018, @11:46PM (#773186)

    Shouldn't it be beating everyone in the stock market by now?