from the battle-of-wit-and-skill-and-strategy dept.
DeepNash learns to play Stratego from scratch by combining game theory and model-free deep RL:
Game-playing artificial intelligence (AI) systems have advanced to a new frontier. Stratego, the classic board game that's more complex than chess and Go, and craftier than poker, has now been mastered. Published in Science, we present DeepNash, an AI agent that learned the game from scratch to a human expert level by playing against itself.
DeepNash uses a novel approach, based on game theory and model-free deep reinforcement learning. Its play style converges to a Nash equilibrium, which means its play is very hard for an opponent to exploit. So hard, in fact, that DeepNash has reached an all-time top-three ranking among human experts on the world's biggest online Stratego platform, Gravon.
Board games have historically been a measure of progress in the field of AI, allowing us to study how humans and machines develop and execute strategies in a controlled environment. Unlike chess and Go, Stratego is a game of imperfect information: players cannot directly observe the identities of their opponent's pieces.
[...] The value of mastering Stratego goes beyond gaming. In pursuit of our mission of solving intelligence to advance science and benefit humanity, we need to build advanced AI systems that can operate in complex, real-world situations with limited information of other agents and people. Our paper shows how DeepNash can be applied in situations of uncertainty and successfully balance outcomes to help solve complex problems.
[...] While we developed DeepNash for the highly defined world of Stratego, our novel R-NaD method can be directly applied to other two-player zero-sum games of both perfect and imperfect information. R-NaD has the potential to generalise far beyond two-player gaming settings to address large-scale real-world problems, which are often characterised by imperfect information and astronomical state spaces.
[...] In creating a generalisable AI system that's robust in the face of uncertainty, we hope to bring the problem-solving capabilities of AI further into our inherently unpredictable world.
Journal Reference:
Julien Perolat, Bart De Vylder, Daniel Hennes, et al., Mastering the game of Stratego with model-free multiagent reinforcement learning, Science, 378, 2022. https://doi.org/10.1126/science.add4679
(Score: 5, Informative) by richtopia on Wednesday January 04, @03:45PM
In case you read the article and start thinking you mis-remember your childhood, the numbering of the pieces is reversed between the US and EU. And I guess there have been a whole bunch of US reprintings that also reverse the sequence.
https://en.wikipedia.org/wiki/Stratego#Pieces [wikipedia.org]
(Score: 5, Interesting) by bzipitidoo on Wednesday January 04, @04:54PM (2 children)
Wow, I haven't played Stratego since I was a kid. It was fun, but never thought of it as a game worthy of study on the same level as chess. I particularly enjoyed the setup, somewhat analogous to deck building in all these tradable card games that started coming out in the 1990s.
We tried a lot of strategies. Typical was surrounding your flag with bombs. Usually put the flag on the edge or the corner, so that it took only 3 or just 2 bombs. Often, the corners were too obvious, or more like too few, and we'd go with a 3-bomb setup, to make the flag's location harder to figure out. This included the spaces adjacent to the lakes, for the surprise value. Was daring to put your flag in the front rank like that. Might put the other 3 bombs around a 7 piece, to kill the 8 when your opponent found and defused one, and then tried for what was thought to be your flag. A few times I tried 4 bombs around the flag somewhere in the middle, but that didn't work too well. Sometimes, of course, you'd make the patterns, and not have the flag inside any of them. Another experiment was using 4 bombs to surround two edge spots. Or, you might use 3 bombs to block off a corner and the 2 adjacent squares. You could even shield a flag in the corner with two layers of bombs, with a layer of 8 killers in between. Took all 6 bombs, of course, and was such a large block of immovable pieces that your opponent would soon guess what your game was.
Another idea was to block one or more of the 3 passages with bombs--usually on the side where you'd put your flag, to make it a longer walk for your opponent. But then you needed something to stop the 8s from immediately defusing those bombs on the front line. If you put a 7 out there to defend the bombs, your opponent would find that out and bring in a 6, then an 8. So you might put a 5 in as the bomb defender. Ultimately, I found that shutting one of your pieces out, to guard the bomb barrier, never worked that well. Too inflexible. Variety was the key. Wanted a couple of 9s already aimed at the passages, to exploit a chance to zip deep into enemy territory to test if a suspicious piece was the flag, an 8 for bomb defusing, and a mix of higher ranking pieces to deal with whatever your opponent might have set up. The spy seemed best somewhere in the middle, so that when you found out where the opponent's 1 piece was, it was a shorter walk to get the spy nearby.
For high ranking pieces, I liked the 2 the best, the most powerful piece that was not vulnerable to the spy. If I managed to get the opponent's spy, then that made the 1 the obviously superior piece. Getting the opponent's 1 with your spy, and eliminating your opponent's spy without losing your 1, was tantamount to winning. But you couldn't focus too much on engineering that, had to defend your flag, or you could still lose, of course.
Another idea was trying to kill all your opponent's 8s, so that your bombs could not be defused. If you pulled that off, and you'd set up so that your flag was blocked off by bombs, you were guaranteed at least a draw.
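For anyone who only knows the other numbering, the strike rules these strategies lean on can be sketched roughly as below. This is a minimal illustration under the classic numbering used in this comment (1 is the strongest piece, 8 is the bomb-defusing miner, 9 is the scout), not anything taken from the DeepNash paper. The names `Outcome` and `resolve_attack` are made up for the example.

```python
from enum import Enum


class Outcome(Enum):
    ATTACKER_WINS = "attacker wins"
    DEFENDER_WINS = "defender wins"
    BOTH_REMOVED = "both removed"
    FLAG_CAPTURED = "flag captured"


# Hypothetical piece encoding: numbered ranks are ints (1 strongest, 9 weakest),
# the special pieces are single-letter strings.
SPY, BOMB, FLAG = "S", "B", "F"
MARSHAL, MINER = 1, 8


def resolve_attack(attacker, defender):
    """Resolve one strike under the classic numbering (assumed, see lead-in)."""
    if attacker in (BOMB, FLAG):
        raise ValueError("bombs and flags cannot move, so they cannot attack")
    if defender == FLAG:
        return Outcome.FLAG_CAPTURED
    if defender == BOMB:
        # Only the 8 defuses a bomb; every other attacker is lost.
        return Outcome.ATTACKER_WINS if attacker == MINER else Outcome.DEFENDER_WINS
    if attacker == SPY and defender == MARSHAL:
        # The spy beats the 1 only when the spy makes the attack.
        return Outcome.ATTACKER_WINS
    # Otherwise the spy is the weakest piece; model it as rank 10.
    a = 10 if attacker == SPY else attacker
    d = 10 if defender == SPY else defender
    if a == d:
        return Outcome.BOTH_REMOVED
    return Outcome.ATTACKER_WINS if a < d else Outcome.DEFENDER_WINS
```

Under these assumptions `resolve_attack(SPY, 2)` is a loss for the spy, which is why the 2 is the strongest piece the spy cannot touch, and `resolve_attack(7, BOMB)` loses the 7, which is why killing every enemy 8 makes a bombed-in flag unreachable.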
The game had a memory element to it. It was crucial to remember the ranks of the pieces you'd paid to expose, by losing a lower ranking piece to learn that info. Of course AI would have no problem whatsoever with that aspect, but kids, we'd sometimes forget, lose track, and have to spend another piece to check again.
(Score: 1, Interesting) by Anonymous Coward on Wednesday January 04, @05:30PM (1 child)
If you haven't clicked through and read the article, you might want to. You might find it interesting because it goes into the different strategies the AI used.
(Score: 1) by crotherm on Thursday January 05, @05:01PM
Stratego was played often in my house growing up. I have five older brothers who all played along with friends. Those strategies mentioned were well known. Basically, the whole game is bluffing. Playing it straight could confuse a salty player. :) BTW, you should have seen the Risk games in my house. Crazy.
(Score: 3, Interesting) by Fnord666 on Thursday January 05, @02:00AM (2 children)
(Score: 3, Interesting) by bzipitidoo on Thursday January 05, @07:21PM (1 child)
Oh, well, Battleship I think has a fairly intuitive optimal strategy. Shell in a pattern of diagonal lines. Of course, just which lines is the question. If you space the lines so that the smallest ship cannot hide, you'll eventually get it, but that may take too long. Space the lines 2 apart, and you have a 2/3rds chance of hitting the littlest ship, sooner. Space them 3 apart, and that chance drops to 50%, but it's even faster, and if you miss, you can then test the area in between optimally. Spacing them 3 apart has another disadvantage: it can also miss the next-smallest ships. I have never sat down and worked out which spacing is best, but it doesn't seem that hard to do. Beyond that, which parts of the lines you try first, I see as being pure intuition and guesswork. Do you think your opponent is going to try to hide on the edges, or in the middle?
Anyway, I think the day is near if not here already that AI can play all these games well. Pretty much all fun games are simple compared to reality. Bots already abound. The chief problem with having a bot play MMORPGs is not the games themselves, it's all the countermeasures designed specifically to detect and ban the bots. Another article mentioned using AI to write programs. If AI can do that and do it well, I cannot imagine any current mechanistic game presenting a challenge. Though MMORPGs clearly aren't beyond AI, perhaps table top RPGs still are. MMORPGs are at heart the mechanics of RPGs, and to this day fail to capture the essence of role playing. Sports, now, could still be a challenge. Possibly robot players can already be pretty good at any sport. Some kinds, the simple track and field stuff, were already dominated years ago by mere brute mechanical devices possessing no intelligence whatsoever. Everyone realizes no animal can come anywhere close to winning, for instance, a marathon against a typical automobile, and the contest isn't even tried as it would be very boring, the outcome preordained. Sports, I guess, needs to evolve away from simplistic dumb races, and we've been seeing that in extreme sports.
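The 2/3 and 50% figures can be checked by brute force. The sketch below is only a rough check under stated assumptions: a 10x10 board, a 2-cell smallest ship placed uniformly at random, and "lines N apart" read as N unshelled diagonals between shelled ones, so the pattern repeats every N + 1 cells. The function name `hit_fraction` and its parameters are invented for the example.

```python
from fractions import Fraction


def hit_fraction(period, size=10, ship_len=2):
    """Fraction of ship placements touched by the full diagonal pattern.

    A cell is shelled when (row + col) % period == 0, so period 2 is a
    checkerboard, period 3 leaves 2 empty diagonals between shelled ones,
    and period 4 leaves 3.
    """
    def is_shot(r, c):
        return (r + c) % period == 0

    total = 0
    hit = 0
    for r in range(size):
        for c in range(size):
            # Horizontal placement starting at (r, c).
            if c + ship_len <= size:
                total += 1
                hit += any(is_shot(r, c + k) for k in range(ship_len))
            # Vertical placement starting at (r, c).
            if r + ship_len <= size:
                total += 1
                hit += any(is_shot(r + k, c) for k in range(ship_len))
    return Fraction(hit, total)


for period in (2, 3, 4):
    print(period, hit_fraction(period))
```

On a 10x10 board this gives 1, 2/3 and 1/2 for periods 2, 3 and 4, which matches the numbers above. Passing `ship_len=3` shows how often the wider spacings also let the 3-cell ships slip through.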
(Score: 2) by bart9h on Friday January 06, @07:37PM
Somewhat similar to strip-mining in Minecraft.