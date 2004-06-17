Over the years, scientists have worked on algorithms for curiosity, but copying human inquisitiveness has been tricky. For example, most methods aren't capable of assessing artificial agents' gaps in knowledge to predict what will be interesting before they see it. (Humans can sometimes judge how interesting a book will be by its cover.)

Todd Hester, a computer scientist currently at Google DeepMind in London, hoped to do better. "I was looking for ways to make computers learn more intelligently, and explore as a human would," he says. "Don't explore everything, and don't explore randomly, but try to do something a little smarter."

So Hester and Peter Stone, a computer scientist at the University of Texas at Austin, developed a new algorithm, Targeted Exploration with Variance-And-Novelty-Intrinsic-Rewards (TEXPLORE-VENIR), that relies on a technique called reinforcement learning. In reinforcement learning, a program tries something, and if the move brings it closer to some ultimate goal, such as the end of a maze, it receives a small reward and is more likely to try the maneuver again in the future. DeepMind has used reinforcement learning to allow programs to master Atari games and the board game Go through random experimentation. But TEXPLORE-VENIR, like other curiosity algorithms, also sets an internal goal: the program rewards itself for comprehending something new, even if the knowledge doesn't get it closer to the ultimate goal.
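To make the idea of a self-awarded reward concrete, here is a minimal sketch in Python. It is not TEXPLORE-VENIR: it swaps in plain tabular Q-learning, a textbook form of reinforcement learning, and adds a simple bonus that shrinks each time a state is revisited. The function names (`novelty_bonus`, `q_update`, `learn_from_transition`), the two-action corridor world, and the parameter values are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

ACTIONS = (-1, 1)  # hypothetical world: step left or right along a corridor


def novelty_bonus(visit_count, scale=1.0):
    """Intrinsic reward: largest on a first visit, smaller as a state grows familiar."""
    return scale / math.sqrt(visit_count)


def q_update(q, state, action, reward, next_state, done, alpha=0.5, gamma=0.9):
    """One-step Q-learning update; returns the new value estimate for (state, action)."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def learn_from_transition(q, visits, state, action, next_state,
                          extrinsic, done=False, bonus_scale=1.0):
    """Combine the external reward with the curiosity bonus, then update."""
    visits[next_state] += 1
    intrinsic = novelty_bonus(visits[next_state], bonus_scale)
    return q_update(q, state, action, extrinsic + intrinsic, next_state, done)


q = defaultdict(float)     # value estimates, all starting at zero
visits = defaultdict(int)  # how often each state has been reached

# A move that earns no external reward still gains value the first time
# it leads somewhere new, because the agent rewards itself for the novelty.
first = learn_from_transition(q, visits, state=0, action=1,
                              next_state=1, extrinsic=0.0)
print(first)  # 0.5
```

In this sketch the bonus for reaching a state falls from 1.0 on the first visit to 0.5 on the fourth, so the pull toward familiar places fades while any reward tied to the ultimate goal stays the same. A real curiosity algorithm would also need a rule for choosing actions, which is left out here.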