
SoylentNews is people

posted by hubie on Wednesday October 12 2022, @12:24AM   Printer-friendly

DeepMind unveils first AI to discover faster matrix multiplication algorithms:

Can artificial intelligence (AI) create its own algorithms to speed up matrix multiplication, one of machine learning’s most fundamental tasks? Today, in a paper published in Nature, DeepMind unveiled AlphaTensor, the “first artificial intelligence system for discovering novel, efficient and provably correct algorithms.” The Google-owned lab said the research “sheds light” on a 50-year-old open question in mathematics about finding the fastest way to multiply two matrices.

Ever since the Strassen algorithm was published in 1969, computer science has been on a quest to surpass its speed of multiplying two matrices. While matrix multiplication is one of algebra’s simplest operations, taught in high school math, it is also one of the most fundamental computational tasks and, as it turns out, one of the core mathematical operations in today’s neural networks.
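To make the contrast concrete, here is a minimal sketch (not from the article) of the "classroom" algorithm alongside Strassen's 1969 trick for the 2×2 base case, which uses 7 scalar multiplications instead of 8 and, applied recursively to matrix blocks, yields O(n^2.807) rather than O(n^3):

```python
# Naive "classroom" matrix multiplication: for n x n matrices this
# performs n**3 scalar multiplications.
def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            a = A[i][k]
            for j in range(p):
                C[i][j] += a * B[k][j]
    return C

# Strassen's insight for the 2x2 case: 7 multiplications instead
# of 8, at the cost of extra additions and subtractions. Recursing
# on block matrices gives the O(n**2.807) bound.
def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]
```

The scalar entries here can themselves be matrix blocks, which is how the recursion works in practice.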

[...] This research delves into how AI could be used to improve computer science itself, said Pushmeet Kohli, head of AI for science at DeepMind, at a press briefing.

“If we’re able to use AI to find new algorithms for fundamental computational tasks, this has enormous potential because we might be able to go beyond the algorithms that are currently used, which could lead to improved efficiency,” he said.

This is a particularly challenging task, he explained, because the process of discovering new algorithms is so difficult, and automating algorithmic discovery using AI requires a long and difficult reasoning process — from forming intuition about the algorithmic problem to actually writing a novel algorithm and proving that the algorithm is correct on specific instances.

“This is a difficult set of steps and AI has not been very good at that so far,” he said.

[...] According to DeepMind, AlphaTensor discovered algorithms that are more efficient than the state of the art for many matrix sizes and outperform human-designed ones.

AlphaTensor begins without any knowledge about the problem, Kohli explained, and then gradually learns what is happening and improves over time. “It first finds this classroom algorithm that we were taught, and then it finds historical algorithms such as Strassen’s and then at some point, it surpasses them and discovers completely new algorithms that are faster than previously.”

Kohli said he hopes that this paper inspires others to use AI to guide algorithmic discovery for other fundamental computational tasks. “We think this is a major step in our path towards really using AI for algorithmic discovery,” he said.

What other new algorithms will be discovered? I wonder when they will attempt to apply this to factorization?


Original Submission

 
This discussion was created by hubie (1068) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Informative) by stormwyrm on Wednesday October 12 2022, @03:55AM (1 child)

    by stormwyrm (717) on Wednesday October 12 2022, @03:55AM (#1276184) Journal

    Absolute hogwash when we're talking about the ordinary, everyday systems that most often arise in practice, especially those using the integer and floating-point data types provided by modern hardware. Even moving data around is not free, and these fancier algorithms do far more of it than the basic one. Strassen's algorithm, for instance, is actually counterproductive for matrices smaller than roughly 1000×1000 because of its substantial overhead. Even for larger matrices the performance gains are marginal, around 10% at best over the basic matrix multiplication algorithm. It also incurs a cost in increased memory usage (meaning that in addition to the eight megabytes each of your 1000×1000 double-precision floating-point matrices already consumes, you'll need several times more to multiply them) and, for floating-point implementations, decreased numerical stability. The Coppersmith–Winograd algorithm and this new one developed at DeepMind are likely even worse in these respects. There is a sizeable constant factor in the performance of these "better" algorithms, hidden by big-O notation, that makes them not worth using except in highly exceptional circumstances.
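    The hidden-constant point can be made concrete with a back-of-the-envelope operation count (a simplified model I'm sketching here, not anything from the article; the 18 block additions per recursion level are the standard count for Strassen's scheme, and real crossovers depend heavily on memory traffic and hardware):

    ```python
    # Idealized scalar-operation counts: Strassen trades 1 block
    # multiplication for 18 extra block additions per level, so the
    # asymptotic win only pays off once n is large.
    def naive_ops(n):
        # n**3 multiplications, n**2 * (n - 1) additions
        return n**3, n**2 * (n - 1)

    def strassen_ops(n, cutoff=1):
        # One recursion level: 7 sub-multiplications on n/2 blocks,
        # plus 18 additions/subtractions of (n/2) x (n/2) blocks.
        if n <= cutoff:
            return naive_ops(n)
        mul, add = strassen_ops(n // 2, cutoff)
        return 7 * mul, 7 * add + 18 * (n // 2) ** 2

    for n in (64, 1024):
        nm, na = naive_ops(n)
        sm, sa = strassen_ops(n)
        print(n, (nm + na) / (sm + sa))  # > 1 means Strassen uses fewer ops
    ```

    Even in this idealized count (which ignores data movement entirely), full-recursion Strassen uses more total operations than the naive algorithm at n = 64 and only pulls ahead by on the order of 10% around n = 1024, consistent with the crossover and marginal-gain figures above.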

    --
    Numquam ponenda est pluralitas sine necessitate.
  • (Score: 1) by khallow on Thursday October 13 2022, @03:22PM

    by khallow (3766) Subscriber Badge on Thursday October 13 2022, @03:22PM (#1276447) Journal

    Strassen's algorithm for instance is actually counterproductive for matrices that are less than around 1000 in size because of the algorithm's substantial overhead.

    You're also assuming the existence of fast vector operations, which many CPUs provide and which greatly speed up the basic matrix multiplication. On hardware without them, Strassen's algorithm achieves its advantage at much smaller matrix sizes.