IBM researchers use analog memory to train deep neural networks faster and more efficiently
Deep neural networks normally require fast, powerful graphics processing unit (GPU) hardware accelerators to support the needed high speed and computational accuracy — such as the GPU devices used in the just-announced Summit supercomputer. But GPUs are highly energy-intensive, making their use expensive and limiting their future growth, the researchers explain in a recent paper published in Nature.
Instead, the IBM researchers used large arrays of non-volatile analog memory devices (which use continuously variable signals rather than binary 0s and 1s) to perform computations. Those arrays allowed the researchers to create, in hardware, the same scale and precision of AI calculations that are achieved by more energy-intensive systems in software, but running hundreds of times faster and at hundreds of times lower power — without sacrificing the ability to create deep learning systems.
The trick was to replace conventional von Neumann architecture, which is "constrained by the time and energy spent moving data back and forth between the memory and the processor (the 'von Neumann bottleneck')," the researchers explain in the paper. "By contrast, in a non-von Neumann scheme, computing is done at the location of the data [in memory], with the strengths of the synaptic connections (the 'weights') stored and adjusted directly in memory."
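The "computing at the location of the data" idea can be sketched in a few lines: in a resistive crossbar, the weights live as device conductances, inputs arrive as voltages on the wires, and Ohm's and Kirchhoff's laws perform the multiply-accumulate in place, so the matrix-vector product never crosses a memory bus. The toy model below is an illustrative sketch only — the conductance values, the Gaussian read-noise model, and all variable names are assumptions, not IBM's actual device model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights stored as conductances in a 4x3 crossbar (arbitrary units).
G = rng.uniform(0.1, 1.0, size=(4, 3))

# Input activations applied as voltages on the column wires.
V = np.array([0.5, -0.2, 0.8])

# Ohm's law + Kirchhoff's current law: each row wire sums the currents
# flowing through its devices, performing the multiply-accumulate
# directly where the weights are stored.
I_ideal = G @ V

# Analog devices are noisy; model a small read noise on each output current.
I_analog = I_ideal + rng.normal(0.0, 0.01, size=I_ideal.shape)

print(I_ideal)
print(I_analog)
```

The precision question debated in the comments below comes from exactly that last step: the analog result carries device noise, unlike a digital multiply.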
Equivalent-accuracy accelerated neural-network training using analogue memory (DOI: 10.1038/s41586-018-0180-5)
(Score: 2) by frojack on Thursday June 21 2018, @03:54AM (3 children)
Yet, unless you are dealing in the noise, and doing computations N+6 places beyond the decimal point when your data is only accurate to N places, you just never find this stuff in actual use cases.
No, you are mistaken. I've always had this sig.
(Score: 4, Insightful) by c0lo on Thursday June 21 2018, @04:14AM (2 children)
Never? Mate, even a simple use case like an N-body problem with small N (say, launching a satellite towards an asteroid) will take those into account and provide a way of correcting the trajectory of that satellite.
Weather simulations? Rife with accumulating rounding errors; you need to take them into consideration (e.g. by running those simulations a good number of times and treating the obtained results statistically).
(What's with you conservative people, that you are so in love with absolutes? What's wrong with you; is it so hard to accept your human fallibility and the fact that you can be wrong?)
https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
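The accumulation c0lo describes is easy to reproduce, and compensated (Kahan) summation is one standard way of "providing for" the error. A minimal Python illustration — a textbook demonstration, not anything from the paper or the thread:

```python
def kahan_sum(xs):
    """Compensated (Kahan) summation: carry the low-order bits that
    plain floating-point addition discards at each step."""
    total, comp = 0.0, 0.0
    for x in xs:
        y = x - comp
        t = total + y
        comp = (t - total) - y  # the part of y that was rounded away
        total = t
    return total

# 0.1 has no exact binary representation, so each addition rounds,
# and the error accumulates over a million steps.
naive = 0.0
for _ in range(1_000_000):
    naive += 0.1

kahan = kahan_sum([0.1] * 1_000_000)

print(naive - 100000.0)  # nonzero accumulated drift
print(kahan - 100000.0)  # drift reduced to roughly one ulp
```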
(Score: 1) by ChrisMaple on Thursday June 21 2018, @07:39PM (1 child)
On a digital computer, with the same inputs, using the same program, with care taken to assure the same initial state, if the results are not always the same then the computer is defective. Rounding errors are always the same: they are deterministic.
(Score: 2) by c0lo on Thursday June 21 2018, @10:54PM
False. Multithreading will easily break the assertion above without the computer being defective.
https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
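The mechanism behind c0lo's point: floating-point addition is not associative, because each intermediate result is rounded, and the rounding depends on the order of operations. A multithreaded reduction combines partial sums in whatever order the scheduler happens to produce, so identical inputs can legitimately yield different results with no hardware defect. The order-dependence itself can be shown deterministically:

```python
# Two groupings of the same three addends round differently.
left = (0.1 + 0.2) + 0.3   # 0.1 + 0.2 rounds up to 0.30000000000000004
right = 0.1 + (0.2 + 0.3)  # 0.2 + 0.3 happens to round to exactly 0.5

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```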