
SoylentNews is people

posted by martyb on Tuesday August 23 2016, @12:41AM
from the we've-come-a-long-way-since-the-8087 dept.

ARM licensees will be able to operate on 16 times more information per vector per instruction than before, using ARM's new Scalable Vector Extensions:

Today ARM is announcing an update to its line of architecture license products. With the goal of moving ARM further into servers, the data center, and high-performance computing, the new license add-on tackles a fundamental data center and HPC issue: vector compute. ARMv8-A with Scalable Vector Extensions won't be part of any ARM microarchitecture license today, but for the semiconductor companies that build their own cores with the instruction set, this could see ARM move up into the HPC market. Fujitsu is the first public licensee on board, with plans to include ARMv8-A cores with SVE in the Post-K RIKEN supercomputer in 2020.

Scalable Vector Extensions (SVE) will be a flexible addition to the ISA, supporting vector lengths from 128 bits to 2048 bits. ARM has designed the extensions so that, if included in the hardware, the hardware is scalable: it doesn't matter whether the code being run assumes 128-bit, 512-bit or 2048-bit vectors; the scheduler will arrange the calculations to fit the hardware that is available. Thus 2048-bit code run on a 128-bit SVE core will manage the instructions in such a way as to complete the calculation, while 128-bit code on a 2048-bit core will attempt to improve IPC by bundling 128-bit calculations together. ARM's purpose here is to move the vector-width problem away from software and into hardware.


Original Submission

 
  • (Score: 2) by NCommander on Tuesday August 23 2016, @11:32AM

    by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Tuesday August 23 2016, @11:32AM (#392072) Homepage Journal

AltiVec on PowerPC was mostly disused because it's difficult to create a compiler that can auto-vectorize C code. Itanium basically suffered this problem on an architectural basis, since the entire design is essentially vector processing.

    --
    Still always moving
    Starting Score:    1  point
    Karma-Bonus Modifier   +1  

    Total Score:   2  
  • (Score: 2) by RamiK on Tuesday August 23 2016, @02:32PM

    by RamiK (1813) on Tuesday August 23 2016, @02:32PM (#392141)

    Worth linking: http://llvm.org/docs/Vectorizers.html [llvm.org] & https://gcc.gnu.org/projects/tree-ssa/vectorization.html [gnu.org]

For Java, Intel pushed some vectorization support into HotSpot, and there's still ongoing work. But the closed-source Oracle stack is said to have decent and well-rounded auto-vectorization.

So, if I had to guess, these features are aimed almost exclusively at enterprise Java. It's possible they even have specific applications in mind for this. Possibly in the financial sector...

    --
    compiling...
  • (Score: 2) by FatPhil on Tuesday August 23 2016, @06:19PM

    by FatPhil (863) <pc-soylentNO@SPAMasdf.fi> on Tuesday August 23 2016, @06:19PM (#392225) Homepage
AltiVec was no harder to auto-vectorise for than SSE* was. SSE* succeeded in the marketplace, therefore automatic vectorisation was not the issue.

Itanium was just VLIW, and so relied on heterogeneous operation parallelism rather than the homogeneous operation parallelism of vectorising. Heterogeneous scheduling is the slightly easier problem, because you can mix and match what goes together (any input, any output, any mathematical calculation, any address generation, etc.); vectorising requires an exact match, and to be a win requires several exact matches. Itanium failed in the marketplace because they thought the best way to beat AMD was to start playing a completely different game, and when they realised that x86 support was necessary, the support they added was utterly lousy (far worse than DEC's FX!32 x86 emulation on two-generation-old chips).

    This new "advance" is just going back to the x86_64 throw-tons-of-gates-on-the-chip logic that VLIW was in part a reaction against.

There are some who think that approach just makes hot, expensive-to-cool server farms, and that something like the VLIW attitude of making almost all of the scheduling decisions in the compiler is the way to go. The ones with the most groundbreaking ideas in that direction are the boffins behind the "Mill" architecture, who have decided to just reinvent *everything* from scratch, and who, if they ever reach tape-out, will have produced the only really revolutionary computer architecture in about four decades. Something like 13 talks are now up on YouTube; they're all worth a watch.
    --
    Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves