
posted by Fnord666 on Thursday March 23 2017, @07:23PM   Printer-friendly
from the big.Little-just-couldn't-decide dept.

ARM will replace the big.LITTLE cluster design with a new one, DynamIQ, that allows up to 8 CPU cores per cluster, different types of cores within a cluster, and anywhere from one to many (unlimited?) clusters:

The first stage of DynamIQ is a larger cluster paradigm - which means up to eight cores per cluster. But in a twist, there can be a variable core design within a cluster. Those eight cores could be different cores entirely, from different ARM Cortex-A families in different configurations.

Many questions come up here, such as how the cache hierarchy will allow threads to migrate between cores within a cluster (perhaps similar to how threads migrate between clusters on big.LITTLE today), even when cores have different cache arrangements. ARM did not yet go into that level of detail; however, we were told that more information will be provided in the coming months.

Each variable core-configuration cluster will be a part of a new fabric, which uses additional power saving modes and aims to provide much lower latency. The underlying design also allows each core to be controlled independently for voltage and frequency, as well as sleep states. Based on the slide diagrams, various other IP blocks, such as accelerators, should be able to be plugged into this fabric and benefit from that low latency. ARM cited use cases such as safety-critical automotive decisions as examples that can benefit from this.

A tri-cluster smartphone design using 2 high-end cores, 2 mid-level cores, and 4 low-power cores could be replaced by one that uses all three types of core in the same single cluster. The advantage of that approach remains to be seen.
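The excerpt's point that each core can be controlled independently for voltage and frequency can already be observed, in a coarser and typically per-cluster form, on current Linux big.LITTLE systems. Below is a minimal sketch (Python, reading the standard Linux cpufreq sysfs files; nothing here is an ARM or DynamIQ API) that prints each core's current frequency:

    # Sketch: read each core's current frequency via the Linux cpufreq sysfs.
    # These paths are generic Linux interfaces, not DynamIQ-specific.
    from pathlib import Path

    def core_frequencies_khz():
        freqs = {}
        cpus = Path("/sys/devices/system/cpu").glob("cpu[0-9]*")
        for cpu_dir in sorted(cpus, key=lambda p: int(p.name[3:])):
            f = cpu_dir / "cpufreq" / "scaling_cur_freq"
            if f.exists():
                freqs[cpu_dir.name] = int(f.read_text().strip())
        return freqs

    if __name__ == "__main__":
        for cpu, khz in core_frequencies_khz().items():
            print(f"{cpu}: {khz / 1000:.0f} MHz")

On a big.LITTLE device the big and little clusters usually report different frequencies; with DynamIQ the claim is that this independence extends down to individual cores within a cluster.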

More about ARM big.LITTLE.


Original Submission

 
  • (Score: 2, Disagree) by DannyB on Thursday March 23 2017, @08:55PM (3 children)

    by DannyB (5839) Subscriber Badge on Thursday March 23 2017, @08:55PM (#483395) Journal

    Don't think about language first. Think in terms of Map and Reduce frameworks. Look for problems that can be decomposed into much smaller pieces. Ideally the class of problems that are "embarrassingly parallel". For example, computing each pixel of an image of the Mandelbrot set. Or any other pixel computation where each pixel is computed independently of its neighbors (e.g., 3D rendering, many Photoshop filters, maybe video encode/decode). Or, if the setup/teardown overhead is too high for computing a single pixel, then divide the image into blocks of pixels that are computed iteratively. For example, break a 4096x4096 image into 256x256-pixel blocks and treat each block as a fundamental problem element.
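    As a concrete sketch of that block decomposition (Python and the standard multiprocessing module here purely for illustration; the image size, block size, and iteration cap are arbitrary example parameters):

        # Sketch: embarrassingly parallel Mandelbrot, split into independent blocks.
        from multiprocessing import Pool

        SIZE, BLOCK, MAX_ITER = 1024, 256, 100   # arbitrary example parameters

        def render_block(origin):
            """Compute escape counts for one BLOCK x BLOCK tile, independent of all others."""
            ox, oy = origin
            tile = []
            for py in range(oy, oy + BLOCK):
                row = []
                for px in range(ox, ox + BLOCK):
                    # Map the pixel to a point in the complex plane.
                    c = complex(-2.0 + 3.0 * px / SIZE, -1.5 + 3.0 * py / SIZE)
                    z, n = 0j, 0
                    while abs(z) <= 2.0 and n < MAX_ITER:
                        z = z * z + c
                        n += 1
                    row.append(n)
                tile.append(row)
            return origin, tile

        if __name__ == "__main__":
            origins = [(x, y) for y in range(0, SIZE, BLOCK) for x in range(0, SIZE, BLOCK)]
            with Pool() as pool:                 # scale by adding more worker processes/cores
                tiles = dict(pool.map(render_block, origins))
            print(f"rendered {len(tiles)} blocks")

    Each block is handed to whatever worker is free, so adding cores just shortens the wall-clock time without changing the code.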

    Don't look at C / C++. Look at higher level languages. Examples: Clojure, Erlang, and others. The overhead of the runtime substrates for the higher level languages is not as important as the need to be able to easily scale the problem by simply throwing more CPUs at it. If my high level language solution is ten times slower than your C++ code, but I can trivially throw more CPUs at my solution to scale up to any level I please, then I have a winner. You shouldn't even be thinking in terms of C++ thread models. Think "Forbidden Planet": that machine is going to provide whatever amount of power your monster needs to achieve its goals.

    The problem needs to be decomposable into small parts. Ideally very small, hence "embarrassingly parallel" like many pixel-based problems. As a counter example, I would offer the problem Spock asked of the Enterprise computer: compute to the last digit the absolute value of Pi. The idea was that more and more "banks" of the computer would work on the problem until capacity was exhausted. I have difficulty imagining how this particular problem could achieve that, since it is not obvious to me how the computation of an infinitely long Pi could be done in parallel.

    --
    The lower I set my standards the more accomplishments I have.
  • (Score: 2, Interesting) by Anonymous Coward on Thursday March 23 2017, @09:05PM (1 child)

    by Anonymous Coward on Thursday March 23 2017, @09:05PM (#483399)

    The understanding of the details on which those high-level frameworks are built is slowly being locked away in the walled gardens of giant corporations. The only way to solve problems will be to do so within their set of concepts, because it will be too complex to reverse-engineer just what's going on.

    The user is being pushed to increasingly higher levels of abstraction (as you note), which are attached to reality through carefully guarded industry secrets. The world of computing is ever more magical.

    • (Score: 2) by Scruffy Beard 2 on Friday March 24 2017, @08:46AM

      by Scruffy Beard 2 (6030) on Friday March 24 2017, @08:46AM (#483570)

      I have an idle long-term plan for that: Build an auditable computer from scratch. Would probably take decades though.

      It would involve fuse ROMs programmed through CRC-protected toggle switches, then using those ROMs to build peripherals like keyboards and monitors that you can trust.

      Would involve code correctness proofs as well. I am hoping that as complexity goes up, the formal proofs will greatly reduce debugging time.

      Goes off to start dreaming for reals.

  • (Score: 2) by jmorris on Friday March 24 2017, @01:38AM

    by jmorris (4844) on Friday March 24 2017, @01:38AM (#483470)

    Which is great if your program only needs to run a few times. Otherwise, if it runs ten times slower, it requires ten times the electricity and ten times the data center capacity. So many people make that mistake, deploying scripting and other toy/fad/academic languages into production, and only when crunch time comes do they realize that the survival of the company now depends on replacing that hot mess with real code before the hosting bills from chasing the load bankrupt them, or before the error messages start chasing off the users who are finally swarming in. Remember the fail whale; don't be Jack. They barely survived the mistake.