
posted by Fnord666 on Thursday March 23 2017, @07:23PM   Printer-friendly
from the big.Little-just-couldn't-decide dept.

ARM will replace the big.LITTLE cluster design with a new one, DynamIQ, that allows up to 8 CPU cores per cluster, different types of cores within a cluster, and anywhere from one to many (unlimited?) clusters:

The first stage of DynamIQ is a larger cluster paradigm - which means up to eight cores per cluster. But in a twist, there can be a variable core design within a cluster. Those eight cores could be different cores entirely, from different ARM Cortex-A families in different configurations.

Many questions come up here, such as how the cache hierarchy will allow threads to migrate between cores within a cluster (perhaps similar to how threads migrate between clusters on big.LITTLE today), even when cores have different cache arrangements. ARM did not yet go into that level of detail; however, we were told that more information will be provided in the coming months.

Each variable core-configuration cluster will be a part of a new fabric, which uses additional power saving modes and aims to provide much lower latency. The underlying design also allows each core to be controlled independently for voltage and frequency, as well as sleep states. Based on the slide diagrams, various other IP blocks, such as accelerators, should be able to be plugged into this fabric and benefit from that low latency. ARM cited use cases such as safety-critical automotive decisions as examples that can benefit from this.

A tri-cluster smartphone design using 2 high-end cores, 2 mid-level cores, and 4 low-power cores could be replaced by one that uses all three types of core in the same single cluster. The advantage of that approach remains to be seen.
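The excerpt's point that each core can be controlled independently for voltage and frequency can already be observed, in a coarser and typically per-cluster form, on current Linux big.LITTLE systems. Below is a minimal sketch (Python, reading the standard Linux cpufreq sysfs files; nothing here is an ARM or DynamIQ API) that prints each core's current frequency:

    # Sketch: read each core's current frequency via the Linux cpufreq sysfs.
    # These paths are generic Linux interfaces, not DynamIQ-specific.
    from pathlib import Path

    def core_frequencies_khz():
        freqs = {}
        cpus = Path("/sys/devices/system/cpu").glob("cpu[0-9]*")
        for cpu_dir in sorted(cpus, key=lambda p: int(p.name[3:])):
            f = cpu_dir / "cpufreq" / "scaling_cur_freq"
            if f.exists():
                freqs[cpu_dir.name] = int(f.read_text().strip())
        return freqs

    if __name__ == "__main__":
        for cpu, khz in core_frequencies_khz().items():
            print(f"{cpu}: {khz / 1000:.0f} MHz")

On a big.LITTLE device the big and little clusters usually report different frequencies; with DynamIQ the claim is that this independence extends down to individual cores within a cluster.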

More about ARM big.LITTLE.


Original Submission

 
  • (Score: 2, Disagree) by DannyB on Thursday March 23 2017, @08:55PM (3 children)

    by DannyB (5839) Subscriber Badge on Thursday March 23 2017, @08:55PM (#483395) Journal

    Don't think about language first. Think in terms of Map and Reduce frameworks. Look for problems that can be decomposed into much smaller pieces. Ideally the class of problems that are "embarrassingly parallel". For example, computing each pixel of an image of the Mandelbrot set. Or any other pixel computation where each pixel is computed independently of its neighbors (e.g., 3D rendering, many Photoshop filters, maybe video encode/decode). Or, if the setup/teardown overhead is too high for computing a single pixel, then divide the image into blocks of pixels that are computed iteratively. For example, break a 4096x4096 image into 256x256-pixel blocks and treat each block as a fundamental problem element.
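    As a concrete sketch of that block decomposition (Python and the standard multiprocessing module here purely for illustration; the image size, block size, and iteration cap are arbitrary example parameters):

        # Sketch: embarrassingly parallel Mandelbrot, split into independent blocks.
        from multiprocessing import Pool

        SIZE, BLOCK, MAX_ITER = 1024, 256, 100   # arbitrary example parameters

        def render_block(origin):
            """Compute escape counts for one BLOCK x BLOCK tile, independent of all others."""
            ox, oy = origin
            tile = []
            for py in range(oy, oy + BLOCK):
                row = []
                for px in range(ox, ox + BLOCK):
                    # Map the pixel to a point in the complex plane.
                    c = complex(-2.0 + 3.0 * px / SIZE, -1.5 + 3.0 * py / SIZE)
                    z, n = 0j, 0
                    while abs(z) <= 2.0 and n < MAX_ITER:
                        z = z * z + c
                        n += 1
                    row.append(n)
                tile.append(row)
            return origin, tile

        if __name__ == "__main__":
            origins = [(x, y) for y in range(0, SIZE, BLOCK) for x in range(0, SIZE, BLOCK)]
            with Pool() as pool:                 # scale by adding more worker processes/cores
                tiles = dict(pool.map(render_block, origins))
            print(f"rendered {len(tiles)} blocks")

    Each block is handed to whatever worker is free, so adding cores just shortens the wall-clock time without changing the code.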

    Don't look at C / C++. Look at higher level languages. Examples: Clojure, Erlang, and others. The overhead of the runtime substrates for the higher level languages is not as important as the need to be able to easily scale the problem by simply throwing more CPUs at it. If my high level language solution is ten times slower than your C++ code, but I can trivially throw more CPUs at my solution to scale up to any level I please, then I have a winner. You shouldn't even be thinking in terms of C++ thread models. Think "Forbidden Planet": that machine is going to provide whatever amount of power your monster needs to achieve its goals.

    The problem needs to be decomposable into small parts. Ideally very small, hence "embarrassingly parallel" like many pixel-based problems. As a counter example, I would offer the problem Spock asked of the Enterprise computer: compute to the last digit the absolute value of Pi. The idea was that more and more "banks" of the computer would work on the problem until capacity was exhausted. I have difficulty imagining how this particular problem could achieve that, since it is not obvious to me how the computation of an infinitely long Pi could be done in parallel.

    --
    The lower I set my standards the more accomplishments I have.
  • (Score: 2, Interesting) by Anonymous Coward on Thursday March 23 2017, @09:05PM (1 child)

    by Anonymous Coward on Thursday March 23 2017, @09:05PM (#483399)

    The understanding of the details on which those high-level frameworks are built is slowly being locked away in the walled gardens of giant corporations. The only way to solve problems will be to do so within their set of concepts, because it will be too complex to reverse-engineer just what's going on.

    The user is being pushed to increasingly higher levels of abstraction (as you note), which are attached to reality through carefully guarded industry secrets. The world of computing is ever more magical.

    • (Score: 2) by Scruffy Beard 2 on Friday March 24 2017, @08:46AM

      by Scruffy Beard 2 (6030) on Friday March 24 2017, @08:46AM (#483570)

      I have an idle long-term plan for that: Build an auditable computer from scratch. Would probably take decades though.

      It would involve fuse ROMs programmed through CRC-protected toggle switches, then using those ROMs to build peripherals like keyboards and monitors that you can trust.

      Would involve code correctness proofs as well. I am hoping that as complexity goes up, the formal proofs will greatly reduce debugging time.

      Goes off to start dreaming for reals.

  • (Score: 2) by jmorris on Friday March 24 2017, @01:38AM

    by jmorris (4844) on Friday March 24 2017, @01:38AM (#483470)

    Which is great if your program only needs to run a few times. Otherwise, if it runs ten times slower, it requires ten times the electricity and ten times the data center capacity. So many people make that mistake, deploying scripting and other toy/fad/academic languages into production, and only when crunch time comes do they realize that the survival of the company now depends on replacing that hot mess with real code before the hosting bills from chasing the load bankrupt them, or before the error messages start chasing off the users who are finally swarming in. Remember the fail whale; don't be Jack. They barely survived the mistake.