

posted by Fnord666 on Thursday March 23 2017, @07:23PM
from the big.Little-just-couldn't-decide dept.

ARM will replace the big.LITTLE cluster design with a new one, DynamIQ, that allows up to 8 CPU cores per cluster, different types of cores within a cluster, and anywhere from one to many (unlimited?) clusters:

The first stage of DynamIQ is a larger cluster paradigm - which means up to eight cores per cluster. But in a twist, there can be a variable core design within a cluster. Those eight cores could be different cores entirely, from different ARM Cortex-A families in different configurations.

Many questions come up here, such as how the cache hierarchy will allow threads to migrate between cores within a cluster (perhaps similar to how threads migrate between clusters on big.LITTLE today), even when cores have different cache arrangements. ARM did not yet go into that level of detail; however, we were told that more information will be provided in the coming months.

Each variable core-configuration cluster will be part of a new fabric, which uses additional power-saving modes and aims to provide much lower latency. The underlying design also allows each core to be controlled independently for voltage and frequency, as well as sleep states. Based on the slide diagrams, various other IP blocks, such as accelerators, should be able to be plugged into this fabric and benefit from that low latency. ARM cited examples, such as safety-critical automotive decisions, that can benefit from this.

A tri-cluster smartphone design using 2 high-end cores, 2 mid-level cores, and 4 low-power cores could be replaced by one that uses all three types of core in the same single cluster. The advantage of that approach remains to be seen.

More about ARM big.LITTLE.


Original Submission

 
This discussion has been archived. No new comments can be posted.
  • (Score: 2) by Snotnose on Thursday March 23 2017, @07:40PM (4 children)

    by Snotnose (1623) on Thursday March 23 2017, @07:40PM (#483360)

    Their high end chips had an old ARM (7?) for low level stuff, an ARM9 driving the phone, and an ARM11 for apps. All on 1 chunk of silicon. Haven't been there in 7-8 years, dunno what's in their new chips.

    --
    If you're talking about me behind my back, remember you're in a great position to kiss my ass.
    • (Score: 2) by FatPhil on Thursday March 23 2017, @07:44PM (2 children)

      by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Thursday March 23 2017, @07:44PM (#483366) Homepage
      If it was a phone SoC, it probably had up to 7 arm cores on it. The modem would have had an arm, the wi-fi would have had an arm too, ...
      --
      If vaccination works, then why doesn't eucharist protect kids against Christianity?
      • (Score: 2) by DannyB on Thursday March 23 2017, @08:24PM (1 child)

        by DannyB (5839) Subscriber Badge on Thursday March 23 2017, @08:24PM (#483380)

        That design does not seem very generalized.

        The workload should be able to shift to different cores if a significant number of ARM cores are destroyed within a Borg vessel.

        --
        ALL LIABILITY IS EXPRESSLY DISCLAIMED FOR PERSONAL INJURY OR DEATH THAT RESULTS FROM READING THE SOURCE CODE.
        • (Score: 4, Informative) by jmorris on Friday March 24 2017, @01:27AM

          by jmorris (4844) <jmorrisNO@SPAMbeau.org> on Friday March 24 2017, @01:27AM (#483466)

          The radio is always kept isolated to prevent the insecure Android side from possibly being able to get at the physical interface of the radio. Putting in an entirely separate CPU, RAM and flash, often with only a serial link to the main CPU, is secure, especially since the radio processor tends to only boot signed images. Some get cheap and use the newer ARM CPUs' ability to partition off a really secure section of memory and run in a super, better-than-ring-0 mode, but I bet the FCC doesn't like it and makes that known. The Wi-Fi is the same way: dedicated signed firmware on a dedicated CPU, usually connected by an internal USB link. Because those radios are basically a software-defined radio that is physically capable of all sorts of fun things... IF we could get our hands on them. The FCC ain't having none of that.

          It really is crazy how many processing units a phone can stuff in. My old crappy Tegra 3 based phone has four fast ARM cores, one slow ARM core, and one ARM "AVP" core as a co-processor (it is actually the boot processor: it does the secure boot stuff and starts the main one, then idles with its own dedicated 256K block of static RAM to help, along with yet another undocumented specialty processing unit, play media files and do sleep/wake, etc.). Then there is a crypto processor that Nvidia won't document in the tech manual, and a couple of GPU cores, while the radio is on an entirely different chip made by Intel with dedicated RAM/flash. Same for BT, GPS and NFC: they have a small CPU in them, type unknown, and there is even a little one in the SIM card. It truly is amazing the computing plenty we take for granted.

    • (Score: 2) by Hairyfeet on Friday March 24 2017, @07:34AM

      by Hairyfeet (75) <reversethis-{moc ... {8691tsaebssab}> on Friday March 24 2017, @07:34AM (#483555) Journal

      I think they are still making those; they are used in a couple of the $150-$200 BLU phones I've been looking at, as well as some Alcatel One Touch models. IIRC the new ones are octa-cores and have 2 of the ARM 7s for low-power tasks like checking email with the screen off, 2 of the ARM 9s for phone tasks, and 4 of the ARM 11s for the apps. Pretty impressive if you ask me.

      --
      ACs are never seen so don't bother. Always ready to show SJWs for the racists they are.
  • (Score: 0) by Anonymous Coward on Thursday March 23 2017, @07:41PM (6 children)

    by Anonymous Coward on Thursday March 23 2017, @07:41PM (#483363)

    General-purpose programming languages cannot keep up with these special-purpose designs; have you read the latest C++/C thread model? It makes no sense! It's as understandable as Quantum Mechanics.

    Add to this increasing complexity the looming specter of proprietary computing "fabrics", and there's just no room for anything but massive, monopolistic, top-down, dictatorial, walled-garden, magical corporate overlording.

    Personal Computing is dead.

    • (Score: 3, Funny) by LoRdTAW on Thursday March 23 2017, @07:53PM

      by LoRdTAW (3755) Subscriber Badge on Thursday March 23 2017, @07:53PM (#483370) Journal

      You gotta learn to app! Only apps matter. You also need to let go of control and also IoT more.

    • (Score: 2, Disagree) by DannyB on Thursday March 23 2017, @08:55PM (3 children)

      by DannyB (5839) Subscriber Badge on Thursday March 23 2017, @08:55PM (#483395)

      Don't think about language first. Think in terms of Map and Reduce frameworks. Look for problems that can be decomposed into much smaller pieces, ideally the class of problems that are "embarrassingly parallel". For example, computing each pixel of an image of the Mandelbrot set, or any other pixel computation where each pixel is computed independently of its neighbors (e.g., 3D rendering, many Photoshop filters, maybe video encode/decode). Or, if the setup/teardown overhead is too high for computing a single pixel, divide the image into blocks of pixels that are computed iteratively. For example, break a 4096x4096 image into 256x256-pixel blocks and treat each block as a fundamental problem element.

      Don't look at C / C++. Look at higher-level languages: Clojure, Erlang, and others. The overhead of the runtime substrates for the higher-level languages is not as important as the need to easily scale the problem by simply throwing more CPUs at it. If my high-level language solution is ten times slower than your C++ code, but I can trivially throw more CPUs at my solution to scale up to any level I please, then I have a winner. You shouldn't even be thinking in terms of C++ thread models. Think "Forbidden Planet": that machine is going to provide whatever amount of power your monster needs to achieve its goals.

      The problem needs to be decomposable into small parts. Ideally very small, hence "embarrassingly parallel" like many pixel-based problems. As a counterexample, I would offer the problem Spock asked of the Enterprise computer: compute to the last digit the absolute value of Pi. The idea was that more and more "banks" of the computer would work on the problem until capacity was exhausted. I have difficulty imagining how this particular problem could achieve that, since it is not obvious to me how the computation of an infinitely long Pi could be done in parallel.
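      A minimal sketch of that block decomposition, mapping Mandelbrot tiles across cores with Python's multiprocessing (the sizes, names, and coordinate window here are illustrative, not from any real renderer):

      ```python
      from multiprocessing import Pool

      WIDTH, HEIGHT = 256, 256   # full image, in pixels
      BLOCK = 64                 # side of each square tile
      MAX_ITER = 50

      def mandel(cx, cy):
          """Escape count for one point; MAX_ITER means 'did not escape'."""
          x = y = 0.0
          for i in range(MAX_ITER):
              if x * x + y * y > 4.0:
                  return i
              x, y = x * x - y * y + cx, 2 * x * y + cy
          return MAX_ITER

      def render_block(origin):
          """Compute every pixel of one BLOCK x BLOCK tile independently."""
          ox, oy = origin
          return [[mandel(-2.0 + 3.0 * (ox + i) / WIDTH,
                          -1.5 + 3.0 * (oy + j) / HEIGHT)
                   for i in range(BLOCK)] for j in range(BLOCK)]

      if __name__ == "__main__":
          # Each tile is a fundamental problem element; the pool maps them
          # across however many CPUs the machine happens to have.
          origins = [(x, y) for y in range(0, HEIGHT, BLOCK)
                            for x in range(0, WIDTH, BLOCK)]
          with Pool() as pool:
              blocks = pool.map(render_block, origins)
          print(len(blocks), "blocks rendered")
      ```

      Scaling up is then just a matter of giving the pool more cores; no tile ever looks at its neighbors.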

      --
      ALL LIABILITY IS EXPRESSLY DISCLAIMED FOR PERSONAL INJURY OR DEATH THAT RESULTS FROM READING THE SOURCE CODE.
      • (Score: 2, Interesting) by Anonymous Coward on Thursday March 23 2017, @09:05PM (1 child)

        by Anonymous Coward on Thursday March 23 2017, @09:05PM (#483399)

        The understanding of the details on which those high-level frameworks are built is slowly being locked away in the walled gardens of giant corporations. The only way to solve problems will be to do so within their set of concepts, because it will be too complex to reverse-engineer just what's going on.

        The user is being pushed to increasingly higher levels of abstraction (as you note), which are attached to reality through carefully guarded industry secrets. The world of computing is ever more magical.

        • (Score: 2) by Scruffy Beard 2 on Friday March 24 2017, @08:46AM

          by Scruffy Beard 2 (6030) on Friday March 24 2017, @08:46AM (#483570)

          I have an idle long-term plan for that: Build an auditable computer from scratch. Would probably take decades though.

          It would involve fuse ROMs programmed through CRC-protected toggle switches, then using those ROMs to build peripherals like keyboards and monitors that you can trust.

          Would involve code correctness proofs as well. I am hoping that as complexity goes up, the formal proofs will greatly reduce debugging time.

          Goes off to start dreaming for reals.
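          The CRC-protected entry part could be as simple as this sketch (Python standing in for the programmer-box logic; the polynomial, bit order, and function names are arbitrary assumptions, not from any real design):

          ```python
          def crc8(bits, poly=0x07):
              """Bit-serial CRC-8 over a sequence of 0/1 bits, e.g. read
              off a bank of toggle switches one switch at a time."""
              crc = 0
              for bit in bits:
                  top = (crc >> 7) & 1      # bit about to shift out
                  crc = (crc << 1) & 0xFF   # shift the register left
                  if top ^ bit:             # feed back the polynomial
                      crc ^= poly
              return crc

          def verify_entry(data_bits, check_value):
              """Accept a hand-toggled word only if its CRC matches the
              separately entered check value."""
              return crc8(data_bits) == check_value
          ```

          A single flipped switch changes the CRC, so a mis-entered word would be rejected before it burns any fuses.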

      • (Score: 2) by jmorris on Friday March 24 2017, @01:38AM

        by jmorris (4844) <jmorrisNO@SPAMbeau.org> on Friday March 24 2017, @01:38AM (#483470)

        Which is great if your program only needs to run a few times. Otherwise, if it runs ten times slower it requires ten times the electricity and ten times the data center capacity. So many people make that mistake, deploying scripting and other toy/fad/academic languages into production, and only when crunch time comes do they realize that the survival of the company now depends on replacing that hot mess with real code before the hosting bills from chasing the load bankrupt them, or the error messages start chasing off the users who are finally swarming in. Remember the fail whale; don't be Jack. They barely survived the mistake.

    • (Score: 2) by Bot on Friday March 24 2017, @06:35AM

      by Bot (3902) Subscriber Badge on Friday March 24 2017, @06:35AM (#483543)

      If you think that's bad, consider what systemd will do on that platform. On ODROID (ARM), systemd is not able to systematically close the network connection before rebooting, so the ssh client is left waiting half of the time.

      The good news is that if you are into genetic algorithms you can write JavaScript on Chrome on systemd on DynamIQ and let the platform itself evolve the code. Skynet has to start somewhere.

  • (Score: 1, Redundant) by FatPhil on Thursday March 23 2017, @07:57PM (3 children)

    by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Thursday March 23 2017, @07:57PM (#483372) Homepage
    My understanding, from people who do little things like maintain (as in the real official maintainers of) the linux kernel for arm-based chip families, and power-management subsystems, is that big.LITTLE isn't even fully working yet, despite it being several years old, and that the more complicated designs (such as min.med.max, as covered here a couple of weeks ago) have no power-saving benefit over a big.LITTLE design.

    Previous slideware not delivering what was promised, therefore introduce new slideware with even bigger promises?

    ARM ain't what they used to be a half a decade or so ago, they're turning into proper bullshit artists.
    --
    If vaccination works, then why doesn't eucharist protect kids against Christianity?
    • (Score: 2) by bob_super on Thursday March 23 2017, @08:52PM (2 children)

      by bob_super (1357) on Thursday March 23 2017, @08:52PM (#483389)

      If you do industrial embedded designs, you customize your tasks and scheduler to properly use the right cores. BIG.little and similar schemes are great.

      If you do general computing, good luck figuring out the universal rule for Extremely Varied Apps With Greedy Coders ("I'll run the clock on the A72 so my user doesn't feel any lag").
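      On Linux, the userspace end of "use the right cores" is at least easy to express with CPU affinity; a sketch (which cores form the big cluster is hypothetical and entirely SoC-specific):

      ```python
      import os

      def pin_to(cores):
          """Restrict the current process to the given CPU set (Linux only)
          and return the affinity mask actually in effect."""
          os.sched_setaffinity(0, set(cores))
          return os.sched_getaffinity(0)

      if __name__ == "__main__":
          # On a hypothetical SoC where cores 4-7 are the big cluster, a
          # latency-sensitive task might do: pin_to(range(4, 8))
          print(pin_to([0]))  # core 0 exists on every machine
      ```

      The hard part, as noted above, is deciding the policy: every greedy app pinning itself to the A72s defeats the whole power-saving scheme.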

      • (Score: 2) by FatPhil on Friday March 24 2017, @08:02AM (1 child)

        by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Friday March 24 2017, @08:02AM (#483563) Homepage
        Nit - it's big.LITTLE, not BIG.little. And yes, I know how you're supposed to make use of it in theory, but my point is that it's still not delivering all of the benefits that were promised. "Customising the scheduler" isn't something that you can just do on a lazy friday afternoon, teams of dozens of engineers working for years still haven't got it right. I know, I've worked with them. I have about 3 years of being a linux kernel developer, in an ARM SoC environment, on my CV.
        --
        If vaccination works, then why doesn't eucharist protect kids against Christianity?
  • (Score: 1, Funny) by Anonymous Coward on Thursday March 23 2017, @08:00PM (3 children)

    by Anonymous Coward on Thursday March 23 2017, @08:00PM (#483373)

    Imagine a Beowulf ARM cluster.

    • (Score: 3, Funny) by DannyB on Thursday March 23 2017, @08:58PM

      by DannyB (5839) Subscriber Badge on Thursday March 23 2017, @08:58PM (#483396)

      Sir, your imagination is limited.

      Imagine a Beowulf cluster of Beowulf clusters.

      It's Beowulf clusters all the way down. An infinite recursion. The final cluster of that infinite fuster cluck is built on ARM chips.

      --
      ALL LIABILITY IS EXPRESSLY DISCLAIMED FOR PERSONAL INJURY OR DEATH THAT RESULTS FROM READING THE SOURCE CODE.
    • (Score: 0) by Anonymous Coward on Thursday March 23 2017, @10:50PM

      by Anonymous Coward on Thursday March 23 2017, @10:50PM (#483425)

      Like this [clusterhat.com] or like this [networkworld.com]?

    • (Score: 0) by Anonymous Coward on Friday March 24 2017, @05:15AM

      by Anonymous Coward on Friday March 24 2017, @05:15AM (#483521)

      Dumb jokes never die, they just migrate to other websites.
