posted by Fnord666 on Sunday December 20 2020, @10:02PM

After a gentle introduction to how computers work at the hardware level, this article offers an interesting perspective on the future of computing and how RISC-V fits into it.

By now it is pretty clear that Apple's M1 chip is a big deal. And the implications for the rest of the industry are gradually becoming clearer. In this story I want to talk about a connection to RISC-V microprocessors which may not be obvious to most readers.

Let me give you some background first: Why Is Apple's M1 Chip So Fast?

In that story I talked about two factors driving M1 performance. One was the use of a massive number of decoders and Out-of-Order Execution (OoOE). Don't worry if that sounds like technological gobbledegook to you.

This story will be all about the other part: heterogeneous computing. Apple has aggressively pursued a strategy of adding specialized hardware units, which I will refer to as coprocessors throughout this article:

Related:
Why is Apple's M1 Chip So Fast?


Original Submission

Related Stories

Why is Apple’s M1 Chip So Fast? 77 comments

A Medium article

On YouTube I watched a Mac user who had bought an iMac last year. It was maxed out with 40 GB of RAM, costing him about $4000. He watched in disbelief as his hyper-expensive iMac was demolished by his new M1 Mac Mini, which he had paid a measly $700 for.

In real-world test after test, the M1 Macs are not merely inching past top-of-the-line Intel Macs, they are destroying them. In disbelief, people have started asking how on earth this is possible.

If you are one of those people, you have come to the right place. Here I plan to break down into digestible pieces exactly what it is that Apple has done with the M1.

Related:
What Does RISC and CISC Mean in 2020?


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2, Disagree) by takyon on Sunday December 20 2020, @10:03PM (14 children)

    by takyon (881) <takyonNO@SPAMsoylentnews.org> on Sunday December 20 2020, @10:03PM (#1089706) Journal

    You will use x86, ARM, RISC-V or whatever as needed, or as someone else dictates (Apple will make bank from the switch to ARM). ARM will be able to repel RISC-V from being a main CPU replacement because the licensing costs aren't even that much and it has a gigantic share in many markets. ARM controlling a bunch of RISC-V coprocessors? Sure.

    AMD and Intel x86 will add new coprocessors:

    Intel reveals low-power Clover Falls AI companion chip for EVO-certified laptops [notebookcheck.net]
    Upcoming Zen2+Navi2 APU VanGogh has a CVML [ip] block: Computer Vision and Machine Learning accelerator? [reddit.com]

    I don't think the coprocessors are a lasting advantage for the Apple M1. AMD and Intel are just behind the curve. Future x86 CPUs will become more like smartphone SoCs, since you can get a big impact from a small amount of die area.

    The article notes Nvidia's use of RISC-V. But Nvidia will soon own ARM, pending regulatory approval. It will be interesting to see what they do and if they try to make a push into killing/displacing x86 with their own ARM CPU + GPU combos.

    The big future increases in computing performance will require monolithic 3D integration of memory and compute (the 3DSoC). 3D/stacked SRAM is also needed because it hasn't scaled down very well with node shrinks.

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 1, Interesting) by Anonymous Coward on Sunday December 20 2020, @11:18PM (8 children)

      by Anonymous Coward on Sunday December 20 2020, @11:18PM (#1089724)

      I think all the Android applications that use native code will be the real hurdle to RISC-V gaining traction in the consumer market. This was one of the issues when Intel tried to make a bid for x86 in mobile devices.

      But, I also suspect licensing costs are not insignificant (at scale) or we wouldn't see folks like Western Digital switching to RISC-V for disk controllers. But, this use falls into your "co-processor" use case which isn't very interesting to me as a hobbyist. Apparently licensing costs are not onerous either since Apple, NVIDIA (pre-ARM purchase), and Qualcomm all had their own independent core designs yet continued to license the ARM ISA-- with only Qualcomm really tied to ARM compatibility for selling its products.

      With all the sabre rattling from the US, China will be pretty motivated to use technology not tied to US companies. They might decide to make something amazing with RISC-V. If they do, I hope they consider making it available for export. Currently, the only RISC-V SBCs capable of running (even patched) Linux are either very low-end (without even an MMU) or pretty low-end and very expensive.

      • (Score: 3, Informative) by Grishnakh on Sunday December 20 2020, @11:39PM (4 children)

        by Grishnakh (2831) on Sunday December 20 2020, @11:39PM (#1089728)

        With all the sabre rattling from the US, China will be pretty motivated to use technology not tied to US companies.

        Probably, but ARM Ltd. is a UK company. When Samsung licenses ARM tech and makes their own CPUs, the US isn't involved in any way.

        • (Score: 1, Informative) by Anonymous Coward on Monday December 21 2020, @12:12AM (1 child)

          by Anonymous Coward on Monday December 21 2020, @12:12AM (#1089741)

          NVIDIA bought ARM, and NVIDIA is a US company. Not sure if/how independent ARM is from its parent. I suspect WRT US sanctions, it will be considered US tech export.

          • (Score: 2) by Grishnakh on Monday December 21 2020, @04:22AM

            by Grishnakh (2831) on Monday December 21 2020, @04:22AM (#1089789)

            According to some other comments here, and to Wikipedia (https://en.wikipedia.org/wiki/Arm_Ltd.), it seems that ARM Ltd. is owned by SoftBank, which is Japanese. It's not yet owned by Nvidia, though that's in-process. So for the moment, it's not owned by an American company.

        • (Score: 2) by JoeMerchant on Monday December 21 2020, @01:24AM (1 child)

          by JoeMerchant (3937) on Monday December 21 2020, @01:24AM (#1089757)

          So, didn't SoftBank buy ARM back around 2017? And I think they've sold it again. As I recall, The Rump somehow blessed that SoftBank deal back in '17 and their stock jumped substantially as a result.

          --
          🌻🌻 [google.com]
          • (Score: 0) by Anonymous Coward on Monday December 21 2020, @01:54AM

            by Anonymous Coward on Monday December 21 2020, @01:54AM (#1089766)

            SoftBank is currently selling it to NVidia, if the deal goes through.

      • (Score: 0) by Anonymous Coward on Sunday December 20 2020, @11:42PM

        by Anonymous Coward on Sunday December 20 2020, @11:42PM (#1089730)

        The US government is pretty interested in RISC-V itself.

      • (Score: 3, Interesting) by takyon on Sunday December 20 2020, @11:50PM (1 child)

        by takyon (881) <takyonNO@SPAMsoylentnews.org> on Sunday December 20 2020, @11:50PM (#1089731) Journal

        Yeah, Western Digital storage controllers != main CPU replacement. I think it might be the biggest confirmed volume of RISC-V products, too:

        Western Digital Unveils RISC-V Controller Design [soylentnews.org]

        Western Digital today finally flashed the results of its vow to move a billion controller cores to RISC-V designs.

        Seems like China doesn't have an ARM problem yet (see Rockchip and others). Their biggest concern right now is SMIC getting the smackdown [wccftech.com].

        --
        [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
        • (Score: 0) by Anonymous Coward on Monday December 21 2020, @12:35AM

          by Anonymous Coward on Monday December 21 2020, @12:35AM (#1089747)

          China has made MIPS processors before in a bid for tech independence. So, maybe there is hope. There is also an obscure Russian x86 (without any IME equivalent), so possibly something will come out of there.

          I'd bet US sanctions will probably backfire in the long term. I wouldn't be surprised if, in 10-15 years, China has competitive (not leading) fabs and is selling its equipment to fabs around the world, while the US industry is on its way out. China has proven itself able to compete in every space where it has put forward effort.

    • (Score: 4, Insightful) by driverless on Sunday December 20 2020, @11:58PM (3 children)

      by driverless (4770) on Sunday December 20 2020, @11:58PM (#1089733)

      You will use x86, ARM, RISC-V or whatever as needed, or as someone else dictates (Apple will make bank from the switch to ARM). ARM will be able to repel RISC-V from being a main CPU replacement because the licensing costs aren't even that much and it has a gigantic share in many markets.

      This also assumes that the ARM market is totally static and immobile. In reality what will happen, on the off chance that RISC-V starts hitting ARM, is that ARM will cut special deals with licensees to keep them loyal. This is the problem when you're competing with an incumbent solely on price: on the remote chance that you start getting some traction, all they need to do is readjust their pricing, and then your sole advantage disappears.

      Given that RISC-V is currently competing in the tens-of-cents-in-unit-volume controller space, I can't see that they've even got much advantage in the price area.

    • (Score: 2) by theluggage on Monday December 21 2020, @02:11PM

      by theluggage (1797) on Monday December 21 2020, @02:11PM (#1089885)

      You will use x86, ARM, RISC-V or whatever as needed, or as someone else dictates (Apple will make bank from the switch to ARM).

      One of the success stories of the M1 has been how well x86 apps run via Rosetta 2 (maybe aided by M1 features designed to optimise translated code - I haven't seen that confirmed). Maybe Transmeta and the Itanium were just ahead of their times? Meanwhile, Apple also has the technology to allow developers to upload bytecode binaries to the App Store, which translates them for various CPUs on demand at download time (I think it's mainly for the different ARM variants found in iDevices rather than x86/ARM at the moment)... also, .NET is bytecode-based, as are Java and much of Android, "scripting/compile-on-run" languages are being used more and more and, even with lovingly-hand-crafted C/C++/Swift/whatever, CPU-agnostic coding is becoming more practical (...mainly as Win16/Win32 gradually fade away; *nix has always been founded on source-level compatibility).

      Hopefully, now that the dominant position of Windows-on-x86 has started to wobble, we'll finally start to evolve to the stage where the hardware instruction set is only relevant to the developers of OS kernels and language runtimes. At least for Apple, this is looking like the easiest of their CPU transitions (changes between MacOS 10.15 and 11 that affect both x86 and M1 Macs seem to be the bigger headache), so if they want to switch to RISC-V or even back to some sort of re-born x86 in the future (...or, more likely, gradually diverge from ARM) it shouldn't be such a traumatic move.

      For Windows, making the break from x86 to anything else is the challenge. DOS/Windows has been tied to the x86 family tree since before it existed (...starting with the choice of a CP/M-86 clone because of the ease of porting 8-bit 8080 CP/M code), since when Apple has successfully changed CPU 3-5* times and Microsoft has failed to change 3** times... NB: not being a fanboi here - it's down to different business models and customer bases, and if Apple had "won" in the 80s/90s/00s they'd be equally hamstrung by change-averse corporate customers and licensees. However, if MS/Windows doesn't find a way of changing, it is likely to continue to dwindle and/or MS will morph into a services company (...they've already lost a ton of market share, with iOS and Android ruling the mobile market and taking a huge bite out of the domestic PC market and, at the other end, Linux munching away at the server sector - then there's their massive U-turn in supporting Linux/cross-platform stuff).

      * 3 definitely, 5 arguably: 6502 to 68k, 68k to PPC, PPC to x86, x86-32 to x86-64, x86-64 to ARM/M1 (6502 counts because the Apple II was still relevant when the IBM PC launched, and DOS was a warmed-over 8-bit 8080 OS; x86-64 counts because Windows is still in transition, with extensive 32-bit support in Windows 10 and the 32-bit version still in production, whereas Apple deprecated 32-bit a couple of years back and completely dropped it this year).

      ** ...Counting Windows NT on MIPS/Alpha/SPARC/PPC etc. as 1, then Itanium (although those problems went beyond x86 support), then Windows RT for ARM. The jury is still out on Windows 10 for ARM... with the possible delicious irony that it could be "saved" by the M1 Mac (which should easily outsell the Surface Pro X, if it hasn't done so already - there's a ton of interest in running Win10 ARM on it, and there are already several proofs of concept).

  • (Score: 3, Interesting) by Anonymous Coward on Monday December 21 2020, @12:02AM (5 children)

    by Anonymous Coward on Monday December 21 2020, @12:02AM (#1089734)

    Prediction/Opinion:

    OMW will eventually make all the hardware they need for consumers to continue chugging along on their Windows (and/or whatever they offer in the future) machines while staving off the efforts of people who are trying to make a desktop Linux. They cannot, they will not, let this happen. They would love nothing better than for all Linux distros to standardize across the board and eventually become the plaything of one company - easier to buy, easier to squash.

    OMW (or a front comp) will also be behind the acquisition of Canonical. Debian will eventually fall, though it won't appear as if it happened on purpose, just like Ian Murdock's covered up murder and bullshit suicide ruling.

    OMW - they cannot allow anyone to have any fun anywhere, they want it all and they want it now.

    • (Score: 2) by takyon on Monday December 21 2020, @12:09AM (4 children)

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Monday December 21 2020, @12:09AM (#1089739) Journal

      Microsoft may be developing its own in-house ARM CPU designs [arstechnica.com]

      This afternoon, Bloomberg reported [bloomberg.com] that Microsoft is in the process of developing its own ARM CPU designs, following in the footsteps of Apple's M1 mobile CPU and Amazon's Graviton datacenter CPU.

      Bloomberg cites off-record conversations with Microsoft employees who didn't want to be named. These sources said that Microsoft is currently developing an ARM processor for data center use and exploring the possibility of another for its Surface line of mobile PCs.

      Bloomberg's sources paint the data center part as "more likely" and a Surface part as "possible." This seems plausible, given that Microsoft's chip design unit reports to the Azure cloud VP, with no direct reporting ties to the Surface division. Microsoft declined to comment on any specific plans, saying only that it "[continues] to invest in our own capabilities in areas like design, manufacturing and tools, while also fostering and strengthening partnerships with a wide range of chip providers."

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 0) by Anonymous Coward on Monday December 21 2020, @12:47AM (3 children)

        by Anonymous Coward on Monday December 21 2020, @12:47AM (#1089750)

        Indeed, and thanks for the article. I've been saying this for *years*. Why doesn't M$ just make their own computer hardware? Aside from mice or what not. Why not make their own hardware and try to keep Linux users dog paddling around in a circle by "scrambling the shield harmonics", in other words, changing the tech/design as the years pass to frustrate other Operating System installation/utilization.

        If everyone finally started to say, "FUCK YOU", "GET AWAY FROM ME" to M$ and shunned the company forever, it wouldn't be as powerful as it is.

        • (Score: -1, Offtopic) by Anonymous Coward on Monday December 21 2020, @12:52AM

          by Anonymous Coward on Monday December 21 2020, @12:52AM (#1089751)

          But that ain't gonna work nigga cuz you know they love it when you buy their Microshit toilet paper (Microshit cards) for their XBOX or what not they got you under their skin. They romance ya, they dance ya, they parade you about but as soon as the money stops, your date drops. There are plenty of other sweeties to romance, after all.

          Just watch the videos (and commercials) of people buying/receiving M$ Xbox products. They're squealing like a pig being fucked by a rake handle.

          Fuck that company microshit and fuck the losers who are brainwashed enough to work for a cult.

        • (Score: 1, Insightful) by Anonymous Coward on Monday December 21 2020, @07:56AM

          by Anonymous Coward on Monday December 21 2020, @07:56AM (#1089820)

          Because Microsoft isn't Apple, so whenever they try this it always fails.

          Apple can make their own CPU because Apple already makes their own hardware and doesn't care about compatibility. Apple switches CPUs every ten years anyway, they're actually overdue.

          Remember Windows RT? It's Windows, but with Apple style locked down crap. It was just like regular Windows, except you could only get software through the Windows Store, and you couldn't install any other OS, and you had even less control over updates than regular Windows. It was just plain worse, and nobody bought it.

          For Apple, this isn't about a better CPU architecture, because it isn't better. It's because Apple has a really good ARM license, which isn't quite as good as owning the whole architecture but it's pretty close. Apple really likes to have total control over all their technology (and everything else in their sight) and doesn't want to be beholden to other companies for anything. They also have a strong platform in phones, the only tablet that anyone still actually uses, and would really like it if all of their products ran the same operating system, if not all the same software period. All of this points to switching the Mac to ARM.

          Microsoft, on the other hand, gets nothing from building ARM computers except another DOA product to follow Windows RT, Windows Phone and Windows CE.

          In data centers, it's not really clear that ARM offers much advantage. If it does, Google and Amazon will benefit more. Google runs mostly internal apps, so if they want to switch, they can switch. Amazon has plenty of internal software, but they also have cloud customers; most of those, though, are using some kind of interpreted language, so if Amazon provides an interpreter, they can use ARM with a minimum of pain.

          Microsoft, on the other hand, has a cloud environment that is, by the standards of such things, extremely Windows oriented. The whole point of Windows is that you can bring all your baggage with you. If you have to make a clean break, you probably aren't keeping Windows at all.

          But all of this presupposes that there's some big advantage to ARM, but there isn't. Apple's M1 is a good CPU, but it's only really outstanding in power consumption - attributable partly to ARM's history of being low power first and high performance second, but mostly to Apple being on a more advanced manufacturing node than everyone else. This is a one time benefit from having a clean sheet design, and it won't last once AMD's Zen 4, widely expected to be on 5nm, comes out in early 2022. Who knows, maybe even Intel will get unstuck from 14nm* eventually!

            * Intel's process naming is different, so their 14 is everyone else's 10, and so on; if Intel gets 7nm out in 2021, it'll be comparable to everyone else's 5nm

        • (Score: 0) by Anonymous Coward on Monday December 21 2020, @10:59PM

          by Anonymous Coward on Monday December 21 2020, @10:59PM (#1090068)

          What do you think is in an Azure data center?

  • (Score: 1, Informative) by Anonymous Coward on Monday December 21 2020, @12:22AM

    by Anonymous Coward on Monday December 21 2020, @12:22AM (#1089744)

    Isn't this a sequel from the guy who talked gobbly-gook about how the M1 is so fast?

    If part 1 is all gobbly-gook, why bother with part 2 premised on the same gobbly-gook?

  • (Score: 3, Insightful) by shortscreen on Monday December 21 2020, @01:13AM (3 children)

    by shortscreen (2252) on Monday December 21 2020, @01:13AM (#1089754) Journal

    The article spends a lot of words talking up special-purpose silicon and RISC-V's customizable instruction set, but then at the end says that complex instructions giving higher performance is a misconception. They say RISC-V is great because it takes fewer transistors than ARM, but they also say that having a pile of general-purpose CPU cores is not the best use of silicon. If there is an overarching design philosophy here, I'm not sure what it is supposed to be. Is a small CPU core glued to a DSP coprocessor really that different from a larger singular CPU core with built-in DSP instructions?

    With transistor budgets being what they are now, I'd think that throwing in everything you can think of is basically the only viable strategy. The old RISC idea of simple but fast is dead because of clock speed limits.

    • (Score: 5, Insightful) by JoeMerchant on Monday December 21 2020, @01:36AM

      by JoeMerchant (3937) on Monday December 21 2020, @01:36AM (#1089761)

      I think the whole RISC vs CISC debate is pointless. RISC taken too far is outgunned by specialized designs for specific applications. CISC taken too far is often a waste of idle silicon. Whatever architecture you have will be optimal for some subset of tasks and sub-optimal for a (usually much) larger subset of tasks.

      Whatever benchmarks are developed are imperfect approximations of actual workloads, and they usually become worse and worse approximations of "today's" actual workloads as time goes by.

      The real questions for me are: what kind of support can you get under gcc/Debian? What's the system power draw? (aka: does it need a fan?)

      I've got a PiZero that does nothing but listen for HTTP messages from another system and play sound clips in response. That other system is an Intel NUC running ZoneMinder, sending those HTTP messages when motion is detected in IP cam feeds. That same Intel NUC drives our home theater system, so the PiZero serves to make the audio notices available 24-7 with no configuration complications related to whatever audio/video thing we're doing at the time. Point being: these are my applications, and while one processor could serve them all, having a specialized camera-announcer in the system makes the UX much simpler. Other applications will have other needs which could be addressed in literally hundreds of ways, but the optimal solutions will vary depending on the particular user's situation.

      M1 gets blazing benchmarks for some Apple users, maybe a lot of them. Kudos. Most Apple users I know only really need Facebook and a GPS mapping app that works better than the one they've got.

      --
      🌻🌻 [google.com]
    • (Score: 1, Insightful) by Anonymous Coward on Monday December 21 2020, @04:01AM (1 child)

      by Anonymous Coward on Monday December 21 2020, @04:01AM (#1089782)
      The summary is that it's no longer as simple as "simple instructions". What you want is for things to be easy to speed up. So it's fine for an instruction to be very complex if you can easily split it into many micro-ops, run those on many execution units, and reliably execute everything faster. And hopefully that instruction is popular enough to be worth it.

      The main advantage of ARM over x86 is that the instructions are all the same length. Practically everything else is easily hidden "under the carpet" of current transistor budgets. When instructions are all the same length, it's easier for the decoders to speculatively split many instructions ahead into micro-ops.
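
      As a rough illustration of that last point, here is a toy C sketch (the byte stream and the length-in-the-low-bits rule are invented, not any real ISA): with a fixed-length encoding every decoder knows its start offset immediately, while with a variable-length encoding each instruction boundary depends on the one before it.

      /* Toy illustration, not a real ISA: why fixed-length instructions make
       * wide decode easy. With a fixed 4-byte encoding, the byte offset of
       * instruction i is simply 4*i, so several decoders can start at once.
       * With a variable-length encoding, the offset of instruction i depends
       * on the lengths of all earlier instructions, which must be found first. */
      #include <stdio.h>
      #include <stddef.h>

      /* Hypothetical rule: the low two bits of the first byte give the length (1..4). */
      static size_t insn_length(unsigned char first_byte) {
          return (size_t)(first_byte & 0x3) + 1;
      }

      int main(void) {
          unsigned char stream[] = { 0x03, 1, 2, 3,   /* 4-byte instruction */
                                     0x00,            /* 1-byte instruction */
                                     0x01, 9,         /* 2-byte instruction */
                                     0x02, 7, 7 };    /* 3-byte instruction */
          size_t n = sizeof stream;

          /* Fixed-length case: every decoder knows its start offset up front. */
          printf("fixed 4-byte encoding: instruction i starts at offset 4*i\n");

          /* Variable-length case: boundaries must be discovered one after another
           * (or guessed speculatively, which is what wide x86 decoders do). */
          printf("variable-length encoding:\n");
          size_t off = 0;
          for (int i = 0; off < n; i++) {
              size_t len = insn_length(stream[off]);
              printf("  instruction %d starts at offset %zu (length %zu)\n", i, off, len);
              off += len;   /* only now do we know where the next one starts */
          }
          return 0;
      }

      Real decoders do this in hardware, of course; the point is just that the fixed-stride case has no serial dependency.
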
      • (Score: 1, Interesting) by Anonymous Coward on Monday December 21 2020, @07:02AM

        by Anonymous Coward on Monday December 21 2020, @07:02AM (#1089819)
        So one way to use RISC is to think of it as a CISC processor with reprogrammable microcode? I've heard of some people writing an interpreter for a C or FORTRAN-focused instruction set that fits entirely within the L1 cache of a RISC processor that has been heavily optimised to exploit instruction-level parallelism and all the execution units available as much as possible. I wonder if that approach would be superior to compiling the code directly to raw RISC instructions, given that cache misses tend to be rather expensive even on modern CPUs. The overhead of decoding these specialised instructions might even be less than the cost of a cache miss.
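
        A minimal sketch of that interpreter idea, assuming a made-up stack-machine instruction set (the opcodes here are invented for illustration; a serious attempt would also keep the dispatch loop and its tables small enough to stay resident in L1):

        /* Minimal sketch of the "interpreter in L1" idea: a tiny dispatch loop
         * for a made-up stack-machine instruction set. A serious attempt would
         * keep this loop, its opcode handlers and the hot data small enough to
         * stay resident in L1 cache. */
        #include <stdio.h>

        enum { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

        static void run(const int *code) {
            int stack[64];
            int sp = 0;                                       /* next free stack slot */
            for (int pc = 0; ; ) {
                switch (code[pc++]) {
                case OP_PUSH:  stack[sp++] = code[pc++];         break;
                case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
                case OP_MUL:   sp--; stack[sp - 1] *= stack[sp]; break;
                case OP_PRINT: printf("%d\n", stack[sp - 1]);    break;
                case OP_HALT:  return;
                }
            }
        }

        int main(void) {
            /* Compute (2 + 3) * 7 and print it. */
            const int program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD,
                                    OP_PUSH, 7, OP_MUL, OP_PRINT, OP_HALT };
            run(program);
            return 0;
        }
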
  • (Score: 0) by Anonymous Coward on Monday December 21 2020, @01:44AM (3 children)

    by Anonymous Coward on Monday December 21 2020, @01:44AM (#1089763)

    M$ tentacles are everywhere. They also have former employees in all sorts of fields. It reminds me of Scientology, actually. A strange combo of a mafia and Scientology wrapped into one.

    Merry Christmas – don’t buy M$!

    • (Score: 3, Funny) by Anonymous Coward on Monday December 21 2020, @01:51AM (1 child)

      by Anonymous Coward on Monday December 21 2020, @01:51AM (#1089765)

      If you want a vision of the future of computing, imagine teledildonics with a petaflops level of performance ramming into a human ass - forever.

      • (Score: 3, Informative) by Subsentient on Monday December 21 2020, @10:06AM

        by Subsentient (1111) on Monday December 21 2020, @10:06AM (#1089845) Homepage Journal

        Yeah that sounds about right. When staring into the sacred peanut butter, the deep gerbil magic has revealed as much to me.

        --
        "It is no measure of health to be well adjusted to a profoundly sick society." -Jiddu Krishnamurti
    • (Score: -1, Troll) by Anonymous Coward on Monday December 21 2020, @02:26AM

      by Anonymous Coward on Monday December 21 2020, @02:26AM (#1089768)

      [...] A strange combo of a mafia and Scientology wrapped into one.

      You must be referring to the City of Yakima government: https://www.yakimawa.gov/ [yakimawa.gov]

      They might even have certain vaccines for cheap. Tell 'em The Big Goombah sent 'ya.

  • (Score: 5, Interesting) by bzipitidoo on Monday December 21 2020, @02:30AM (8 children)

    by bzipitidoo (4388) on Monday December 21 2020, @02:30AM (#1089770) Journal

    I find it strange that there is this much room for improvement. Like, hardware decoding of MPEG-4 has been around for, what, at least a decade now? And smartphone cameras that detect faces, how do they do it? They don't have the raw computing power to do it with general-purpose circuitry, so it has to be special-purpose.

    As for offloading work from the main CPU to other processors, computers have been moving in that direction for decades. We have advanced tremendously from the days of the Apple II computer in which the CPU did everything, and I do mean everything-- ran the floppy disk drive, the speaker, the graphics, and the keyboard and joystick. The floppy drive controller on the Apple II was incredibly primitive, with the main CPU doing the low level work of pulsing the stepper motor at the correct times in order to move the arm, and reading the individual bits off the floppy until it had discerned the word boundaries, then 10 bits at a time and translating that to 8 bit bytes. The timing had to be just right, and what the machine did was execute timing loops. The audio was much the same: all the CPU could do was click the speaker. If done at a high enough frequency, what came out was a musical note. That's why the sound was often extremely grainy. Clever games would interweave graphics updates with speaker clicks.

    By the start of the 1990s, hard drive and floppy drive controllers had brains of their own, we had sound cards with dedicated processors, and graphics cards with their own memory had become standard. The 486 at last killed the practice of splitting floating-point math off to an optional hardware component, a "coprocessor", that often wasn't present, giving another big boost in speed (like, 50x) for tasks that used floating point. To be sure, there were regressions such as the Winmodem and the shared memory of integrated graphics, but overall, the trend was definitely towards offloading as much work as possible.
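
    For a sense of the timing-loop arithmetic behind that speaker trick, here is a small C sketch (the clock rate and note frequencies are illustrative; on the real machine the delay was a counted 6502 loop between accesses to the speaker soft switch):

    /* Back-of-the-envelope sketch of the timing-loop tone trick: on a ~1 MHz
     * 6502, toggling ("clicking") the speaker once per half-period of the
     * desired note produces a square wave at that pitch, and the CPU burns the
     * rest of each half-period in a counted delay loop - which is why it could
     * do little else while sound played. Clock rate and notes are illustrative. */
    #include <stdio.h>

    int main(void) {
        const double cpu_hz = 1023000.0;   /* roughly the Apple II 6502 clock */
        const struct { const char *name; double freq; } notes[] = {
            { "A4", 440.0 }, { "C5", 523.25 }, { "A5", 880.0 },
        };

        for (size_t i = 0; i < sizeof notes / sizeof notes[0]; i++) {
            /* One speaker toggle per half-period of the square wave. */
            double cycles_between_clicks = cpu_hz / (2.0 * notes[i].freq);
            printf("%s (%6.2f Hz): toggle the speaker every ~%.0f CPU cycles\n",
                   notes[i].name, notes[i].freq, cycles_between_clicks);
        }
        return 0;
    }
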

    • (Score: 2) by JoeMerchant on Monday December 21 2020, @03:06AM (3 children)

      by JoeMerchant (3937) on Monday December 21 2020, @03:06AM (#1089774)

      the Apple II was incredibly primitive, with the main CPU doing the low level work of pulsing the stepper motor at the correct times in order to move the arm, and reading the individual bits off the floppy until it had discerned the word boundaries, then 10 bits at a time and translating that to 8 bit bytes.

      By comparison, the contemporary Atari 400/800 system had a floppy drive system that was actually compliant with FCC rules at the time they designed it (not using the future part C exceptions). As such, the interface between the 88K capacity disk reader and the main computer was a painfully slow, shielded serial cable. The drives themselves cost something like 400 to 500 dollars at a time when the keyboard/CPU was selling for 700 to 800, and I suspect they had their own onboard processors to handle the drive interface, not to mention additional shielding resembling today's microwave ovens, leaving the main CPU with little to do but sit and wait for data to arrive on the serial interface. Good times.

      --
      🌻🌻 [google.com]
      • (Score: 2) by bzipitidoo on Monday December 21 2020, @10:00AM (2 children)

        by bzipitidoo (4388) on Monday December 21 2020, @10:00AM (#1089844) Journal

        Yeah, the Apple II sure was noisy with the RFI. Screwed up TV reception from over 100 feet away.

        Stock Apple DOS took about 45 seconds to boot, because of some more unpicked low hanging fruit. The stock DOS was doing double buffering, and that took just enough time that the disk would spin just past the start of the next sector, and have to go all the way around. 15 revolutions to read 16 sectors, the worst possible. Didn't take long for the appearance of a bunch of 3rd party DOSes that fixed that glaring inefficiency. Those could read all 16 sectors of a track in one revolution of the disk, and boot in about 5 seconds. They all still missed one more optimization, that of staggering the start of the next track so that it rotated under the head at just the right moment for the drive to immediately resume reading. Instead most were a little late, and had to wait nearly an entire revolution for the start to come back around. Fairly trivial next to the big optimization, but still, shaves off a few more tenths of a second.

        Early computers were full of awful programming of that sort. I often hacked into games so I could optimize particularly slow parts. For example, Moebius from Origin Systems (maker of the Ultima series of games) repeated the mistake of stock Apple DOS and was exceedingly slow at disk access. Some sector interleaving cut disk access times during combat from 20 seconds to 5 seconds.
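
        To make the interleave win concrete, here is a little C simulation (the one-sector-time processing delay and the 2:1 layout are illustrative, not the exact stock DOS or third-party figures): with no interleave every sector costs nearly a full revolution, while a matched interleave reads the track in a couple of revolutions.

        /* Tiny model of the interleave effect: a 16-sector track where, after
         * reading a sector, the software needs one extra "sector time" of
         * processing before it can start the next read. With logical sectors in
         * physical order the next one has already flown past the head; with a
         * 2:1 interleave it arrives just as we're ready. Figures are illustrative. */
        #include <stdio.h>

        #define SECTORS 16

        /* Time, in sector times, to read logical sectors 0..15 in order, given
         * each logical sector's physical slot and a fixed processing delay. */
        static int track_read_time(const int phys[SECTORS], int delay) {
            int time = 0;
            int head = phys[0];                      /* start at logical sector 0 */
            for (int l = 0; l < SECTORS; l++) {
                time += 1;                           /* the read itself */
                head = (head + 1) % SECTORS;
                if (l == SECTORS - 1) break;
                time += delay;                       /* decode/copy before next read */
                head = (head + delay) % SECTORS;
                int wait = (phys[l + 1] - head + SECTORS) % SECTORS;
                time += wait;                        /* wait for the sector to come around */
                head = phys[l + 1];
            }
            return time;
        }

        int main(void) {
            int straight[SECTORS], interleaved[SECTORS];
            for (int l = 0; l < SECTORS; l++) {
                straight[l] = l;                     /* logical l at physical slot l */
                interleaved[l] = (l * 2) % SECTORS + (l * 2) / SECTORS;  /* 2:1 layout */
            }
            printf("no interleave:  %.1f revolutions per track\n",
                   track_read_time(straight, 1) / (double)SECTORS);
            printf("2:1 interleave: %.1f revolutions per track\n",
                   track_read_time(interleaved, 1) / (double)SECTORS);
            return 0;
        }

        With these toy numbers the straight layout needs 16 revolutions per track and the 2:1 layout needs 2, which is the shape of the speedup described above.
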

        • (Score: 0) by Anonymous Coward on Monday December 21 2020, @04:34PM

          by Anonymous Coward on Monday December 21 2020, @04:34PM (#1089925)

          The RF modulator was noisy, not the rest of it. That's why Apple not only sold it separately, but farmed it out to a different company. This had the effect of encouraging people to buy Apple-branded monitors rather than using TVs... which in turn had better display quality. It was a win for everyone (except the wallets of the Apple owners, but when has Apple ever been good there?)

          The disk drive was not "primitive," it was low parts count. Apple's 5.25" drive was half the size of Commodore's, didn't need external power, and performed about a million times better.

          I'm not even sure what you're thinking of with the disk sector thing. That just isn't right. The only thing I can think of is that when Apple introduced 3.5" disks, the one for the //gs had a different interleave than the one for the //e and //c, because the //gs version was CPU driven, like the 5.25", but the one for the 8 bit computers had a coprocessor. Because of that, while the two systems could read each other's disks, performance was terrible on the wrong kind of drive. In practice this didn't matter very much because the two systems usually ran different software, and the performance penalty went away if you copied the disk.

        • (Score: 2) by JoeMerchant on Monday December 21 2020, @07:57PM

          by JoeMerchant (3937) on Monday December 21 2020, @07:57PM (#1090002)

          When I was 16 I optimized an early BBS program written in BASIC: I identified a key subroutine that I could recode in assembler, and with some general cleanup got about a 50x speedup in normal performance... low hanging fruit indeed.

          --
          🌻🌻 [google.com]
    • (Score: 0) by Anonymous Coward on Monday December 21 2020, @06:26AM

      by Anonymous Coward on Monday December 21 2020, @06:26AM (#1089815)
      In contrast, the Commodore 64's 1541 disk drive had an embedded 6502 inside it that was almost as powerful as the CPU of the Commodore 64 itself. The C64 also put a lot of its functions into specialised hardware, like the SID to do sound, the VIC II for graphics, and so on.
    • (Score: 3, Interesting) by Rich on Monday December 21 2020, @09:29AM (2 children)

      by Rich (945) on Monday December 21 2020, @09:29AM (#1089838) Journal

      OCD nerd nitpick:

      and reading the individual bits off the floppy until it had discerned the word boundaries, then 10 bits at a time and translating that to 8 bit bytes.

      It would sync on the 10-bit auto-sync bytes (8 bits would be read, 2 bits would slide) to discern the word boundaries, and then 8 bits at a time were read, translated into 342 6-bit nibbles (DOS 3.3 and later), and later assembled into 256 bytes. The assembly phase is important, because with the DOS RWTS the floppy needed interleaved sectors for that task, whereas later reading routines (Pascal and others) could decode in one sweep without interleave.
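
      For anyone wondering where 342 comes from, a quick back-of-the-envelope C sketch (this only does the counting; it ignores the real bit ordering, the checksum nibble, and the translation from 6-bit values to valid disk bytes):

      /* Where the 342 comes from in the DOS 3.3 "6-and-2" scheme: each valid
       * disk nibble carries 6 payload bits, so a 256-byte sector (2048 bits)
       * needs 342 of them - 256 nibbles for each byte's top 6 bits plus 86
       * more holding the leftover 2-bit pairs, three pairs per nibble. */
      #include <stdio.h>

      int main(void) {
          const int sector_bytes = 256;
          const int bits_per_nibble = 6;

          int total_bits = sector_bytes * 8;                       /* 2048 */
          int minimum_nibbles = (total_bits + bits_per_nibble - 1)
                                / bits_per_nibble;                 /* ceil -> 342 */

          int high_six = sector_bytes;                             /* 256 nibbles */
          int pair_nibbles = (sector_bytes + 2) / 3;               /* 86 nibbles */

          printf("bits per sector:        %d\n", total_bits);
          printf("minimum 6-bit nibbles:  %d\n", minimum_nibbles);
          printf("6-and-2 layout:         %d + %d = %d nibbles\n",
                 high_six, pair_nibbles, high_six + pair_nibbles);
          return 0;
      }
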

      The Commodore GCR assembled 8 from 10 bits, IIRC.

      On the topic of the article, I think that any accelerators of meaningful complexity will require a software ecosystem that only the very biggest of players are able to pull off. And it will work as an "appliance" thing for them, but not for general computing. But maybe the R-V crowd could define something which works as a "shader" or "neuron", that might slowly build such an ecosystem at broader-than-appliance scope - IF someone makes the hardware at non-ridiculous prices.

      • (Score: 2) by bzipitidoo on Monday December 21 2020, @05:04PM (1 child)

        by bzipitidoo (4388) on Monday December 21 2020, @05:04PM (#1089934) Journal

        My memories of the workings of the Apple II floppy disk aren't perfect, nor did I fully grok all the details, just what I needed to crack the copy protection schemes, which made abundant use of the low level control to create their own incompatible schemes that were designed to be difficult to copy, of course. I could be wrong but I do recall it was 10 bits on the floppy to 8 bits in memory. Yes, version 3.3. Apple DOS 3.2 used a different, less efficient coding scheme that could fit only 13 sectors on a track, maybe you've misremembered that? The aftermarket DOSes I guess did the conversion to bytes on the fly.

        For aligning on the start of a track and I should guess the byte boundaries too, the 5.25" PC floppy drive sensed a physical hole near the inner ring of the floppy. That's a classic technique too: cutting down on the computational work by relying on the physical. For the same purpose, the 3.5" drive used a different physical method, of providing a slot for a cog offset from the center. That's the source of that classic slide and ka-chunk noise that 3.5" floppy drives make when a floppy is inserted.

        One of the differences in the use of 3.5" floppies between Mac and PC was sort of philosophical. The PC floppy had a button that mechanically ejected the disk. The Mac did not, instead relying on software control. The PC's method put the user more in control. Floppies very rarely got stuck. The downside was you could lose data if you insisted on ejecting the disk while the drive was using it. But the trust in users was well founded. I never lost any data from being too hasty and ejecting a hair too soon. I had more trouble from careless, numbnuts acquaintances who didn't insert the 5.25" floppy quite straight and level enough, forcing the disk in anyway and jamming the drive, which put a crease in the medium and ruined it. Lucky that didn't also ruin the drive, I suppose. The powered tray is one of the things I disliked the most about the CD-ROM. Always kept a paperclip handy, to manually eject CDs.

        • (Score: 2) by Rich on Tuesday December 22 2020, @06:41PM

          by Rich (945) on Tuesday December 22 2020, @06:41PM (#1090333) Journal

          Sorry to everyone else for abusing "The Future of Computing?" for "The Past of Computing!". But then TFA wasn't of a quality that warrants a focused discourse ;)

          Pre-DOS 3.3 used a 5 out of 8 encoding. You needed to change PROMs on the interface card (maybe even the FSM-PROMs?!) for upgrading to DOS 3.3. Assuming a Disk II card in slot 6, you read $C0EC and get an 8-bit "nibble", high bit always set because reading starts at a flux transition which reads as "1". Flux is NRZI, hence the 10-bit autosync works, and you can't have more than two consecutive zeroes (which was one for pre-3.3). For data nibbles there is also the additional requirement of two consecutive data bits (D5 and AA are out and can be used as sector markers).

          The Disk II genesis is like an alien technology drop: Woz figuring out that he could drop all the analog stuff of the SA400 drive, conceiving a digital Finite State Machine that could decode it (probably only 4 payload-bits per 8-bit nibble in the beginning), Al Shugart trying to screw them over by supplying defective drives, Woz still figuring that out and getting it to work. The idea to be able to pack 5 bits into the nibble. Stuffing the boot loader into 256 bytes (building the nibble table, starting the drive, recalibrating the head, scanning for sector 0, decoding sector 0, and jumping to the decoded data, all in position-independent code). Later figuring out they could do two zeroes (maybe inspired by what Commodore was doing then with the 2040?). Meanwhile Woz also wrote floating point code (also black magic at that time) that was SO close to be integrated into his BASIC, which would have meant "Micro-Soft" would have been out, we would have gotten source for the BASIC, and computing might have taken another course.

          But let me close the circle and get back on topic: These "accelerators" mentioned in TFA are something that requires similar milestone advances in software, not necessarily with inventive genius, but more with handling their complexity. I've written firmware for a standalone Disk II compatible drive connected to UNIX boxen, I have reversed a few nasty protection schemes, I've written a fast 3D renderer with texture mapping in software, I could completely follow what the guys who subverted the new "Game&Watch" did a few weeks ago - and yet I am completely baffled when the crowd from the MESA 3D world presents a new "reverse engineered" graphics driver for some new GPU or SoC core, which is roughly equivalent to the proposed "accelerators" in complexity. That also looks to me like some technology drop - but maybe not by aliens. I don't know. Maybe IDA Pro is that good, I haven't worked with it, but watching registers...? And then, after the magic discoveries and the initial glxgears demo, the progress always stalls in some way. So figuring out what the workflow really looks like here will give us an idea of how well the new breed of coprocessors can be supported.

  • (Score: 2, Insightful) by Anonymous Coward on Monday December 21 2020, @05:45AM (1 child)

    by Anonymous Coward on Monday December 21 2020, @05:45AM (#1089810)

    We've gone through multiple cycles of centralising and decentralising functions.

    It's happened in PCs multiple times, even, including questions of whether to do RAID in hardware or software, where to put video intelligence, and networking, and ...

    It's happened in mainframes, and minis, and really the only new thing here is scale and integration in combination. Holy shit, if you put lots of parts on the same die, they can talk really quickly! Who saw that coming? Only everybody who'd even looked at the state of things for decades. I wrote something pretty much like this back in the '90s.

    What fanbois are falling all over each other to miss are the implications for flexibility. But hey, sure, system-on-chip sure is convenient! Until it's not.

    • (Score: 2) by maxwell demon on Monday December 21 2020, @06:39AM

      by maxwell demon (1608) on Monday December 21 2020, @06:39AM (#1089817) Journal

      Indeed, the x86 FPU started out as a coprocessor.

      --
      The Tao of math: The numbers you can count are not the real numbers.
  • (Score: 3, Interesting) by Anonymous Coward on Monday December 21 2020, @06:18AM

    by Anonymous Coward on Monday December 21 2020, @06:18AM (#1089813)
    Out of order execution is heavily used in modern processors already and led to several notable and difficult to mitigate security threats, such as Spectre and Meltdown. I wonder if there will also be a raft of similarly messy security vulnerabilities waiting to be exploited in the M1.
  • (Score: 2) by inertnet on Monday December 21 2020, @08:58AM (1 child)

    by inertnet (4071) on Monday December 21 2020, @08:58AM (#1089830) Journal

    It's still Apple, so I don't care what they make.

    • (Score: 0) by Anonymous Coward on Monday December 21 2020, @09:53AM

      by Anonymous Coward on Monday December 21 2020, @09:53AM (#1089842)

      You will be forced to care. The whole industry is going to copy their homework.

  • (Score: 0) by Anonymous Coward on Monday December 21 2020, @06:42PM (1 child)

    by Anonymous Coward on Monday December 21 2020, @06:42PM (#1089969)

    They all either get integrated into the CPU or they turn out to be useless.

    Math coprocessors started out separate but then were integrated into the CPU.
    GPUs are still separate if you need maximum performance, but mostly they are now integrated into the CPU.
    Remember PhysX? Useless. It turned into a software library, and I haven't heard about it in years.
    Remember Aureal A3D audio? Completely dead. Same with wavetable synth. Now there's some Dolby 3D thing but it was never anything but software.
    Modems turned into a thin hardware interface to the phone line and all their signal processing turned into software. Then the whole technology vanished.
    Even most of the old southbridge functions are moving onto the CPU - things like USB, disk and network controllers.

    If everybody needs a feature, it gets moved onto the CPU. If only some people need a feature, it will be done in software if it possibly can. This is only more true now that CPUs have so many cores.

    • (Score: 0) by Anonymous Coward on Monday December 21 2020, @07:28PM

      by Anonymous Coward on Monday December 21 2020, @07:28PM (#1089994)

      Modems turned into a thin hardware interface to the phone line and all their signal processing turned into software.

      No they didn't. That was a WinModem, and it sucked. Unless you're talking about firmware on the modem itself; but modems that used the main CPU to do signal processing were notorious.
