SoylentNews is people

posted by janrinok on Friday September 27, @09:12PM
from the amazing-but-why-did-you-do-it dept.

Linux boots in 4.76 days on the Intel 4004

Historic 4-bit microprocessor from 1971 can execute Linux commands over days or weeks.

Hardware hacker Dmitry Grinberg recently achieved what might sound impossible: booting Linux on the Intel 4004, the world's first commercial microprocessor. With just 2,300 transistors and an original clock speed of 740 kHz, the 1971 CPU is incredibly primitive by modern standards. And it's slow—it takes about 4.76 days for the Linux kernel to boot.

Initially designed for a Japanese calculator called the Busicom 141-PF, the 4-bit 4004 found limited use in commercial products of the 1970s [...]

[...] If you're skeptical that this feat is possible with a raw 4004, you're right: The 4004 itself is far too limited to run Linux directly. Instead, Grinberg created a solution that is equally impressive: an emulator that runs on the 4004 and emulates a MIPS R3000 processor—the architecture used in the DECstation 2100 workstation that Linux was originally ported to.

If it can run a C compiler, it can probably run DOOM.

  • (Score: 0) by Anonymous Coward on Friday September 27, @10:01PM

    by Anonymous Coward on Friday September 27, @10:01PM (#1374848)

    That's pretty cool. Now how about netbsd. :)

  • (Score: 5, Informative) by pTamok on Friday September 27, @10:30PM (8 children)

    by pTamok (3042) on Friday September 27, @10:30PM (#1374854)

    It's a technical tour-de-force. I strongly recommend reading Dmitry's article.

    > I skipped dealing with RAM and just assumed that the "current" MIPS instruction will be in r8:r9:r10:r11:r12:r13:r14:r15 registers, MSB-to-LSB. Yup...half of the registers are used just to hold the instruction.

    > In some cases, signed 32-bit division can take up to 80,000 instruction cycles thanks to needing to operate on only a nibble at a time and ISA design of the 4004.
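
    To make "a nibble at a time" concrete, here is a minimal Python sketch (not code from Dmitry's article; the function names are made up for illustration). It splits a 32-bit word into eight nibbles, most significant first, the same order as the r8..r15 layout quoted above, and adds two words one nibble at a time with a carry.

```python
def to_nibbles(word):
    """Split a 32-bit value into 8 nibbles, most significant first."""
    return [(word >> shift) & 0xF for shift in range(28, -1, -4)]


def from_nibbles(nibbles):
    """Rebuild an integer from a list of nibbles, most significant first."""
    value = 0
    for nib in nibbles:
        value = (value << 4) | nib
    return value


def add32_by_nibbles(a, b):
    """Add two 32-bit values 4 bits at a time, propagating the carry.

    Returns (sum modulo 2**32, carry out, number of nibble additions).
    """
    result = []
    carry = 0
    steps = 0
    # Start from the least significant nibble, as a 4-bit ALU would.
    for x, y in zip(reversed(to_nibbles(a)), reversed(to_nibbles(b))):
        total = x + y + carry
        result.append(total % 16)
        carry = total // 16
        steps += 1
    result.reverse()
    return from_nibbles(result), carry, steps
```

    Even a plain 32-bit add needs eight dependent steps in this model, which gives a feel for why a division built from shifts, compares and subtractions could run into tens of thousands of cycles.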

    > I have never before seen a CPU that lacked ability to do basic logical operations, until I saw the 4004 manual. The 4004 lacks ability to do any of them. There is no logical AND, no logical OR, and no XOR.
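
    One way to get by without logical instructions is to rebuild them from arithmetic. The Python sketch below is only an illustration of that idea, not the routine Dmitry used; `nibble_logic` is a hypothetical name. It computes AND, OR and XOR of two 4-bit values using only addition, comparison, halving and doubling.

```python
def nibble_logic(a, b):
    """Return (AND, OR, XOR) of two 4-bit values using only arithmetic.

    No &, | or ^ operators are used: each bit pair is added, and the
    sum (0, 1 or 2) decides what every logical result should contain.
    """
    and_result, or_result, xor_result = 0, 0, 0
    weight = 1                      # value of the bit position being examined
    for _ in range(4):
        bit_sum = (a % 2) + (b % 2)
        if bit_sum == 2:            # both bits set
            and_result += weight
        if bit_sum >= 1:            # at least one bit set
            or_result += weight
        if bit_sum == 1:            # exactly one bit set
            xor_result += weight
        a //= 2                     # move on to the next bit
        b //= 2
        weight *= 2
    return and_result, or_result, xor_result
```

    For example, `nibble_logic(0b1100, 0b1010)` gives `(0b1000, 0b1110, 0b0110)`. A precomputed lookup table would be another option, at the cost of ROM space.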

    > I then went on to write what I believe to be the world's smallest SD card driver in existence. It fit into 190 bytes and would successfully init a card, get its size, and allow sector read and write. I also tried this driver on my dev board from earlier, connected to a real SD card, and found that it worked! Woo hoo!

    > Sadly, adding more RAM slowed the boot down. Linux creates data structures at boot that track physical pages and with more pages, more of them had to be created. Some kernel data structures are also dynamically sized based on available memory, and that also suffered. At this point in time with 16MB of RAM, boot time was projected to be 7.19 days. Back over a week! Womp!

    > ...added a small additional speed benefit: 4.76 days to boot! This calculates out to being around a 70Hz MIPS machine if the 4004 is run at 740KHz.

    > My testing shows that the actual emulated speed of the MIPS guest is around 70Hz at 740KHz. I run the 4004 at 790KHz, so the emulated guest is thus operating at about 74.73Hz. So time for the guest is dilated by 14,030x. This means that a virtual second is, in real life, around 3h54. Four hours per virtual second, basically!
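
    The arithmetic in that last quote is easy to check. This Python sketch uses only the numbers quoted above (70 Hz at 740 kHz, a 790 kHz clock, a dilation factor of 14,030); the constant and function names are made up, and linear scaling of guest speed with host clock is an assumption.

```python
GUEST_HZ_AT_REFERENCE = 70.0   # emulated MIPS speed quoted for a 740 kHz 4004
REFERENCE_KHZ = 740.0
ACTUAL_KHZ = 790.0             # clock the 4004 is actually run at
DILATION = 14030               # real seconds per guest second, as quoted


def scaled_guest_hz(host_khz):
    """Guest speed, assuming it scales linearly with the host clock."""
    return GUEST_HZ_AT_REFERENCE * host_khz / REFERENCE_KHZ


def split_hms(seconds):
    """Break a duration in seconds into (hours, minutes, seconds)."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return hours, minutes, secs


print(round(scaled_guest_hz(ACTUAL_KHZ), 2))   # 74.73
print(split_hms(DILATION))                     # (3, 53, 50)
```

    So one guest second costs 3 h 53 min 50 s of wall-clock time, which matches the "around 3h54" in the article.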

    I too have had the problem of moving lots of image files via ifuse. Nemo didn't crash, but lied about progress.

    I'm really, really glad people do projects like this.

    • (Score: 5, Informative) by pTamok on Friday September 27, @10:55PM (1 child)

      by pTamok (3042) on Friday September 27, @10:55PM (#1374856)

      I first encountered Dmitry's work with his price tag hacking. [dmitry.gr].

      He's talented. And persistent.

    • (Score: 5, Informative) by VLM on Saturday September 28, @12:29AM (5 children)

      by VLM (445) on Saturday September 28, @12:29AM (#1374867)

      > In some cases, signed 32-bit division can take up to 80,000 instruction cycles

      Some Motorola chips (6809, etc.) came very close to one or two clock cycles per instruction, but other CPU families needed an incredibly large number of clock cycles per instruction.

      I remember a complete WTF moment with the venerable 8051: I wanted to delay a couple of clock cycles to abuse a poor I2C device that was the "slow kind" (long story), and even a single NOP instruction on a classic 8051 took 12 clock cycles.

      Old timers might remember CPUs where NOP took 4, or even 3, clock cycles. I don't remember which processor took 3 cycles.

      This used to be a classic "Mess with the students" lab assignment back when they taught assembly instead of higher level languages. Ask the kids to make a square wave and rig the question so they'll add about 24 NOPs assuming NOP executes in one clock cycle, but on shitty architectures NOPs take QUITE a few clock cycles, so they hook up a scope or freq counter and WTF, I'm generating a square wave at a quarter or an eighth or a twelfth of whatever they expected based upon the number of NOPs. I saw that trick coming from a mile away because I messed around with home computer assembly language in the 80s so I already knew what was going to happen. Was still funny to see.

      Another "hilarious" EE lab was making the kids generate very weird time delays by tricking them into using obscure instructions. So if a NOP takes 4 clock cycles on a Z80, then you "can't" make a ten clock cycle time delay on a Z80. Well hell no, not with that attitude you can't. After the kids had a temper tantrum the TA would point out that actually you can, you just insert "junk" instructions like a single NOP takes 4 cycles and IIRC a LD SP,HL takes 6 cycles so if you don't give a F about SP or HL then you can indeed generate an "impossible" precision ten clock cycle time delay on a Z80.
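
      The "impossible" delay is really a small coin-change problem. Here is a minimal Python sketch of it, using only the two cycle counts mentioned above (NOP = 4, LD SP,HL = 6); `find_delay` and the table name are made up for illustration, and real timings also depend on wait states.

```python
# Clock cycles per instruction, taken from the comment above.
CYCLES = {"NOP": 4, "LD SP,HL": 6}


def find_delay(target, costs=CYCLES):
    """Return a shortest instruction list that burns exactly `target` cycles.

    Returns None when no combination of the given instructions fits.
    """
    best = {0: []}                  # cycle total -> shortest sequence found
    for total in range(1, target + 1):
        for name, cost in costs.items():
            previous = total - cost
            if previous in best:
                candidate = best[previous] + [name]
                if total not in best or len(candidate) < len(best[total]):
                    best[total] = candidate
    return best.get(target)


print(find_delay(10))   # one NOP plus one LD SP,HL, in some order
print(find_delay(5))    # None: no mix of 4s and 6s adds up to 5
```

      With only these two instructions every reachable delay is even, so a longer table of "junk" instructions with odd cycle counts would be needed for odd delays. And as noted, LD SP,HL is only harmless if you really don't care about SP.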

      Very few people outside of assembly language programmers seem to understand that the venerable, recently cancelled Z80 was internally a 4-bit chip: it had a 4-bit ALU, so byte additions took a remarkably long time compared to what you'd expect if you assumed it was 8 bits wide all the way through.

      IIRC if you set it up "correctly" a LDIR (sort of an assembly language STRNCPY) on a Z80 could take a remarkable number of clock cycles to complete.

      Math coprocessors were faster than CPUs but non-assembly programmers thought they were magic and instant; not so; IIRC an 8087 FDIV took like "two hundred" clock cycles which is much faster than the CPU could ever do it but still pretty darn slow. I remember some "multiply a float by a logarithm of a float" that took over a thousand or "thousands" of clock cycles.

      My point is 80,000 clock cycles would really not be all that bad of performance. Some assembly language instructions on some architectures are REALLY slow on real hardware.

      • (Score: 5, Informative) by RS3 on Saturday September 28, @01:24AM (4 children)

        by RS3 (6367) on Saturday September 28, @01:24AM (#1374870)

        You're reminding me of one of many old tricks: use LUT (Look Up Table) [wikipedia.org] for sine, squares, character encoding / decoding, whatever math you needed to do really fast.
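
        For anyone who hasn't seen the trick, here is a minimal Python sketch of an integer sine table. The table size (256 steps per full turn) and the scale (127, so values fit a signed byte) are arbitrary choices for illustration; on a real microcontroller the table would typically sit in ROM/flash or be filled once at start-up.

```python
import math

TABLE_SIZE = 256   # entries per full circle (arbitrary choice)
SCALE = 127        # results fit in a signed 8-bit value

# Built once with the "slow" math library, then only ever indexed.
SINE_TABLE = [
    round(SCALE * math.sin(2 * math.pi * i / TABLE_SIZE))
    for i in range(TABLE_SIZE)
]


def fast_sin(angle_index):
    """Sine of angle_index/256 of a full turn, scaled to -127..127."""
    return SINE_TABLE[angle_index % TABLE_SIZE]


print(fast_sin(64))    # quarter turn -> 127
print(fast_sin(192))   # three-quarter turn -> -127
```

        The cost of a lookup is one index operation regardless of how expensive the original function was; the price is memory and a fixed resolution.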

         

        • (Score: 4, Interesting) by anubi on Saturday September 28, @03:04AM (3 children)

          by anubi (2828) on Saturday September 28, @03:04AM (#1374878) Journal

          Precomputed table-driven...

          I still do that on my Arduino stuff

          The latest was a sunrise/sunset time estimator that involves the "sunrise equation" and "analemma".

          It involved tangents and arctangents. I did not need a full float precision...I was plenty happy to get within five minutes. So it's all based on integer math and precomputed tables.

          The idea being to connect this thing to solar panels and let it go. Using adaptive filter algorithms, it would eventually zero in on a pretty precise "solar time" ("solar noon" is defined as mid-day between sunrise and sunset, "solar midnight" being defined as the midpoint between sunset and sunrise, and a "day" being the interval between noons or midnights, which by definition will be 24 hours).
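
          A rough Python sketch of that idea, under stated assumptions: sunrise and sunset are given as minutes on the device's own clock, and the unspecified "adaptive filter" is stood in for by simple integer exponential smoothing. The class and function names are hypothetical, and this is not the poster's actual code.

```python
def solar_noon(sunrise_min, sunset_min):
    """Solar noon as the midpoint between sunrise and sunset (in minutes)."""
    return (sunrise_min + sunset_min) // 2


class NoonEstimator:
    """Integer-only exponential smoothing of daily solar-noon observations."""

    def __init__(self, shift=3):
        self.shift = shift       # smoothing factor is 1 / 2**shift
        self.estimate = None     # no estimate until the first full day

    def update(self, sunrise_min, sunset_min):
        noon = solar_noon(sunrise_min, sunset_min)
        if self.estimate is None:
            self.estimate = noon                      # seed with first day
        else:
            # Move a fraction of the way toward the new observation.
            self.estimate += (noon - self.estimate) >> self.shift
        return self.estimate


est = NoonEstimator()
print(est.update(360, 1080))   # first day: midpoint is 720
print(est.update(368, 1088))   # observed 728, estimate nudges to 721
```

          A shift-based filter needs no division or floating point, at the cost of a small dead band when the error is smaller than `2**shift`.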

          I can cut considerable time off its "housewarming" time if I give it the Julian Day, longitude, latitude, and the local time of day when it's brought up, but eventually, it would figure it out on its own. If it has to figure it out on its own, it may take centuries to arrive at the precise Julian Day, albeit it will get close within a few years. (My Julian epoch time is four years of days, which easily fits in an int16.)

          This is meant for complete off-grid applications dealing with agricultural needs, which are season-driven, where the exact day of the year is of little meaning, yet the season is. One Julian Day off is less than 0.3% error. ( 1 out of 365.25 ).

          It is just one input to a larger system that also considers local weather and custom considerations to control agricultural chores. This one only observes insolation to sync to earth rotation, seasonal, and latitude influencers. ( Longitude only influences the time offset of what my machine considers noon to be, and what the local humans consider to be noon. )

          A 4004, programmed in machine language, could probably do this in real-time, including bit-banging its status back hourly on a one-bit serial port. But I will go way overkill and use an Arduino. Low Power.

          I am just doing this for fun...like booting Linux on a 4004, but far, far, easier to do. What I am doing is way below beginner level compared to the parent story!

          Never underestimate the robustness of simplicity.

          --
          "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]
          • (Score: 2) by RS3 on Saturday September 28, @02:20PM (1 child)

            by RS3 (6367) on Saturday September 28, @02:20PM (#1374907)

            Very interesting. You're a tenacious one indeed. I am too, sometimes / often.

            I was thinking of the good old days (ahem) when RAM was very expensive, but ROM not so bad, so you could implement LUT in ROM and bank-switch / map in the ROM when you needed a LUT value. Now that RAM is huge and cheap:

            There are lots of math libraries for ARM / Arduino. If you need speed / real-time, you can use (slow) math library to "seed" a RAM-based LUT (array) at bootup. You probably already know that. Wrote it in case someone doesn't know and might benefit from the idea.

            I had never really looked at the 4004 until this article. I always thought it was a 4-bit machine and had maybe 16 op codes, but didn't care to look into it further. The main hardware bus is 4 bit, but it's much more of an 8-bit machine, all op codes are 8 bits, some are 16 bit. 12 bit address bus is pretty small, but you can do a bank switch. Intel made several support chips for it too, and with some more logic it could be pretty useful in its day.

            But that's all for fun. As you point out, with today's ARM and other amazing CPUs on the market, no need to waste time, effort, and PC board space just to make an older CPU work (again, unless you're doing it for fun / education).

            • (Score: 3, Informative) by VLM on Saturday September 28, @05:55PM

              by VLM (445) on Saturday September 28, @05:55PM (#1374929)

              > I had never really looked at the 4004 until this article.

              You are in for a treat.

              Kids these days are used to five or more IO devices sharing a pin. Well, the 4004 had data, address, and I/O sharing a four pin bus, that's some multiplexing!

              Plenty of TTL worked off 5 volts and that eventually became a standard, until the rise of 3.3 volt systems, but the 4004 could eat most any CMOS voltage, although most ran it off 5V for compatibility. IIRC you were "supposed to" run it off 15V or maybe that was the special embedded controller version of the chip, I don't remember exactly. Anyway, I think its power system would be considered weird to modern people.

              Kids these days are used to kilobyte to megabyte sized stacks in ram. The 4004 stored its 3-level stack on chip not in ram. Which means you can write subroutines, kind of, on a ROM only machine. However recursion is a bit limited, good luck with calculating five factorial LOL on a three level stack. That "run without RAM meme" was also why it had a large number of on chip registers, might not have any ram at all for storage so use registers for everything.
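
              A toy Python model of that limit, for illustration only: the names are made up, the outermost invocation is assumed not to use a stack slot, and overflow raises an exception here, whereas as far as I recall the real chip just silently loses the oldest return address.

```python
class StackOverflow(Exception):
    """Raised when a call would exceed the fixed hardware stack depth."""


class TinyCallStack:
    """A fixed-depth return-address stack, three levels by default."""

    def __init__(self, depth=3):
        self.depth = depth
        self.frames = []

    def call(self, return_address):
        if len(self.frames) == self.depth:
            raise StackOverflow("no free slot for another return address")
        self.frames.append(return_address)

    def ret(self):
        return self.frames.pop()


def factorial(n, stack):
    """Recursive factorial where each nested call uses one stack slot."""
    if n <= 1:
        return 1
    stack.call(n)                        # "subroutine call" one level deeper
    result = n * factorial(n - 1, stack)
    stack.ret()
    return result


print(factorial(4, TinyCallStack()))     # 24: needs exactly three levels
try:
    factorial(5, TinyCallStack())
except StackOverflow as err:
    print("5! failed:", err)
```

              In this model 4! just fits and 5! does not, so on such a machine you would write the factorial as a loop instead.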

              Later on Intel got into the product tying strategy of making the clock requirements so weird that you had to buy their clock chip; this more or less predates that. So you can't assume that "old intel means weird clock requirements".

              At the time if you read industry magazines for retro reasons it got ripped on pretty bad (unless they bought an advertisement) because it was too fast for embedded control stuff and too slow for primary computing, so what are the apps for this thing? So it's like 100 times too fast to be a car engine emissions computer, but 100 times too slow to replace a PDP-10 mainframe processor, so what do you do with it exactly? Well, they found some uses eventually LOL.

              If you like the 4004 you'd find the 14500 pretty amusing. In a lot of ways the 14500 is a "you don't want to pay $500 for a 4004, well buy a 14500 for pocket change" competitor. If you feel a 4004 is too powerful and too expensive for your industrial PLC or something, a few years later you could buy a 14500, which is kind of a one-bit computer for people who think in ladder language from PLCs.

          • (Score: 2) by RS3 on Saturday September 28, @02:24PM

            by RS3 (6367) on Saturday September 28, @02:24PM (#1374908)

            Meant to add: there are many PIC microcontrollers that might be more what you're looking for. I have no connection to them, have worked with some, but never developed for them. Point is some are very tiny, meaning super simple, just enough to get the job done, everything on-chip, no OS needed. But of course there are SoC out there if you need complex.

  • (Score: 5, Funny) by Tork on Friday September 27, @10:31PM

    by Tork (3914) Subscriber Badge on Friday September 27, @10:31PM (#1374855)
    "This processor is far too limited to run a modern OS, so we're emulating a processor from 20 years in the future to get around that limitation..."

    Sometimes I feel like I don't belong here.
    --
    🏳️‍🌈 Proud Ally 🏳️‍🌈
  • (Score: 3, Interesting) by janrinok on Saturday September 28, @12:13AM (2 children)

    by janrinok (52) Subscriber Badge on Saturday September 28, @12:13AM (#1374866) Journal

    I wonder if Linus himself ever hears about projects like this and, if he does, what does he think of them?

    --
    I am not interested in knowing who people are or where they live. My interest starts and stops at our servers.
    • (Score: 4, Funny) by Tork on Saturday September 28, @12:47AM (1 child)

      by Tork (3914) Subscriber Badge on Saturday September 28, @12:47AM (#1374868)
      Prolly something involving a lot of swear words, I'd imagine.
      --
      🏳️‍🌈 Proud Ally 🏳️‍🌈
      • (Score: 2) by Freeman on Monday September 30, @01:38PM

        by Freeman (732) on Monday September 30, @01:38PM (#1375110) Journal

        With some **** censors *** due to the ******* *****s. Ah, in his mind you say, he'd be thinking about how he'd have to reword it due to the ******* *****ers.

        --
        Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
  • (Score: 2) by looorg on Saturday September 28, @01:07AM (1 child)

    by looorg (578) on Saturday September 28, @01:07AM (#1374869)

    > With just 2,300 transistors and an original clock speed of 740 kHz, the 1971 CPU is incredibly primitive by modern standards. And it's slow—it takes about 4.76 days for the Linux kernel to boot.

    Not exactly a Doom system then. Can you have zero FPS?

  • (Score: 3, Interesting) by bzipitidoo on Saturday September 28, @04:12AM (3 children)

    by bzipitidoo (4388) on Saturday September 28, @04:12AM (#1374882) Journal

    Okay, 4.76 days is way more time than I ever spent waiting on old hardware. I tried running a Linux distro from circa 2010 on a 133 MHz Pentium based laptop that came with Windows 98 and didn't have any accelerated graphics-- no NVidia or ATI-- and only 96M of memory, the maximum the hardware could support. Took 30 seconds for Firefox 3.5 to come up, and a full 5 minutes for Stellarium to start. Once up, Firefox was somewhat usable, but Stellarium really wasn't.

    I still have that laptop, but it no longer works. I have no idea why. Didn't spill anything on it or drop it, but it's completely dead. Did at least light up the backlight that last time it showed signs of life, now it won't even do that. Maybe the NiCd batteries or the power brick overvolted the machine and fried it? Can NiCd batteries do that? Not that it matters-- the machine is hopelessly antiquated, trash even if it did still work. I simply haven't gotten around to junking it.

    • (Score: -1, Redundant) by Anonymous Coward on Saturday September 28, @05:47AM

      by Anonymous Coward on Saturday September 28, @05:47AM (#1374885)

      Maybe.... who gives a shit?

    • (Score: 2) by RS3 on Saturday September 28, @08:26PM (1 child)

      by RS3 (6367) on Saturday September 28, @08:26PM (#1374936)

      NiCd (and NiMH) batteries will eventually grow internal "whiskers"- dead shorts from anode to cathode. Not sure about your laptop, but in the case that the batteries likely have shorted, that short will prevent the power supply from developing enough volts to allow it to run at all. Since it's likely a removable battery pack, have you tried removing the battery, running only on AC power pack?

      • (Score: 3, Interesting) by anubi on Saturday September 28, @11:47PM

        by anubi (2828) on Saturday September 28, @11:47PM (#1374948) Journal

        Growing Whiskers.

        Every one of my Makita power tool NiCd batteries failed from this phenomenon. Within a year of purchase.

        It was the way I was using them: Fully charge. Put in drawer, wait six months, then use till it needed charging again. This caused the strongest cells to reverse-charge the weaker ones. 9.6V. Eight cells in series. No charge balancer. The weakest cell would whisker out, short, which fooled the charger into believing insufficient voltage on the battery pack, so it overcharged the hell out of the remaining seven.

        Another battery pack destroyed.

        --
        "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]
  • (Score: 2) by namefags_are_jerks on Sunday September 29, @05:05AM

    by namefags_are_jerks (17638) on Sunday September 29, @05:05AM (#1374967)

    "Hang on, someone did almost exactly the same thing 15 years ago..."
    http://dmitry.gr/?r=05.Projects&proj=07.%20Linux%20on%208bit [dmitry.gr]

    Ah, the same guy. (I'm a bit relieved it wasn't someone just cloning his work.)

  • (Score: 2) by bart9h on Sunday September 29, @10:28PM (1 child)

    by bart9h (767) on Sunday September 29, @10:28PM (#1375036)

    > If it can run a C compiler, it can probably run DOOM.

    A C compiler is WAY lighter than DOOM.

    Also, if the compiler takes minutes to compile a hello world, it works. But if DOOM takes seconds to render a frame, you can't honestly say that it works, because you can't actually play it.

    • (Score: 2) by DannyB on Monday September 30, @03:43PM

      by DannyB (5839) Subscriber Badge on Monday September 30, @03:43PM (#1375124) Journal

      For certain definitions of "it works!".

      --
      Don't put a mindless tool of corporations in the white house; vote ChatGPT for 2024!