
SoylentNews is people

posted by hubie on Thursday June 01 2023, @04:59PM
from the everything-you-knew-was-a-lie dept.

https://www.devever.net/~hl/backstage-cast

If you take someone with intermediate knowledge of computing in the right areas, and ask them how an x86 machine boots, they'll probably start telling you about how the CPU first comes up in real mode and starts executing code from the 8086 reset vector of FFFF:FFF0. This understanding of how an x86 machine boots has remained remarkably persistent, as far as I can tell because this basic narrative about the boot process has been handed down from website to website, generation to generation, largely unchanged.

It's also a pack of lies and hasn't reflected the true nature of the boot process for some time. It's true the 8086 reset vector is still used, but only because it's a standard "ABI" for the CPU to transfer control to the BIOS (whether legacy PC BIOS or UEFI BIOS). In reality an awful lot happens before this reset vector starts executing. Aside from people having vaguely heard about the Intel Management Engine, this modern reality of the boot process remains largely unknown. It doesn't help that neither Intel nor AMD have really gone out of their way to actually document what the modern boot process looks like, and large parts of this process are handled by vendor-supplied mystery firmware blobs, which may as well be boxes with "???" written in them. Mainly we have the substantial assistance of assorted reverse engineers and security researchers to thank for the fact that we even have a decent picture of what the modern x86 boot process actually looks like for both Intel and AMD. I could write a whole article about that process, but instead I'd like to focus on something else.



This discussion was created by hubie (1068) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 5, Interesting) by gznork26 on Thursday June 01 2023, @05:19PM (15 children)

    by gznork26 (1159) on Thursday June 01 2023, @05:19PM (#1309270) Homepage Journal

    The first one I learned about was on a DIGIAC 3080. You start by setting switches to load the first few bytes of memory, creating a simple program that reads from paper tape and stores the data in memory locations. At end-of-tape, it transfers control to the first address you loaded from the tape, and executes the boot loader that was on the tape. Hit the start button, the tape reads in, and the console wakes up. From there, you can load your actual assembler code from tape or cards, generate executable code in storage, and run your program.
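
    The two-stage idea above (a tiny hand-entered loader that pulls a larger boot loader off tape, then jumps to it) can be sketched roughly as follows. This is only an illustration: the memory size, the load address, and the `load_tape` name are assumptions, not details of the DIGIAC 3080.

```python
def load_tape(memory, tape, load_base):
    """Copy every tape byte into memory starting at load_base.

    Returns the entry point. Assumption: control goes to the first
    address that was loaded, as the comment describes.
    """
    if load_base + len(tape) > len(memory):
        raise ValueError("tape image does not fit in memory")
    for i, byte in enumerate(tape):
        memory[load_base + i] = byte
    # End of tape reached: hand control to the start of the loaded image.
    return load_base


# Hypothetical machine: 256 bytes of memory, boot loader image on tape.
memory = bytearray(256)
entry = load_tape(memory, bytes([0x10, 0x20, 0x30]), load_base=0x40)
print(hex(entry))
```

    On the real machine the role of `load_tape` was played by the few instructions set on the front-panel switches.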

    Senior year in high school, 1968-9, was fun.

    --
    Khipu were Turing complete.
    • (Score: 2) by istartedi on Thursday June 01 2023, @05:51PM (4 children)

      by istartedi (123) on Thursday June 01 2023, @05:51PM (#1309278) Journal

      Now *that's* old school. It's interesting you got to boot a computer back then. I know somebody who computed in that era, but it was remotely via teletype. Presumably the time-shared machine was kept up as long as possible, and only booted by the system operator. I don't know what the stats are, but I was always given to understand that my friend's experience was much more common than yours.

      --
      Appended to the end of comments you post. Max: 120 chars.
      • (Score: 3, Interesting) by gznork26 on Thursday June 01 2023, @06:37PM

        by gznork26 (1159) on Thursday June 01 2023, @06:37PM (#1309282) Homepage Journal

        Probably so. The school district pooled their money and set up one school to focus on technical electives, and another to focus on things like shop of various kinds, and students could pick a school for years 9-12. This way, they could have an excellent setup of each kind, rather than badly funded ones at several schools. I suspect they'd gotten the DIGIAC through some social engineering, because the company was not far away, and they were pitching it as a training unit, so getting one into a high school was probably good for their bottom line. The school was also working with CDC to set up a remote terminal to connect to their Cyber 6600 (I think it was) mainframe somewhere. When I was there, the computer course was a one semester elective. I checked back some years later, and it had become a 2-year program, so their approach seemed to have borne fruit.

        --
        Khipu were Turing complete.
      • (Score: 3, Interesting) by Reziac on Friday June 02 2023, @03:11AM (2 children)

        by Reziac (2489) on Friday June 02 2023, @03:11AM (#1309381) Homepage

        My high school here in Montana had an IBM 1620 that was used for teaching; I took that class 1971-1972 (think we'd had it a couple years by then). Didn't know how the guts worked but I do remember the huge upgrade from punch cards to paper tape... took a fraction as long to boot up.

        We also had an IBM360 that did the district payroll.

        --
        And there is no Alkibiades to come back and save us from ourselves.
        • (Score: 2) by legont on Saturday June 03 2023, @05:49AM (1 child)

          by legont (4179) on Saturday June 03 2023, @05:49AM (#1309527)

          You probably remember it wrong, as it was an upgrade from paper tape to punch cards.

          --
          "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.
          • (Score: 3, Informative) by Reziac on Saturday June 03 2023, @06:27AM

            by Reziac (2489) on Saturday June 03 2023, @06:27AM (#1309543) Homepage

            Nope, I remember correctly. Originally it loaded the OS from punch cards, and it took about half an hour to boot up.

            Then we acquired a well-used paper tape reader (that was in 1972) and it seriously sped things up.

            Also had a 5 MB hard drive the size of a washing machine, but didn't boot from that.

            But our 'programming' was still on punch cards. Fortran II D.

            Teacher fired it up when the building opened at 7am, and about half of that first hour class (me included) was right behind him, cuz we only had two card punch machines...

            --
            And there is no Alkibiades to come back and save us from ourselves.
    • (Score: 3, Interesting) by bzipitidoo on Thursday June 01 2023, @11:19PM (8 children)

      by bzipitidoo (4388) on Thursday June 01 2023, @11:19PM (#1309342) Journal

      Apple II is what I knew first. On machines without a floppy disk drive, it was basically instant on. Less than a second after flipping the power on, you got a command line prompt for the BASIC interpreter. Such fast starts are a feat that newer computers largely still can't match. However, that configuration is not much use, as there is no means to save your work to a floppy disk.

      With a floppy drive, the Apple II boot process was to jump to the ROM routine in the floppy drive controller card, customarily in slot 6, and therefore at address $C600. This code started the drive spinning by accessing the byte that was memory mapped to the drive motor, then pulled the arm from wherever it might be to track 0, accomplished by moving it 34 tracks towards 0. This part was real low level. There are 4 memory locations corresponding to pulses of the stepper motor that moves the arm, and these memory locations have to be accessed in a specific order and with a fairly small window of delay to get the arm to move. If the arm was closer to 0 than the farthest track, you'd hear the drive do its characteristic thudding as the arm bumped repeatedly against the stop. Next, it found sector 0, read into memory the 256 bytes contained there, then jumped to that code. At that point, the ROM is done. In standard DOS, the code from track 0 sector 0 was the program to read into memory the rest of the 16 (or 13 if using an earlier version of Apple DOS) sectors on track 0, which contained the code to move the drive arm so that the rest of DOS could be read into memory from tracks 1 and 2. Since this part of the process was read from the disk, this is also where most copy protection schemes start. After DOS is fully loaded, the system would read the first file into memory and transfer control to that. This is done by moving the arm to track 17, to read the directory, then to wherever on the disk the directory says that program is located. They picked track 17 for the directory because they thought having it at the center of the disk would reduce arm movement the most. So on boot, the user would hear the purr of the arm moving to track 0, a rising pitch, then the thuds, then 2 ticks as the arm moved to tracks 1 and 2, then a purr in a falling pitch as it jumped to track 17. After that, the sounds it made were a bit more random.
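
      The blind "step 34 tracks toward 0" recalibration, and why it thuds, can be modelled in a few lines. This is a simplified sketch: it works in whole tracks and ignores the individual stepper phases and their timing, and the `recalibrate` name is made up for the example.

```python
def recalibrate(start_track, steps=34):
    """Step the arm toward track 0 a fixed number of times.

    The drive cannot report where the arm is, so the code always issues
    the full number of steps. Once the arm is at track 0, every further
    step just knocks it against the stop.
    Returns (final_track, bumps_against_stop).
    """
    track = start_track
    bumps = 0
    for _ in range(steps):
        if track > 0:
            track -= 1   # arm actually moves one track inward
        else:
            bumps += 1   # already at the stop: the audible thud
    return track, bumps


print(recalibrate(34))  # arm started at the farthest track
print(recalibrate(10))  # arm started closer to track 0
```

      The closer the arm starts to track 0, the more of the 34 steps are wasted against the stop, which matches the thudding described above.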

      Because of a few boneheaded programming decisions, standard Apple DOS took about 45 seconds to load. The main problem was that the time it took to move a sector's worth of data to its final destination in memory caused the computer to just miss the start of the next sector, and it'd have to wait for the disk to make a full rotation to arrive again at the start of that next sector. Aftermarket DOSes were quick to fix this glaring inefficiency, and consequently, they all loaded in less than 15 seconds.
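
      A toy model of the missed-sector problem, assuming 16 sectors per track and a made-up processing delay of half a sector time after each read. The function names and the numbers are illustrative only and are not taken from Apple DOS or any aftermarket DOS.

```python
SECTORS = 16  # sectors per track in this model


def make_layout(interleave):
    """Return layout[physical_slot] = logical sector for a given interleave."""
    layout = [None] * SECTORS
    slot = 0
    for logical in range(SECTORS):
        while layout[slot] is not None:   # skip slots already taken
            slot = (slot + 1) % SECTORS
        layout[slot] = logical
        slot = (slot + interleave) % SECTORS
    return layout


def slots_to_read_track(layout, processing):
    """Time, in sector slots, to read logical sectors 0..15 in order.

    `processing` is the time spent moving each sector's data after
    reading it, during which the disk keeps turning.
    """
    where = {logical: slot for slot, logical in enumerate(layout)}
    head = float(where[0])   # start with the head at logical sector 0
    elapsed = 0.0
    for logical in range(SECTORS):
        wait = (where[logical] - head) % SECTORS   # spin until sector arrives
        step = wait + 1 + processing               # wait + read + process
        elapsed += step
        head = (head + step) % SECTORS
    return elapsed


sequential = slots_to_read_track(make_layout(1), processing=0.5)
interleaved = slots_to_read_track(make_layout(2), processing=0.5)
print(sequential / SECTORS, "rotations with sectors in sequence")
print(interleaved / SECTORS, "rotations with 2:1 interleave")
```

      In this model the sequential layout costs about one full rotation per sector, while spacing logical sectors two slots apart brings the whole track down to about two rotations.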

      The Apple II is an extremely simple machine compared to modern PCs. Modern PCs have tons of auxiliary processes handled largely independently by what amounts to complete subcomputers. Lot of stuff going on in parallel. On the Apple II, the 6502 CPU has to do everything.

      • (Score: 3, Interesting) by istartedi on Thursday June 01 2023, @11:48PM (3 children)

        by istartedi (123) on Thursday June 01 2023, @11:48PM (#1309349) Journal

        Did the Apple II not have smart peripherals?

        With the C-64, you got the same nearly instant boot to ROM, with the BASIC interpreter. If a 1541 was present, you could then load programs with, IIRC something like

        LOAD "MYPROG",8,1

        I have long since forgotten the significance of the 8,1; but the relevant thing is that whatever was involved with moving the heads on the floppy and other low-level aspects of the data transfer didn't tax the main CPU. Your data simply magically appeared at a standard location... I think maybe the first available user address.

        I don't think there was a lot of built-in file management like you'd expect with a "DOS", but since you were generally just loading one application and using it, that didn't matter. You could list files on the floppy, load them, and save under the name of your choice.

        There was no need to wait 45s for a DOS to load, you had instant access to files on the drive... but of course they took time to load.

        OTOH, I think it was still possible to break through and do the kind of low level manipulation you're talking about. There was a program called "Di-sector"... I think. This is all such a long time ago; but I remember people using Di-sector to recover lost data, or subvert copy protection.

        Did you ever turn a floppy over and notch the other side? LOL.

        --
        Appended to the end of comments you post. Max: 120 chars.
        • (Score: 4, Informative) by looorg on Friday June 02 2023, @12:13AM

          by looorg (578) on Friday June 02 2023, @12:13AM (#1309353)

          LOAD "MYPROG",8,1

          I have long since forgotten the significance of the 8,1; but the relevant thing is that whatever was involved with moving the heads on the floppy and other low-level aspects of the data transfer didn't tax the main CPU. Your data simply magically appeared at a standard location... I think maybe the first available user address.

          The ,8 just means LOAD from device 8, which is normally the first disk drive. The ,1 is to load it into an absolute memory address specified in the program instead of just the standard location (start of BASIC memory). For your own small BASIC programs ,8 is usually enough. But if you write in ASM, or it's a larger program that requires a bit more memory and such, then you usually need to load it into a specific space. It is then normally started with SYS to that address instead of just RUN.
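
          A rough sketch of how the ,1 changes where the data lands, assuming the usual program-file layout in which the first two bytes hold a little-endian load address, and assuming $0801 as the start of BASIC memory. The `load_address` helper is hypothetical; it only mimics the address decision, not the real KERNAL routine.

```python
BASIC_START = 0x0801  # assumed default start of BASIC program memory


def load_address(prg_bytes, secondary=0):
    """Return (address, payload) for a program-file image.

    secondary == 0 : ignore the header, load at the start of BASIC memory
    secondary != 0 : load at the address stored in the file's header
    """
    if len(prg_bytes) < 2:
        raise ValueError("image needs a 2-byte load-address header")
    header_address = prg_bytes[0] | (prg_bytes[1] << 8)  # little-endian
    payload = prg_bytes[2:]
    address = BASIC_START if secondary == 0 else header_address
    return address, payload


# Hypothetical machine-code file meant to live at $C000.
image = bytes([0x00, 0xC0, 0xA9, 0x00, 0x60])
print(hex(load_address(image, secondary=1)[0]))  # like LOAD "MYPROG",8,1
print(hex(load_address(image, secondary=0)[0]))  # like LOAD "MYPROG",8
```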

          The C64 "DOS" if you will is very ... mmm .. limited and arcane in that regard to put it mildly. There is LOAD and SAVE but that is more or less it. There wasn't a lot of file manipulation in the BASIC v2 beyond that unless you wanted to go beyond the normal and easy and learn some somewhat arcane OPEN commands used to delete/scratch files, rename files, format disks etc.

        • (Score: 3, Informative) by looorg on Friday June 02 2023, @12:58AM

          by looorg (578) on Friday June 02 2023, @12:58AM (#1309358)

          > ... the floppy and other low-level aspects of the data transfer didn't tax the main CPU
          That is because the drives (1541 etc.) are basically computers in their own right, with their own CPU: there is a 6502 in the disk drive. I seem to recall that if you had multiple drives you could have them interact with each other, such as copying data, and the actual computer was not needed anymore.

        • (Score: 2) by sjames on Friday June 02 2023, @01:44AM

          by sjames (2882) on Friday June 02 2023, @01:44AM (#1309363) Journal

          On Apple, the host CPU ran the floppy using PIO. In the C64, the floppy drive had a 6502 of its own that ran the drive in PIO mode and talked to the host by bit-banging the peripheral connection. By default, that was used as a serial connection with a fairly over-designed handshake protocol. The Fastload sped it up by using the connection as a 2 bit parallel connection.

          The host could upload and run small chunks of code on the floppy drive, allowing the fastload to work. Things like Di-sector could also use that for low level access.

          I think most everybody notched the other side and I don't think anyone ever faced the gloom and doom the floppy manufacturers claimed would happen.

      • (Score: 2) by sjames on Friday June 02 2023, @02:03AM (3 children)

        by sjames (2882) on Friday June 02 2023, @02:03AM (#1309370) Journal

        And just to further complicate things, the floppy hardware couldn't correctly read too many 0 bits in a row, so it used a group encoding scheme where each 8 bits encoded 5 (and later 6) bits of actual data with 2 reserved values to indicate sector headers. In addition, sectors would start with several 0xff bytes to sync up the read.
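
        The arithmetic behind the original 5-bit scheme can be checked with a short sketch. It assumes the rule for the early format was "high bit set and never two 0 bits in a row", and that the two reserved values were 0xD5 and 0xAA; neither detail is spelled out in the comment above, so treat both as assumptions.

```python
def is_valid_disk_byte(value):
    """True if the byte has its high bit set and no two adjacent 0 bits."""
    if not value & 0x80:
        return False
    return "00" not in format(value, "08b")


RESERVED = {0xD5, 0xAA}  # assumed marker values kept out of the data set

valid = [v for v in range(256) if is_valid_disk_byte(v)]
data_values = [v for v in valid if v not in RESERVED]

print(len(valid), "legal disk bytes")
print(len(data_values), "left for data, i.e. 5 bits per disk byte")
```

        Under these assumptions 34 byte values are legal on disk, and removing the two reserved ones leaves 32, exactly enough to carry 5 bits of data in each 8-bit disk byte.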

        Some early copy protection just used non-standard sector headers on all but the boot track.

        As you say, the Apple ][ did everything with the CPU. Of course, the CPU and memory were much simpler. Once power was on, RAM worked. In a PC, there is no working RAM until the memory controller AND the DIMMs themselves are brought up. Then the various bus and peripheral controllers have to be initialized.

        • (Score: 2) by bzipitidoo on Friday June 02 2023, @04:19AM (2 children)

          by bzipitidoo (4388) on Friday June 02 2023, @04:19AM (#1309388) Journal

          That's right, the floppy disk used an encoding scheme so that problematic bit patterns would never appear on the disk. That move from 5 to 6 bits is why the number of sectors per track improved from 13 to 16.

          I broke Origin's copy protection on Ultima 4 and Moebius. Once that was done, I hacked the games. Changed the graphics in Ultima 4, just for fun. In Moebius, they'd screwed up the disk access nearly as badly as stock Apple DOS. Every time you got into a fight, the game had to load a separate program, and that took 20 seconds. With sector interleaving, I got that load time down to 5 seconds.

          • (Score: 2) by sjames on Friday June 02 2023, @08:33AM (1 child)

            by sjames (2882) on Friday June 02 2023, @08:33AM (#1309407) Journal

            I remember doing boot code tracing on the Apple ][ and later the PC. Load the first stage, examine the disassembly. Hack it to load but not jump to the next stage then run it. Repeat until you load the actual game and know its entry point, then save it as a binary file.

            Sometimes you could just NOP over code that made sure selected disk sectors caused read errors to make the disk copyable. Then all the crazy schemes like half tracks, missing sync bytes, etc.

            The hacking was more fun than many of the games.

            • (Score: 2) by bzipitidoo on Friday June 02 2023, @02:07PM

              by bzipitidoo (4388) on Friday June 02 2023, @02:07PM (#1309433) Journal

              Yes, LOL. Boot tracing is exactly how I did it. And it was fun.

              Only the boot disk of Origin's games was copy protected. The other disks were in the standard format. I reasoned that if their code could read both the standard format and their copy protected one, all I had to do was copy the data off the copy protected disk and write it to another disk in the standard format. Wouldn't have to change a thing in the code itself. I was right. I used boot tracing to load their modified DOS into memory, then used that to read their copy protected disk a few tracks at a time, and flip to standard DOS to write the tracks back out to another disk. Took several iterations to copy the entire disk, memory being limited.

              After that was done, came the fun of using a sector editor to find whichever game elements I wanted to change, figuring out the format, making changes, then checking that I got it right. Sometimes I fixed bugs, and sometimes improved performance.

    • (Score: 2) by legont on Saturday June 03 2023, @05:47AM

      by legont (4179) on Saturday June 03 2023, @05:47AM (#1309526)

      I've done this too but even simpler. They were not switches though but buttons in 8 bit. I had to enter a few commands into memory to make the 4 bit tape reader get my program in. After that I'd just enter the address where I believed my shit started and hit run. Of course, there were useful utilities I'd load from a common tape first.

      --
      "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.
  • (Score: 4, Insightful) by hendrikboom on Thursday June 01 2023, @08:24PM

    by hendrikboom (1125) on Thursday June 01 2023, @08:24PM (#1309308) Homepage Journal

    The article uses IBM's Power 9 processors because their documentation shows the mechanisms that other manufacturers hide.

    There's even an open-source implementation underway of an OpenPower processor: libre-SOC [libre-soc.org]. No secrets here.

  • (Score: 3, Informative) by turgid on Thursday June 01 2023, @08:53PM

    by turgid (4318) on Thursday June 01 2023, @08:53PM (#1309318) Journal

    Back in the 90s is when CPUs came on the market that ran x86 code but internally were not x86. They were RISC CPUs with a translation layer on top (in hardware). Companies like NexGen, Cyrix and AMD all had such CPUs. They were able to out-compute the Intel Pentium at lower clock speeds. Then Intel developed the Pentium Pro (which became the Pentium II) which was RISC internally. It's not surprising that a whole lot of other stuff goes on before the x86 decoder starts running. There was even a CPU that had the x86 translation in firmware. That was the Transmeta Crusoe and a certain Linus Torvalds used to work for them I seem to remember. They used to say that there hadn't been a genuine number 1 hit single since the Beatles split up. There hasn't been a genuine x86 CPU since the Pentium went end-of-life.

  • (Score: 5, Informative) by coolgopher on Friday June 02 2023, @02:05AM (1 child)

    by coolgopher (1157) on Friday June 02 2023, @02:05AM (#1309371)

    I'd upvote this article - that was a good read!

    • (Score: 3, Interesting) by janrinok on Friday June 02 2023, @12:29PM

      by janrinok (52) on Friday June 02 2023, @12:29PM (#1309421) Journal

      Thank you - any feedback on the stories that we display is welcome. If we are getting it wrong - which we do! - then that is just as important as knowing that we are getting it right. We prefer the latter for purely fuzzy warm-feelings though...
  • (Score: 2) by maxwell demon on Saturday June 03 2023, @09:51AM

    by maxwell demon (1608) on Saturday June 03 2023, @09:51AM (#1309567) Journal

    and starts executing code from the 8086 reset vector of FFFF:FFF0

    Actually, the reset vector is FFFF:0000, resulting in the address FFFF0, close to the top of the memory the 8086 could address (remember, 8086/x86 real mode addresses are calculated as 16*segment + offset). The vector FFFF:FFF0 would result in the address 10FFE0, beyond the 8086 address range. In the 8086, or any later system with disabled A20 line, this would wrap around to the address 0FFE0 near the beginning of the memory. The PC platform has RAM at this address, which would not be a good idea to start executing from before anything has been written there.
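
    The arithmetic above is easy to check with a small sketch. The `real_mode_address` helper is made up for this example; it applies the 16*segment + offset rule and then masks the result to a 20-bit address bus to show the wrap-around.

```python
def real_mode_address(segment, offset, address_bits=20):
    """Return (linear, wrapped) for a real-mode segment:offset pair.

    linear  = 16 * segment + offset
    wrapped = linear truncated to the given number of address lines,
              as on an 8086 or with the A20 line disabled.
    """
    linear = (segment << 4) + offset
    wrapped = linear & ((1 << address_bits) - 1)
    return linear, wrapped


for seg, off in [(0xFFFF, 0x0000), (0xFFFF, 0xFFF0)]:
    linear, wrapped = real_mode_address(seg, off)
    print(f"{seg:04X}:{off:04X} -> linear {linear:06X}, wrapped {wrapped:05X}")
```

    FFFF:0000 gives FFFF0 with no wrap, while FFFF:FFF0 gives 10FFE0, which wraps to 0FFE0 on a 20-bit bus.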

    --
    The Tao of math: The numbers you can count are not the real numbers.