https://www.righto.com/2025/05/intel-386-register-circuitry.html
The groundbreaking Intel 386 processor (1985) was the first 32-bit processor in the x86 architecture. Like most processors, the 386 contains numerous registers; registers are a key part of a processor because they provide storage that is much faster than main memory. The register set of the 386 includes general-purpose registers, index registers, and segment selectors, as well as registers with special functions for memory management and operating system implementation. In this blog post, I look at the silicon die of the 386 and explain how the processor implements its main registers.
It turns out that the circuitry that implements the 386's registers is much more complicated than one would expect. For the 30 registers that I examine, instead of using a standard circuit, the 386 uses six different circuits, each one optimized for the particular characteristics of the register. For some registers, Intel squeezes register cells together to double the storage capacity. Other registers support accesses of 8, 16, or 32 bits at a time. Much of the register file is "triple-ported", allowing two registers to be read simultaneously while a value is written to a third register. Finally, I was surprised to find that registers don't store bits in order: the lower 16 bits of each register are interleaved, while the upper 16 bits are stored linearly.
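To make the 8-, 16-, and 32-bit accesses mentioned above concrete, here is a minimal Python sketch of how the x86 sub-registers (for example AL, AH, and AX) overlay a 32-bit register such as EAX. The class name `Reg32` and its methods are hypothetical and model only the architectural behavior, not the 386's actual register cells or their bit interleaving.

```python
class Reg32:
    """Toy model of a 32-bit x86 general-purpose register such as EAX."""

    def __init__(self, value=0):
        self.value = value & 0xFFFFFFFF

    def _mask_shift(self, width, high_byte):
        # width is 8, 16, or 32; high_byte selects bits 8-15 (e.g. AH)
        if width == 32:
            return 0xFFFFFFFF, 0
        if width == 16:
            return 0xFFFF, 0
        if width == 8:
            return 0xFF, (8 if high_byte else 0)
        raise ValueError("width must be 8, 16, or 32")

    def read(self, width, high_byte=False):
        mask, shift = self._mask_shift(width, high_byte)
        return (self.value >> shift) & mask

    def write(self, width, data, high_byte=False):
        # A narrow write leaves the remaining bits of the register unchanged.
        mask, shift = self._mask_shift(width, high_byte)
        kept = self.value & ~(mask << shift) & 0xFFFFFFFF
        self.value = kept | ((data & mask) << shift)


eax = Reg32(0x12345678)
eax.write(8, 0xAB, high_byte=True)   # comparable to MOV AH, 0xAB
print(hex(eax.value), hex(eax.read(16)), hex(eax.read(8)))
```

The point of the sketch is that a byte or word write must update only part of the stored value, which is one reason a register that supports partial accesses needs more elaborate circuitry than one that is always read and written as a whole.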
(Score: 2, Insightful) by Anonymous Coward on Monday May 05, @03:52AM (9 children)
Love it or hate it, the 80386 changed the world.
(Score: 4, Informative) by ls671 on Monday May 05, @04:08AM (8 children)
Well, for one, it enabled true Unix-like architectures on x86, and the 386 is one of the reasons Linux was born. A 386 was a requirement for Linux.
Everything I write is lies, including this sentence.
(Score: 5, Informative) by bzipitidoo on Monday May 05, @04:27AM (2 children)
The 386 isn't good enough for Linux any more. The kernel dropped support for the 386 some time before kernel version 3.8.6, because semaphores are a total pain to implement on the 386 as it lacks an atomic compare-and-exchange instruction that can test a value and act on the result in one step. The 486 has several suitable instructions, such as CMPXCHG.
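For readers who have not met CMPXCHG, the following is a rough Python model of a compare-and-exchange operation and of a semaphore "try to decrement" loop built on it. This is a sketch under stated assumptions, not the kernel's code: the names `Cell`, `cmpxchg`, and `sem_try_down` are hypothetical, and a `threading.Lock` stands in for the atomicity that the hardware instruction provides.

```python
import threading


class Cell:
    """A memory word with a compare-and-exchange operation (software model)."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def cmpxchg(self, expected, new):
        # If the cell holds `expected`, store `new` and report success;
        # otherwise leave it unchanged. Also return the value that was seen.
        with self._lock:
            seen = self.value
            if seen == expected:
                self.value = new
            return seen == expected, seen


def sem_try_down(sem):
    """Try to decrement a counting semaphore; return True on success."""
    while True:
        count = sem.value
        if count == 0:
            return False                  # nothing available; caller must wait
        ok, _ = sem.cmpxchg(count, count - 1)
        if ok:
            return True                   # the decrement took effect atomically
        # Another thread changed the count in between; retry.


sem = Cell(2)
results = [sem_try_down(sem) for _ in range(3)]
print(results, sem.value)
```

Without an instruction of this kind, the test ("is the count still what I read?") and the update cannot be combined into one indivisible step, so the same effect has to be built from coarser tools.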
(Score: 2) by aafcac on Monday May 05, @09:28PM (1 child)
Well yeah, but so has FreeBSD and probably most other OSes that aren't targeting vintage hardware. Considering that the last Intel 386 chips were produced nearly 20 years ago, there isn't much point in hobbling performance by requiring that compatibility be maintained. And most software intended to be run on that processor has been able to run in emulation for quite a while.
(Score: 4, Insightful) by bzipitidoo on Monday May 05, @11:51PM
There was no good reason for this lack in the 386. No technical barriers. Maybe there were legal barriers in the form of patents, but Intel could have overcome that. After all, they jammed in decimal arithmetic, using a really ugly and inferior method, to avoid patents. The problem was Intel. They wouldn't listen to OS designers. Their first chip intended to support a multitasking OS was the 286, and they botched that part of the design badly. The 386 was an improvement, but it ought to have been more. The 486 is the first one that wasn't downright painful. Why Intel screwed around like that, I don't know, but they got away with it. Should've got it right with the 286.
There was and is still a big lack: virtualization. The x86 architecture was stunningly late in providing decent support for virtualization. We should never have needed software such as VMWare. It wasn't until 2005 that better support for virtualization was added to the architecture, and it still wasn't complete. More improvements have trickled in over the years since. Yet even today PC hardware still can't fully virtualize peripherals.
(Score: 5, Insightful) by Rich on Monday May 05, @10:41AM (4 children)
The 68020 came out a year earlier, AND had proper atomic operations. That, paired with the 68851 MMU, would have worked as well or better (because of the more compiler-friendly register layout).
Motorola dropped the ball when they took too long to get the successors ('030, '040) out, and probably also priced themselves out of the market when Intel was trying to entrench a unified 32-bit DOS market and become a monopolist.
On the '386 and other contemporary Intel designs that Ken Shirriff dissected, I am rather astonished how Intel went the opposite way of "Keep It Simple" and introduced complications everywhere for what might have been marginal gain promises.
(Score: 5, Insightful) by stormreaver on Monday May 05, @11:46AM (3 children)
My first Assembly was on the 6809, which had a beautiful design. When it became apparent in the late 80s/early 90s that Motorola had lost to Intel, I was forced to cave in to market pressures and surrender to Intel. My first thought upon seeing Intel Assembly was that it was a clusterfuck of horridness.
My opinion has not changed with time.
(Score: 5, Interesting) by Snotnose on Monday May 05, @12:44PM (2 children)
My first assembly was the Z-80 (TRS-80), then I went to the 8085. The Z80 was so much easier to program. Then I got to program a 68k and it was great.
It's too bad IBM chose the 8088. Think of how much better things would be if the old 68k architecture had been the baseline instead of the 8086 (which was a royal pain due to segment registers).
Of course I'm against DEI. Donald, Eric, and Ivanka.
(Score: 2) by DannyB on Monday May 05, @02:38PM (1 child)
As a classic Macintosh guy in the 1980s, I found the 68000 to be a pleasure to program. It was simple. Had plenty of two kinds of registers, address and data registers. Nice flat memory model. No segment registers.
I had read somewhere, possibly in BYTE, but I'm not sure, that Intel put in the segment registers . . . wait for it . . .
. . . to maintain source code compatibility with 8080 assembler!
Yes they burdened the whole world for decades with segment registers just for that tiny short term gain.
I seem to now recall Vir wanting Morden's head on a pike as a warning to the next ten generations.
The only way to stop a bad guy with a can opener is a good guy with a can opener.
(Score: 4, Informative) by owl on Monday May 05, @08:04PM
Yeah, the segment registers certainly helped with that, although it was not "source compatibility" that was maintained. The segment registers enabled a "binary translator" that would consume 8080 object code and translate it to run on an 8086. No 'source' needed. [1]
But, also, at the time (1979 release, so design likely occurred from circa 1976 or 1977 to 1979) the computing world had not fully settled on "paged memory management" as the "one true path" for memory management. Segmented memory (although not as broken as Intel's version in the 8086) existed [2] and was a competitor to the "paged systems". And, in general, segmented systems require less complex hardware to implement. So, given the timeframe, the fact that segmentation had not lost out to paging as "the one true way", and the fact that segmentation is less complex to build (this last point is especially important for the 8086), it is not, per se, wrong for Intel's designers to have picked segmentation. The one thing you can fault them on is the rather broken implementation they eventually devised, although that was also likely the result of needing it to be dirt simple and of the 8086 being a 'rush' job as a stand-in when it became clear the iAPX432 was going to be very delayed.
[1] Translation of 8080 Code to 8086 [retrocomputingforum.com]
[2] Segmented memory management [wikipedia.org]
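As a small illustration of the 8086 scheme discussed in this comment, the sketch below computes a real-mode physical address from a segment and an offset. The function name `physical_address` is hypothetical; the arithmetic (16-bit segment shifted left by four bits, plus a 16-bit offset, on a 20-bit address bus) is the standard 8086 rule. Each segment is a 64 KB window, the same size as the 8080's whole address space, which is what makes a translated 8080 program fit inside one segment.

```python
def physical_address(segment, offset):
    """8086 real-mode address: (segment << 4) + offset, wrapped to 20 bits."""
    seg = segment & 0xFFFF
    off = offset & 0xFFFF
    return ((seg << 4) + off) & 0xFFFFF


# Two different segment:offset pairs can name the same byte.
a = physical_address(0x1234, 0x0010)
b = physical_address(0x1235, 0x0000)
print(hex(a), hex(b))

# On the 8086 the sum wraps around at 1 MB.
print(hex(physical_address(0xFFFF, 0x0010)))
```

The aliasing shown by the first two calls is one of the commonly cited awkward points of this design: many segment:offset pairs map to the same physical location.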
(Score: 3, Insightful) by ese002 on Monday May 05, @02:51PM (1 child)
"Triple-ported" is three write ports. That is complicated, unusual, and not found in the 386. Two read + one write, however, is routine for register files. You need this for a typical instruction that reads two registers, compute a value, and write the result back to the register file.
(Score: 4, Informative) by owl on Monday May 05, @08:10PM
You do need it if you want your ALU operations to be single cycle.
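A minimal Python sketch of the single-cycle case, assuming a toy register file with two read ports and one write port. The names `RegisterFile2R1W`, `read_ports`, `write_port`, and `add_single_cycle` are hypothetical and do not describe the 386's actual circuitry.

```python
class RegisterFile2R1W:
    """Toy register file: two read ports and one write port."""

    def __init__(self, count=8):
        self.regs = [0] * count

    def read_ports(self, idx_a, idx_b):
        # Both operands are available at the same time.
        return self.regs[idx_a], self.regs[idx_b]

    def write_port(self, idx, value):
        self.regs[idx] = value & 0xFFFFFFFF


def add_single_cycle(rf, dst, src_a, src_b):
    # One pass through the register file: two reads feed the ALU,
    # and the result goes straight back through the write port.
    a, b = rf.read_ports(src_a, src_b)
    rf.write_port(dst, a + b)


rf = RegisterFile2R1W()
rf.regs[0], rf.regs[1] = 5, 7
add_single_cycle(rf, dst=2, src_a=0, src_b=1)
print(rf.regs[2])
```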
If you are willing to allow your ALU ops to be multicycle, and you add two temporary registers to the ALU's inputs, you can use a single port register file with no issues. Your cycles go:
You'll take at least three cycles for every ALU operation, but you need only a single port register file.
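The multicycle alternative described above can be sketched the same way. This is only an illustration under assumptions: `SinglePortRegisterFile`, `access`, and `add_multicycle` are hypothetical names, and the cycle counter simply counts one register-file access per cycle.

```python
class SinglePortRegisterFile:
    """Toy register file that allows one access (read or write) per cycle."""

    def __init__(self, count=8):
        self.regs = [0] * count
        self.cycles = 0

    def access(self, idx, write_value=None):
        self.cycles += 1                  # every access costs one cycle
        if write_value is None:
            return self.regs[idx]         # read
        self.regs[idx] = write_value & 0xFFFFFFFF
        return None                       # write


def add_multicycle(rf, dst, src_a, src_b):
    temp_a = rf.access(src_a)             # cycle 1: latch first operand
    temp_b = rf.access(src_b)             # cycle 2: latch second operand
    rf.access(dst, temp_a + temp_b)       # cycle 3: write the ALU result


rf1 = SinglePortRegisterFile()
rf1.regs[0], rf1.regs[1] = 5, 7
add_multicycle(rf1, dst=2, src_a=0, src_b=1)
print(rf1.regs[2], rf1.cycles)
```

The two temporaries play the role of the extra registers on the ALU's inputs: they hold the operands while the single port is busy with the next access.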