https://www.righto.com/2020/06/die-shrink-how-intel-scaled-down-8086.html
The revolutionary Intel 8086 microprocessor was introduced 42 years ago this month, so I've been studying its die.1 I came across two 8086 dies with different sizes, which reveal details of how a die shrink works. The concept of a die shrink is that as technology improved, a manufacturer could shrink the silicon die, reducing costs and improving performance. But there's more to it than simply scaling down the whole die. Although the internal circuitry can be directly scaled down,2 external-facing features can't shrink as easily. For instance, the bonding pads need a minimum size so wires can be attached, and the power-distribution traces must be large enough for the current. The result is that Intel scaled the interior of the 8086 without change, but the circuitry and pads around the edge of the chip were redesigned.
[...] The photo above shows the two 8086 dies at the same scale. The two chips have identical layout in the interior,7 although they may look different at first. The chip on the right has many dark lines in the middle that don't appear on the left, but this is an artifact. These lines are the polysilicon layer, underneath the metal; the die on the left has the same wiring, but it is very faint. I think the newer chip has a thinner metal layer, making the polysilicon more visible.
The magnified photo below shows the same circuitry on the two dies. There is an exact correspondence between components in the two images, showing the circuitry was reduced in size, not redesigned. (These photos show the metal layer on top of the chip; some polysilicon is visible in the right photo.)
I have decided to combine this part of the 8086 story because, as the author points out, there is significant overlap with an earlier part that explained the multiplication code. [JR]
https://www.righto.com/2023/04/reverse-engineering-8086-divide-microcode.html
While programmers today take division for granted, most microprocessors in the 1970s could only add and subtract — division required a slow and tedious loop implemented in assembly code. One of the nice features of the Intel 8086 processor (1978) was that it provided machine instructions for integer multiplication and division. Internally, the 8086 still performed a loop, but the loop was implemented in microcode: faster and transparent to the programmer. Even so, division was a slow operation, about 50 times slower than addition.
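The microcoded loop described above is, conceptually, the classic restoring shift-and-subtract algorithm. The sketch below (in C, with illustrative names; it is not Intel's actual microcode) shows the idea for the 8086's 8-bit DIV, which divides a 16-bit dividend by an 8-bit divisor to produce an 8-bit quotient and remainder:

```c
#include <stdint.h>

/* A minimal sketch of restoring shift-and-subtract division, the
   technique the 8086's division microcode loops through internally.
   Function and variable names are my own, for illustration.
   Returns 0 on the cases where the real chip raises a divide fault
   (divide by zero, or a quotient too large for 8 bits). */
int div16by8(uint16_t dividend, uint8_t divisor,
             uint8_t *quotient, uint8_t *remainder)
{
    if (divisor == 0 || (dividend >> 8) >= divisor)
        return 0;                     /* would fault on the real chip */

    uint16_t rem = dividend >> 8;     /* high byte seeds the remainder */
    uint8_t  quo = 0;

    for (int i = 7; i >= 0; i--) {
        /* Shift the next dividend bit into the running remainder. */
        rem = (uint16_t)((rem << 1) | ((dividend >> i) & 1));
        if (rem >= divisor) {         /* trial subtraction succeeds */
            rem -= divisor;
            quo |= (uint8_t)(1 << i); /* set this quotient bit */
        }
    }
    *quotient  = quo;
    *remainder = (uint8_t)rem;
    return 1;
}
```

One iteration per quotient bit, each with a shift, a compare, and a conditional subtract — which is why division costs so many more cycles than a single addition.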
I recently examined multiplication in the 8086, and now it's time to look at the division microcode.1 (There's a lot of overlap with the multiplication post so apologies for any deja vu.)
(Score: 4, Insightful) by owl on Monday April 10, @03:21PM
Yes, the 8086's built-in division instructions took a lot of cycles to complete. But they had the advantage that if one was writing 8086 code, one did not have to find, or write from scratch, a general-purpose division routine. The CPU already 'knew how' to divide, which was a benefit compared to the 8086's other competitors of the era (6502, 6800, Z80, TI 99/4a).
Yes, although the answer was often "it depends". For a general-purpose, 8- or 16-bit, divide-any-number-by-any-number routine (excluding divide by zero, obviously), using the CPU instruction was not always slower (unless one's alternate 'by hand' version used a much more advanced algorithm than the one in the 8086).
For division by known amounts (e.g., one is always dividing by 12), then yes, doing the division by hand, usually with some shifts and adds or subtracts, was often significantly faster. But in that case one does not have a "general purpose division" routine, one has a "special purpose divide by 12" routine. Not quite an apples-to-apples comparison either.
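The "special purpose divide by 12" idea above is usually done by multiplying by a scaled reciprocal instead of dividing. The constant and shift below are my choices for this illustration: 43691 = ceil(2^19 / 12), and the approximation error is small enough that the result is exact for every 16-bit input. On a CPU without a fast multiply, the multiplication itself would in turn be decomposed into shifts and adds.

```c
#include <stdint.h>

/* Hypothetical "divide by 12" routine: multiply by the scaled
   reciprocal 43691 = ceil(2^19 / 12), then shift right 19 bits.
   Exact for all 16-bit unsigned n. */
static inline uint16_t div12(uint16_t n)
{
    return (uint16_t)(((uint32_t)n * 43691u) >> 19);
}
```

A few shifts and one multiply replace a full division loop, but as the comment notes, the routine works only for the one divisor baked into the constant.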
It did not, which is why divisions by powers of 2, which are simply shifts, were much faster to code as shifts by hand. Had the microcode checked for division by a power of two, and substituted shifts for the full divide algorithm, the CPU would have been much faster at processing its divide instructions for those divisors. And since dividing by 2 and 4 is common, it would have been a valuable optimization.
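The power-of-two shortcut described above is a single shift for unsigned values. For signed values there is a subtlety: an arithmetic right shift rounds toward negative infinity, while the 8086's IDIV rounds toward zero, so a small correction is needed. A sketch of both cases (function names are mine; the signed version assumes the compiler implements `>>` on negative values as an arithmetic shift, which mainstream compilers do):

```c
#include <stdint.h>

/* Unsigned divide by 2^k: just a shift. */
static inline uint16_t udiv_pow2(uint16_t n, unsigned k)
{
    return (uint16_t)(n >> k);        /* n / (1 << k) */
}

/* Signed divide by 2^k, rounding toward zero like IDIV does:
   add (2^k - 1) to negative dividends before shifting. */
static inline int16_t sdiv_pow2(int16_t n, unsigned k)
{
    int bias = (n >> 15) & ((1 << k) - 1);   /* 2^k - 1 if n < 0 */
    return (int16_t)((n + bias) >> k);
}
```

Either version runs in a handful of cycles, versus the roughly hundred-cycle cost of the general divide instruction — which is the optimization the comment wishes the microcode had made.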