
SoylentNews is people

posted by janrinok on Friday March 24, @06:33PM

http://www.righto.com/2023/03/8086-multiplication-microcode.html

While programmers today take multiplication for granted, most microprocessors in the 1970s could only add and subtract; multiplication required a slow and tedious loop implemented in assembly code. One of the nice features of the Intel 8086 processor (1978) was that it provided machine instructions for multiplication, able to multiply 8-bit or 16-bit numbers with a single instruction. Internally, the 8086 still performed a loop, but the loop was implemented in microcode: faster and transparent to the programmer. Even so, multiplication was a slow operation, about 24 to 30 times slower than addition.

In this blog post, I explain the multiplication process inside the 8086, analyze the microcode that it used, and discuss the hardware circuitry that helped it out. My analysis is based on reverse-engineering the 8086 from die photos. The die photo below shows the chip under a microscope. I've labeled the key functional blocks; the ones that are important to this post are darker. At the left, the ALU (Arithmetic/Logic Unit) performs the arithmetic operations at the heart of multiplication: addition and shifts. Multiplication also uses a few other hardware features: the X register, the F1 flag, and a loop counter.
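
To make "addition and shifts" concrete, here is a minimal sketch of the generic textbook shift-and-add loop for unsigned multiplication. It is not the 8086's actual microcode and does not model the X register, the F1 flag, or the hardware loop counter; the function name `shift_add_multiply` and its `bits` parameter are hypothetical, chosen only for illustration.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, bits: int = 16) -> int:
    """Multiply two unsigned integers using only addition and shifts.

    Generic textbook algorithm, NOT a transcription of the 8086 microcode.
    """
    mask = (1 << bits) - 1
    multiplicand &= mask  # treat both inputs as unsigned values of width `bits`
    multiplier &= mask

    product = 0
    for _ in range(bits):          # one iteration per multiplier bit
        if multiplier & 1:         # low bit set: add the shifted multiplicand
            product += multiplicand
        multiplicand <<= 1         # next bit position is worth twice as much
        multiplier >>= 1           # move on to the next multiplier bit
    return product                 # up to 2 * bits wide


if __name__ == "__main__":
    print(shift_add_multiply(1234, 5678))
```

The loop always runs once per multiplier bit, which is why a 16-bit multiply done this way costs many times more than a single addition.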



 
This discussion was created by janrinok (52) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 3, Interesting) by guest reader (26132) on Friday March 24, @09:46PM (#1298053)

    It seems that multiplication is now as fast as addition, or even faster (in real arithmetic).

    From the "Later advances in multiplication" section of the article:

    - The 8086 took up to 133 clock cycles to multiply unsigned 16-bit values.
    - By 1982, the Intel 286 processor cut this time down to 21 clock cycles.
    - The Intel 486 (1989) used an improved algorithm that could end early, so multiplying by a small number could take just 9 cycles.
    - The Cyrix Cx486SLC (1992) had a 16-bit hardware multiplier that cut word multiply down to 3 cycles.
    - The Intel Core 2 (2006) was even faster, able to complete a 32-bit multiplication every clock cycle.

  • (Score: 3, Interesting) by owl (15206) on Saturday March 25, @01:57AM (#1298081)

    "The Intel Core 2 (2006) was even faster, able to complete a 32-bit multiplication every clock cycle."

    That was brought about by the much larger transistor budget available. The hardware multiplier circuit alone in a Core 2 CPU likely has more transistors than the total transistor count quoted for the 8086 (29,000 transistors).

    Lots of performance can often be found, provided there are enough transistors available to build the necessary hardware.