posted by NCommander on Tuesday August 30 2016, @12:14PM
from the int-21h-is-how-cool-kids-did-it dept.

I've made no secret that I'd like to bring original content to SoylentNews, and recently polled the community on their feelings about crowdfunding articles. The overall response was somewhat lukewarm, mostly over how the money would be divided and how authors would be paid. Taking that into account, I decided to write a series of articles for SN in an attempt to drive more subscriptions and readers to the site, and to scratch a personal itch on doing a retro-computing project. The question then became: what to write?

As part of a conversation on IRC, part of me wondered what a modern-day keylogger would have looked like running on DOS. In the world of 2016, it's no secret that various three-letter agencies engage in mass surveillance and cyberwarfare. A keylogger would be part of any basic set of attack tools. The question is what a potential attack tool would have looked like if it had been written during the 1980s. Back in the 1980s, the world was a very different place from both a networking and a programming perspective.

For example, in 1988 (the year I was born), the IBM PC/XT and AT would have been relatively common fixtures, and the PS/2 had only recently been released. Most of the personal computing market ran some version of DOS, and networking (which was rare) frequently took the form of Token Ring or ARCNet equipment. Further up the stack, TCP/IP competed with IPX, NetBIOS, and several other protocols for dominance. From the programming side, coding for DOS is very different from coding for any modern platform, as you had to deal with Intel's segmented architecture and interact directly with both the BIOS and the hardware. As such, it's an interesting look at how technology has evolved since.

Now obviously, I don't want to release a ready-made attack tool to be abused by the masses, especially since DOS is still frequently used in embedded and industrial roles. As such, I'm going to target a non-IP-based protocol for logging, both to explore these technologies and to make the tool as useless as possible. To the extent possible, I will try to keep everything accessible to non-programmers, but this isn't intended as a tutorial for real mode programming. As such, I'm not going to go super in-depth in places, but will try to link relevant information. If anyone is confused, post a comment, and I'll answer questions or edit these articles as they go live.

More past the break ...

Looking At Our Target

Back in 1984, IBM released the Personal Computer/AT, which can be seen as the common ancestor of all modern PCs. Clone manufacturers copied the basic hardware and software interfaces of the AT, and created the concept of PC-compatible software. Due to the sheer proliferation of both the AT and its clones, these interfaces became a de facto standard which continues to this very day. As such, well-written software for the AT can generally be run on modern PCs with a minimum of hassle, and it is entirely possible to run ancient versions of DOS and OS/2 on modern hardware due to backwards compatibility.

A typical business PC of the era likely looked something like this:

  • An Intel 8088/8086 or 80286 processor running at roughly 4.77-8 MHz
  • 256 kilobytes to 1 megabyte of RAM
  • 5-20 MB HDD + 5.25-inch floppy disk drive
  • Operating System: DOS 3.x or OS/2 1.x
  • Network: Token Ring connected to a NetWare server, or OS/2 LAN Manager
  • Cost: ~$6000 USD in 1987

To put that in perspective, many of today's microcontrollers have on-par or better specifications than the original PC/AT. From a programming perspective, even taking into account resource limitations, coding for the PC/AT is drastically different from many modern systems due to the segmented memory model used by the 8086 and 80286. Before we dive into the nitty-gritty of a basic 'Hello World' program, we need to take a closer look at the programming model and memory architecture used by the 8086 which was a 16-bit processor.

Real Mode Programming

If the AT is the common ancestor of all PC-compatibles, then the Intel 8086 is the processor equivalent. The 8086 was a 16-bit processor that operated at a top clock speed of 10 MHz, had a 20-bit address bus that supported up to 1 megabyte of RAM, and provided fourteen registers. Registers are essentially very fast storage locations physically located within the processor that are used to perform various operations. Four registers (AX, BX, CX, and DX) are general purpose, meaning they can be used for any operation. Eight (described below) are dedicated to working with segments and offsets, and the final two are the processor's current instruction pointer (IP) and state (FLAGS).

An important point in understanding the differences between modern programming environments and those used by early PCs deals with the difference between 16-bit and 32/64-bit programming. At the most fundamental level, the number of bits a processor has refers to the size of the numbers (or integers) it works with internally. As such, the largest possible unsigned number a 16-bit processor can directly work with is 2 to the power of 16, minus 1, or 65,535. As the name suggests, 32-bit processors work with larger numbers, with the maximum being 4,294,967,295. Thus, a 16-bit processor can only reference up to 64 KiB of memory at a given time, while a 32-bit processor can reference up to 4 GiB, and a 64-bit processor can in principle reference up to 16 exbibytes of memory directly.
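To make the bit-width limits concrete, here is a minimal Python sketch. Python is used purely for illustration, and the helper names max_unsigned and addressable_bytes are made up for this example. It computes the largest unsigned value and the number of distinct byte addresses for a given width, including the 8086's 20-bit address bus.

```python
def max_unsigned(bits):
    """Largest unsigned integer that fits in `bits` bits (2**bits - 1)."""
    return (1 << bits) - 1


def addressable_bytes(bits):
    """Number of distinct byte addresses a `bits`-wide address can name."""
    return 1 << bits


if __name__ == "__main__":
    # 16 = 8086 registers, 20 = 8086 address bus, 32 and 64 = later processors
    for bits in (16, 20, 32, 64):
        print(f"{bits:2d}-bit: max value {max_unsigned(bits):,}, "
              f"addressable {addressable_bytes(bits):,} bytes")
```

The 20-bit row is the interesting one: it is larger than anything a single 16-bit register can hold, which is the gap the segmented memory model has to bridge.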

At this point, you may be asking yourself, "If a 16-bit processor could only work with 64 KiB of RAM directly, how did the 8086 support up to 1 megabyte?" The answer comes from the segmented memory model. Instead of directly referencing a location in RAM, addresses were divided into two 16-bit parts, the segment (or selector) and the offset. Segments are 64-kilobyte sections of RAM. They can generally be considered the computing equivalent of a postal code, telling the processor where to look for data. The offset then tells the processor where exactly within that segment the data it wants is located. On the 8086, the segment value was shifted left by four bits (multiplied by 16), and the offset was then added to it to create a 20-bit address, giving 1 megabyte of addressable memory. Segments and offsets are held by the processor in special registers; in short, you had the following:

  • Segments
    • CS: Code segment - Application code
    • DS: Data segment - Application data
    • SS: Stack segment - Stack (or working space) location
    • ES: Extra segment - Programmer defined 'spare' segment
  • Offsets
    • SI - Source Index
    • DI - Destination Index
    • BP - Base pointer
    • SP - Stack pointer

As such, memory addresses on the 8086 were written in the form segment:offset. For example, the memory address 0x000FFFFF could be written as F000:FFFF. As a consequence, multiple segment:offset pairs could refer to the same byte of memory; the addresses F555:AAAF, F000:FFFF, and F800:7FFF all refer to the same location. The segmentation model also had important performance and operational characteristics to consider.
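The aliasing described above is easy to check by hand or with a few lines of code. The following Python sketch (illustrative only; linear_address and parse_pair are hypothetical helper names) applies the 8086 rule of shifting the segment left by four bits and adding the offset, then keeps the low 20 bits because the 8086 has only 20 address lines.

```python
def linear_address(segment, offset):
    """Real-mode translation on the 8086: (segment * 16 + offset), 20 bits wide."""
    return ((segment << 4) + offset) & 0xFFFFF


def parse_pair(pair):
    """Split a 'SSSS:OOOO' hex string into (segment, offset) integers."""
    seg_text, off_text = pair.split(":")
    return int(seg_text, 16), int(off_text, 16)


if __name__ == "__main__":
    for pair in ("F000:FFFF", "F555:AAAF", "F800:7FFF"):
        segment, offset = parse_pair(pair)
        print(f"{pair} -> 0x{linear_address(segment, offset):05X}")
```

All three pairs from the text come out as the same linear address, the last byte of the 1 megabyte address space.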

The most important was that since data could be within the same segment or in a different segment, you had two different types of pointers to work with. Near pointers (which are just the 16-bit offset) deal with data within the same segment, and are very fast as no state information has to be changed to reference them. Far pointers point to data in a different segment and require multiple operations to work with: you not only have to load and store the two 16-bit components, you also have to change the segment registers to the correct values. In practice, that meant far pointers were extremely costly in terms of execution time. The performance hit was bad enough that it eventually led to one of the greatest (or worst) backward compatibility hacks of all time: the A20 gate, something which I could write a whole article on.
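A toy model can show where the extra work for far pointers comes from. The Python sketch below is a deliberate simplification, not a description of real hardware timing: ToyRealMode, its fields, and the segment_loads counter are all invented for this illustration. It assumes far accesses reload the extra segment register (ES) every time, while near accesses reuse whatever is already in DS.

```python
class ToyRealMode:
    """Toy model: 1 MiB of RAM plus two segment registers (DS and ES)."""

    def __init__(self):
        self.ram = bytearray(1 << 20)
        self.ds = 0               # data segment used by near pointers
        self.es = 0               # extra segment, reloaded for each far access
        self.segment_loads = 0    # how many times a segment register was reloaded

    @staticmethod
    def linear(segment, offset):
        """Segment shifted left four bits plus a 16-bit offset, 20 bits wide."""
        return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def read_near(self, offset):
        """Near pointer: only an offset, interpreted against the current DS."""
        return self.ram[self.linear(self.ds, offset)]

    def read_far(self, segment, offset):
        """Far pointer: load the segment register first, then use the offset."""
        self.es = segment & 0xFFFF
        self.segment_loads += 1
        return self.ram[self.linear(self.es, offset)]


if __name__ == "__main__":
    machine = ToyRealMode()
    machine.ds = 0x1000
    machine.ram[ToyRealMode.linear(0x1000, 0x0010)] = 0x41
    machine.ram[ToyRealMode.linear(0x2000, 0x0010)] = 0x42
    print(hex(machine.read_near(0x0010)), "segment loads:", machine.segment_loads)
    print(hex(machine.read_far(0x2000, 0x0010)), "segment loads:", machine.segment_loads)
```

The same 16-bit offset reaches different bytes depending on which segment is in play, and only the far access touches a segment register.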

The segmented memory model also meant that any high-level programming language had to incorporate lower-level programming details. For example, while C compilers were available for the 8086 (in the form of Microsoft C), the C programming language had to be modified to work with the memory model. Instead of just having the standard C pointer types, you had to deal with near and far pointers, and with the layout of data and code within segments, to make the whole thing work. This meant that coding for pre-80386 processors required code specifically written for the 8086 and the 80286.

Furthermore, most of the functionality provided by the BIOS and DOS was only available in the form of interrupts. Interrupts are special signals that tell the processor something needs immediate attention; for example, pressing a key on the keyboard generates an IRQ 1 interrupt to let DOS and applications know something happened. Interrupts can be generated in software (the 'int' instruction) or hardware. As interrupt handling can generally only be done in raw assembly, many DOS apps of the era were written (in whole or in part) in Intel assembly. This brings us to our next topic: the DOS programming model.
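For readers wondering how the processor finds the code for a given interrupt: in real mode it looks the number up in the interrupt vector table, 256 four-byte entries starting at address 0, each holding a far pointer stored as offset then segment. The Python sketch below illustrates that lookup and the usual PC/AT mapping from hardware IRQ lines to interrupt numbers. The function names and the handler address 0ABC:1234 are made up for the example.

```python
import struct


def vector_address(int_no):
    """Linear address of the vector table entry for interrupt `int_no`."""
    return int_no * 4


def read_vector(ram, int_no):
    """Return the handler's (segment, offset); stored little-endian, offset first."""
    offset, segment = struct.unpack_from("<HH", ram, vector_address(int_no))
    return segment, offset


def irq_to_int(irq):
    """PC/AT convention: IRQ 0-7 -> INT 0x08-0x0F, IRQ 8-15 -> INT 0x70-0x77."""
    if 0 <= irq <= 7:
        return 0x08 + irq
    if 8 <= irq <= 15:
        return 0x70 + (irq - 8)
    raise ValueError("IRQ must be in the range 0-15")


# First 1 KiB of RAM holds the table; install a hypothetical handler for INT 0x21.
ram = bytearray(1024)
struct.pack_into("<HH", ram, vector_address(0x21), 0x1234, 0x0ABC)

if __name__ == "__main__":
    segment, offset = read_vector(ram, 0x21)
    print(f"INT 0x21 vector lives at 0x{vector_address(0x21):04X} "
          f"and points to {segment:04X}:{offset:04X}")
    print(f"Keyboard IRQ 1 arrives as INT 0x{irq_to_int(1):02X}")
```

Because the table is ordinary RAM with no protection, any program can overwrite an entry, which is what "hooking" an interrupt means.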

Disassembling 'Hello World'

Before digging more into the subject, let's look at the traditional 'Hello World' program written for DOS. All code posted here is assembled with NASM.

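; Hello.asm - Hello World
; Assemble with: nasm -f bin Hello.asm -o HELLO.COM

org 0x100                ; COM files are loaded at offset 0x100

section .text

_entry:
 mov ah, 9               ; DOS function 9: print '$'-terminated string
 mov dx, str_hello       ; near pointer to the string
 int 0x21                ; call DOS
 ret                     ; return to DOS

section .data
str_hello: db "Hello World",'$'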

Pretty, right? Even those familiar with 32-bit x86 assembly programming may not be able to understand at first glance what this does. To prevent this from getting too long, I'm going to gloss over the specifics of how DOS loads programs, and simply explain what this does. For non-programmers, this may be confusing, but I'll try to explain it below.

The first part of the file has the code segment (marked 'section .text' in NASM) and our program's entry point. With COM files such as this, execution begins at the top of the file. As such, _entry is where we enter the program. We immediately execute two 'mov' instructions to load the value 9 into the top half of AX (AH), and a near pointer to our string into DX. Ignore the 9 for now; we'll get to it in a moment. Afterwards, we trip an interrupt, with the number in hex (0x21) after it being the interrupt we want to trip. DOS's functions are exposed as interrupts 0x20 to 0x2F; 0x21 is roughly equivalent to stdio in C. Interrupt 0x21 uses the value in AH to determine which subfunction we want, in this case 9, which writes a string to the console. DOS expects DX to point to a string terminated with '$'; it does not use null-terminated strings as you might expect. After we return from the interrupt, we simply exit the program by executing ret.
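For non-programmers, it may help to see the same steps acted out in a high-level language. The Python sketch below is an illustration, not an emulator: the machine-code bytes are hand-assembled for this example (an assembler's exact layout may differ), and dos_print_string is a made-up stand-in for what DOS does when AH is 9. It places the program at offset 0x100 of a 64 KiB segment, pulls the string pointer out of the 'mov dx' instruction, and reads characters until it hits '$'.

```python
import struct

COM_LOAD_OFFSET = 0x100  # DOS loads a COM image at offset 0x100 of its segment

# Hand-assembled version of the program described above (assumed layout).
code = bytes([
    0xB4, 0x09,          # mov ah, 9
    0xBA, 0x08, 0x01,    # mov dx, 0x0108 (where the string ends up once loaded)
    0xCD, 0x21,          # int 0x21
    0xC3,                # ret
])
image = code + b"Hello World$"

# Model the program's single 64 KiB segment and "load" the image into it.
segment = bytearray(0x10000)
segment[COM_LOAD_OFFSET:COM_LOAD_OFFSET + len(image)] = image


def dos_print_string(memory, dx):
    """Stand-in for INT 0x21, AH=9: return the text from `dx` up to the '$'."""
    end = memory.index(b"$", dx)
    return memory[dx:end].decode("ascii")


ax = 0x0900                      # AX after 'mov ah, 9' (AL assumed to be 0)
ah = (ax >> 8) & 0xFF            # AH is the top half of AX
(dx,) = struct.unpack_from("<H", segment, COM_LOAD_OFFSET + 3)  # operand of 'mov dx'

if __name__ == "__main__":
    if ah == 9:
        print(dos_print_string(segment, dx))
```

Note that the pointer in DX is 0x0108 rather than 8: the 'org 0x100' line exists precisely so the assembler accounts for where DOS loads the file.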

Under DOS, there is no standard library with nicely named functions to help you out of the box (though many compilers, such as Watcom C, did ship with one). Instead, you have to load values into registers, and call the correct interrupt to make anything happen. Fortunately, lists of known interrupts are available to make the process less painful. Furthermore, DOS only provides filesystem and network operations. For anything else, you need to talk to the BIOS or hardware directly. The best way to think of DOS from a programming perspective is as essentially an extension of the basic input/output functionality that IBM provided in ROM, rather than as a full operating system.

We'll dig more into the specifics in future articles, but the takeaway here is that if you want to do anything in DOS, interrupts and reference tables are the only way to do so.


In this introductory article, we looked at the basics of how 16-bit real mode programming works and at the DOS programming model. While something of a dry read, it's a necessary foundation for understanding the basic building blocks of what is to come. In the next article, we'll look more at the DOS API and terminate-and-stay-resident programs, as well as hooking interrupts.

This discussion has been archived. No new comments can be posted.
  • (Score: 3, Interesting) by LoRdTAW (3755) on Tuesday August 30 2016, @01:59PM (#395273)

    I am going to assume it would work like this:
    - Hijack interrupt 09, the keyboard interrupt.
    - dump a copy of the keystrokes into a buffer.
    - when the buffer is full call some network code to transmit the buffer.
    - Most likely the network adapter is NE2000. The protocol is what, IPX? Or raw Ethernet packets?

    Time to nerd out a bit...
    i86 assembler in DOS along with building an 8088 board in college were probably the most useful courses I ever took. The board they gave us was awful though. The original was hamstrung with only a single set of 8 dip switches and 8 LED's for I/O. The only memory outside of the four CPU registers was a 2k eprom. That's right, eprom. Messed up your program because you forgot to put the fucking code at 0xFFFF0? (reset vector, the first address the CPU looks for instructions after a reset or cold boot) Well then! Put that little bastard in the UV eraser. For 20 minutes. I used to smoke back then so I'd go out for a smoke and a short walk while cursing a little. The code was assembled in DOS's debug in a VMware DOS VM. Then the binary machine code was hand entered into a windows based USB EPROM burner. Fucking painful only begins to describe programming this thing. I'd say half of the class's time was wasted trying to get the device programmed.

    I both loved the board and hated it. About half way through the course after the board was built and I started programming, I began to redesign the board in KiCAD. I kept the 5x6 inch footprint and greatly improved on the peripherals including providing a standard 16 pin header for a 20x2 LCD and an I/O expansion port. I kept the dip switches and 8 bit bar graph LED's, positioned the LCD right above them, and added a 40 pin header to break out the address bus, data bus, interrupt line, I/O control lines, and clock signals. I also expanded the chip select ability for the I/O ports using a demux. The best part was I added a ZIF socket and upgraded to flash memory. I can't remember but I think a 28 pin flash chip has the same pin-out if it was an 8, 16 or 32kB part. So I added a jumper to select the ROM size if you wanted to go beyond the 8kB default. This was my first PCB design attempt. Ever. Using just two sides for traces (cheated routing by using the then available free router). Awesome learning experience that I started on my own in parallel with the course. I even kept the BOM cost within the course's original $50 student material fee. I had a prototype board fabricated by Futurlec (slow, hard to communicate with, okay quality but CHEAP!). Turns out I made a footprint assignment mistake which halted the build and I stopped at building the clock generator circuit. By then it was half way through the summer after class had already ended :(

    My next idea was to build an expansion board with dual SRAM and flash. There were two SRAM sockets each hard wired for 8kb SRAM chips for 16KB total. One of the SRAM sockets had a battery backup controller from maxim which could be omitted. Two jumper switched flash sockets with a write circuit I rigged up to send 12V to the write enable pin which put the chip into write mode. I forgot how I mapped those flash chips into address space as I think I was trying to keep everything in one 64kB segment. This board I think made it to PCB layout but not sure if I finalized it.

    The I/O boards were stackable and my next board was an I/O board that could sit atop the memory board. It contained an 8bit R-2R DAC with a 0-5V op-amp buffered output, a 0-5/10V successive approximation ADC using the system clock, 8 bit PWM output, and 8 opto-isolated inputs and outputs. I also thought of mixing relays with the outputs as well. I designed everything and it was to be built using dip chips. A more basic version of this board, with only non-isolated digital I/O and the ADC and PWM circuits, made it to PCB layout. The final I/O packed version never made it past the paper sketches.

    My original plans were to get the department to switch to my board design, and use a new assembler tool chain along with a basic C compiler (BCC, Bruce's C compiler) with a bare bones non-standard C library to provide basic functionality without RAM such as on board I/O reading/writing, and an LCD printf. I had a function which could for example take an 8 bit number and display its binary representation on the LED bar graph. Same with reading the dip switches. I also began writing the LCD printf called printlcd(const char *format); not variadic of course. That was in the plan though. I even began work on the tool chain by simulating the hardware in a program called emu8086. I wrote some of the C library and simulated it with success. Had a basic read dip switches and write to the LED loop and a very basic LCD print function working.

    Beyond those basic building blocks I also planned on writing a command interpreter and adding a com port to hook to a terminal emulator. The idea was to let students pop in a pre-flashed firmware chip, have it boot and display diagnostics on the LCD, and if you connected it to a terminal emulator, get a basic command prompt with the ability to upload a program into RAM and execute it. The idea was to use base64 encoding and you could load that code into the battery backed SRAM and use the volatile SRAM for variables, buffers and such. Since we got to keep the boards, the idea was to make it more interactive and even useful to the student after class. So making it more arduino/PLC/early 8bit PC like was certainly a much better idea. The course didn't have to go beyond the basic programming stuff and the LCD along with pre-built I/O boards could be kept in the lab and given to students on an as needed basis. The board could also have been used in the crappy embedded c++ course I took that sucked. Half of the semester was spent in front of visual studio writing bad c++ code (some half assed c with classes approach) and the second half was spent in front of a goofy 8085 board that was programmed in c and assembler. I could have had the students sitting in front of the SAME 8088 board from their CPU class, on a Linux machine or VM and immediately building code and seeing results. They could use prebuilt boards or bring in their own board depending on how they scheduled that course.

    Unfortunately ADD/Anxiety got the better of me after the board design flaw, stopped going to college after that semester and I moved shortly after. The damn thing is still in a box somewhere. I keep telling myself I'm going to fucking finish that thing one day. The sad part is I showed the final circuit design as well as the 3D rendered output of the PCB to my professor during the last week of class and he was floored. He loved the idea and how I even went as far as keeping the BOM cost just below the student fee. Said I should take it right to the head of the tech department. I never did. Story of my life.

  • (Score: 3, Informative) by NCommander (2) on Tuesday August 30 2016, @02:31PM (#395292)

    Well, I haven't decided where I want to hook just yet. The keyboard controller itself is IRQ 1, but the BIOS exposes it as 9 for sanity reasons in the IVT. If I hook IRQ 1, I can catch absolutely everything, but have to deal with scancodes and then feed it back into BIOS. That makes it harder to detect, but much harder to code. Hooking 9, which puts me between BIOS and DOS, is saner, since I don't have to do scan code -> ASCII mapping, and I can't remember offhand if I can read a scan code twice (i.e., if I read it in the IRQ 1 ISR, can it be read further down the line).

    My rough plan of attack is to allocate a small static buffer which the ISR pushes the value in and then chains, then hook either network I/O operations or the DOS IDLE task. To avoid the DOS re-entrant problem, I was going to clobber the IRET value and kick out of the ISR. The upload code would then operate as a normal DOS app, send the static buffer, call back into the TSR to say it's done, and restore the original exception vector. My concern is if I try to do network I/O operations in an ISR, I'm going to break something.

    Still always moving
    • (Score: 1) by tekk (5704) on Tuesday August 30 2016, @03:41PM (#395306)

      My gut feeling is that you probably can't read twice, but what do you want to bet there's some way of pushing it back on. I'm not an expert in how this works but given how there's no memory protection couldn't you just proxy the DOS routine too? Change that code pointer to point to you and act like the DOS routine to the outside world while doing your stuff.

      • (Score: 0) by Anonymous Coward on Tuesday August 30 2016, @10:26PM (#395476)

        couldn't you just proxy the DOS routine too? Change that code pointer to point to you and act like the DOS routine to the outside world while doing your stuff.

        That is the polite way to handle an interrupt when you program for DOS, but you must call the old routine only if the interrupt flask is unset on that interrupt at the time the handler is installed.

        • (Score: 2) by Post-Nihilist (5672) on Tuesday August 30 2016, @11:13PM (#395488)

          interrupt flask

          sure sign that I need a drink.... I meant mask

          Be like us, be different, be a nihilist!!!
    • (Score: 2) by Post-Nihilist (5672) on Tuesday August 30 2016, @10:11PM (#395473)

      On an AT PC, the keyboard is on IRQ2 (starting from 1) and IRQ2 calls int 9.
      IRQ1 is the tick timer, usually called at a precise 55ms interval. It is mapped on int 8.
      IRQ3 is chained to another PIC controller.
      int 1 is the Single Step interrupt, when the trap flag is set, it is called back after each instruction executed.

      Be like us, be different, be a nihilist!!!
  • (Score: 2) by tibman (134) on Tuesday August 30 2016, @02:34PM (#395293)

    You might enjoy taking another look at it. Prices for old dip stuff are pretty reasonable. You can buy parts in single quantities and casually work on breadboarding it out. Since you mentioned Arduino, you are probably still doing electronics? As a hobby or professionally?
    Ram example: []

    I'm breadboarding an MC68000 based computer. Terrible at electronics but love programming : )

    SN won't survive on lurkers alone. Write comments.
    • (Score: 2) by LoRdTAW (3755) on Tuesday August 30 2016, @04:59PM (#395344)

      I'm working with this type of stuff at work. I have really taken a liking to industrial automation as I have been working with a lot of CNC stuff and PLC/PAC's. In fact, I'm rebuilding a whole CNC Laser system we got from a customer who was going to toss the whole system into the dumpster. Seriously, a whole Aerotech A3200 three axis CNC system with the XYZ stage and a 500W JK701 NdYAG laser. I am working with our machinist to make it a dedicated workstation for a customer who is sending in a massive order to weld fuel pumps. I have both learned and taught myself a whole lot working here. And in addition to lasers and CNC systems, I have also learned a lot about electron beam welders, their high voltage systems, electron gun design and even high vacuum systems.

      One of my biggest influences in my CPU board design and the multi I/O card was a sort of amalgamation of this newfound love of from scratch computer building, industrial automation, and the simplicity of the arduino/AVR and other micro controllers as well as this guy: [] (unfortunately a lot of the picture links are broken). The dude builds everything himself. He even built a small plastic injection molding machine complete with a from scratch built M68k controller. I'll say this, as much as I love the idea of feature packed micros, there is something so absolutely satisfying about building a computer which then controls something you made. That's why my I/O board Idea was based on PLC I/O and the API was to be Arduino like. You could take that board home and make it control your furnace or blinds or whatever else you wanted. Hell, I even went as far as thinking the serial port could also do modbus.

      I'm breadboarding an MC68000 based computer. Terrible at electronics but love programming : )

      I'm the opposite. Good with electronics, mediocre at programming. But that is simply because I don't get enough time to do any actual programming. The M68k is a CPU I never worked with but have really wanted to work with for a while.

  • (Score: 0) by Anonymous Coward on Tuesday September 20 2016, @06:41PM (#404402)

    God, I hate it when idiots call assembly language, "assembler." It's like calling C, "compiler."

  • (Score: 1, Funny) by Anonymous Coward on Wednesday September 21 2016, @01:38AM (#404614)

    "8088 in college" "redo in kicad". We had 8085's, Z80's, toggle switches, pencil/paper and were damned happy about it!

    Now, get off MY lawn.