posted by NCommander on Tuesday August 30 2016, @12:14PM
from the int-21h-is-how-cool-kids-did-it dept.

I've made no secret that I'd like to bring original content to SoylentNews, and recently polled the community on their feelings about crowdfunding articles. The overall response was somewhat lukewarm, mostly over how the money would be divided and how authors would be paid. Taking that into account, I decided to write a series of articles for SN in an attempt to drive more subscriptions and readers to the site, and to scratch a personal itch by doing a retro-computing project. The question then became: what to write?

As part of a conversation on IRC, I wondered what a modern-day keylogger would have looked like running on DOS. In the world of 2016, it's no secret that various three-letter agencies engage in mass surveillance and cyberwarfare, and a keylogger would be part of any basic set of attack tools. The question is what such a tool would have looked like had it been written during the 1980s. Back then, the world was a very different place from both a networking and a programming perspective.

For example, in 1988 (the year I was born), the IBM PC/XT and AT would have been relatively common fixtures, and the PS/2 had only recently been released. Most of the personal computing market ran some version of DOS, and networking (which was rare) frequently took the form of Token Ring or ARCNet equipment. Further up the stack, TCP/IP competed with IPX, NetBIOS, and several other protocols for dominance. From the programming side, coding for DOS is very different from coding for any modern platform, as you had to deal with Intel's segmented architecture and interact directly with both the BIOS and the hardware. As such, it's an interesting look at how technology has evolved since.

Now obviously, I don't want to release a ready-made attack tool to be abused by the masses, especially since DOS is still frequently used in embedded and industrial roles. As such, I'm going to target a non-IP-based protocol for logging, both to explore these technologies and to make the result as useless as possible as a real attack tool. To the extent possible, I will try to keep everything accessible to non-programmers, but this isn't intended as a tutorial for real mode programming. As such, I'm not going to go super in-depth in places, but will try to link relevant information. If anyone is confused, post a comment, and I'll answer questions or edit these articles as they go live.

More past the break ...

Looking At Our Target

Back in 1984, IBM released the Personal Computer/AT which can be seen as the common ancestor of all modern PCs. Clone manufacturers copied the basic hardware and software interfaces which made the AT, and created the concept of PC-compatible software. Due to the sheer proliferation of both the AT and its clones, these interfaces became a de-facto standard which continues to this very day. As such, well-written software for the AT can generally be run on modern PCs with a minimum of hassle, and it is completely possible to run ancient versions of DOS and OS/2 on modern hardware due to backwards compatibility.

A typical business PC of the era likely looked something like this:

  • An Intel 8086 or 80286 processor running at 4-6 MHz
  • 256 kilobytes to 1 megabyte of RAM
  • 5-20 MiB HDD + 5.25" floppy disk drive
  • Operating System: DOS 3.x or OS/2 1.x
  • Network: Token Ring connected to a NetWare server, or OS/2 LAN Manager
  • Cost: ~$6000 USD in 1987

To put that in perspective, many of today's microcontrollers have on-par or better specifications than the original PC/AT. From a programming perspective, even taking into account resource limitations, coding for the PC/AT is drastically different from many modern systems due to the segmented memory model used by the 8086 and 80286. Before we dive into the nitty-gritty of a basic 'Hello World' program, we need to take a closer look at the programming model and memory architecture used by the 8086 which was a 16-bit processor.

Real Mode Programming

If the AT is the common ancestor of all PC-compatibles, then the Intel 8086 is the processor equivalent. The 8086 was a 16-bit processor that operated at a top clock speed of 10 MHz, had a 20-bit address bus that supported up to 1 megabyte of RAM, and provided fourteen registers. Registers are essentially very fast storage locations physically located within the processor that are used to perform various operations. Four registers (AX, BX, CX, and DX) are general purpose, meaning they can be used for any operation. Eight (described below) are dedicated to working with segments and offsets, and the final two are the processor's current instruction pointer (IP) and state (FLAGS).

An important point in understanding the differences between modern programming environments and those used by early PCs deals with the difference between 16-bit and 32/64-bit programming. At the most fundamental level, the number of bits a processor has refers to the size of the numbers (or integers) it works with internally. As such, the largest possible unsigned number a 16-bit processor can directly work with is 2 to the power of 16 minus 1, or 65,535. As the name suggests, 32-bit processors work with larger numbers, with the maximum being 4,294,967,295. Thus, a 16-bit processor can only reference up to 64 KiB of memory at a given time, while a 32-bit processor can reference up to 4 GiB, and a 64-bit processor can reference up to 16 exbibytes of memory directly.
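These limits follow directly from the register width. As a quick sanity check, here is a small sketch in Python (used purely because it runs anywhere; the helper names are made up for illustration) that computes the largest unsigned value and the number of byte addresses for a given width:

```python
# Illustrative helpers; the names are hypothetical, not part of any library.

def max_unsigned(bits):
    """Largest unsigned integer that fits in `bits` bits."""
    return (1 << bits) - 1

def addressable_bytes(bits):
    """Number of distinct byte addresses a `bits`-wide address can express."""
    return 1 << bits

# 20 bits is included because that is the width of the 8086 address bus.
for bits in (16, 20, 32, 64):
    print(f"{bits}-bit: max value {max_unsigned(bits):,}, "
          f"address space {addressable_bytes(bits):,} bytes")
```

Note that the largest value is always one less than the number of addresses, since counting starts at zero.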

At this point, you may be asking yourself, "if a 16-bit processor could only work with 64 KiB of RAM directly, how did the 8086 support up to 1 megabyte?" The answer comes from the segmented memory model. Instead of directly referencing a location in RAM, addresses were divided into two 16-bit parts, the segment and the offset. Segments are 64 kilobyte sections of RAM. They could generally be considered the computing equivalent of a postal code, telling the processor where to look for data. The offset then told the processor where exactly within that segment the data it wanted was located. On the 8086, the segment value was shifted left by four bits (that is, multiplied by 16) and the offset was then added to it, producing a 20-bit address covering 1 megabyte of memory. Segments and offsets are held by the processor in special registers; in short, you had the following:

  • Segments
    • CS: Code segment - Application code
    • DS: Data segment - Application data
    • SS: Stack segment - Stack (or working space) location
    • ES: Extra segment - Programmer defined 'spare' segment
  • Offsets
    • SI - Source Index
    • DI - Destination Index
    • BP - Base pointer
    • SP - Stack pointer

As such, memory addresses on the 8086 were written in the form of segment:offset. For example, the memory address 0xFFFFF could be written as F000:FFFF. As a consequence, multiple segment:offset pairs could refer to the same byte of memory; the addresses F555:AAAF, F000:FFFF, and F800:7FFF all refer to the same byte. The segmentation model also had important performance and operational characteristics to consider.
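The aliasing is easy to verify by doing the arithmetic. The following is a minimal sketch in Python, assuming the standard real-mode rule of segment times 16 plus offset, truncated to the 20 address lines of the 8086; the function name is invented for this example:

```python
def linear_address(segment, offset):
    """Real-mode 8086 translation: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

# The three segment:offset pairs from the text above.
for seg, off in [(0xF000, 0xFFFF), (0xF555, 0xAAAF), (0xF800, 0x7FFF)]:
    print(f"{seg:04X}:{off:04X} -> {linear_address(seg, off):05X}")
```

All three pairs come out as FFFFF, the last byte of the 1 megabyte address space.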

The most important consequence of the segmented model was that, since data could be within the same segment or in a different segment, you had two different types of pointers to work with. Near pointers (which are just the 16-bit offset) deal with data within the same segment, and are very fast as no state information has to be changed to reference them. Far pointers point to data in a different segment and require multiple operations to work with: you not only have to load and store the two 16-bit components, you also have to change the segment registers to the correct values. In practice, that meant far pointers were extremely costly in terms of execution time. The performance hit was bad enough that it eventually led to one of the greatest (or worst) backward compatibility hacks of all time: the A20 gate, something which I could write a whole article on.
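To make the size difference concrete, here is a rough sketch in Python of how the two pointer types are laid out in memory. It assumes the usual x86 convention that a far pointer is stored as the offset word followed by the segment word, both little-endian; the function names are hypothetical:

```python
import struct

def pack_near(offset):
    """A near pointer is just the 16-bit offset: 2 bytes, little-endian."""
    return struct.pack("<H", offset)

def pack_far(segment, offset):
    """A far pointer is 4 bytes: the offset word first, then the segment word."""
    return struct.pack("<HH", offset, segment)

def unpack_far(raw):
    """Split a 4-byte far pointer back into (segment, offset)."""
    offset, segment = struct.unpack("<HH", raw)
    return segment, offset

near = pack_near(0x1234)
far = pack_far(0xB800, 0x0000)   # B800:0000, colour text-mode video memory on the PC
print(len(near), near.hex())     # 2 3412
print(len(far), far.hex())       # 4 000000b8
```

A far pointer is twice the size of a near one, and using it means reloading a segment register before the access, which is where the extra cost comes from.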

The segmented memory model also meant that any high-level programming language had to incorporate lower-level programming details. For example, while C compilers were available for the 8086 (in the form of Microsoft C), the C programming language had to be modified to work with the memory model. Instead of just having the standard C pointer types, you had to deal with near and far pointers, and with the layout of data and code within segments, to make the whole thing work. As a result, coding for pre-80386 processors required code specifically written for the 8086 and the 80286.

Furthermore, most of the functionality provided by the BIOS and DOS was only available in the form of interrupts. Interrupts are special signals that tell the processor something needs immediate attention; for example, typing a key on a keyboard generates an IRQ 1 interrupt to let DOS and applications know something happened. Interrupts can be generated in software (the 'int' instruction) or in hardware. As interrupt handling can generally only be done in raw assembly, many DOS apps of the era were written (in whole or in part) in Intel assembly. This brings us to our next topic: the DOS programming model.
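When an interrupt fires in real mode, the processor looks up the handler in a table of 256 far pointers starting at address 0000:0000, four bytes per entry. On the PC, IRQ 1 from the keyboard arrives as interrupt 9. The sketch below, in Python, shows the lookup; the table contents and the handler address are invented for the demonstration, and the function names are hypothetical:

```python
VECTOR_SIZE = 4  # each entry is a 16-bit offset followed by a 16-bit segment

def vector_address(interrupt_number):
    """Linear address of an interrupt's entry in the real-mode vector table."""
    if not 0 <= interrupt_number <= 0xFF:
        raise ValueError("interrupt numbers are 0x00-0xFF")
    return interrupt_number * VECTOR_SIZE

def read_vector(memory, interrupt_number):
    """Return (segment, offset) of the handler stored in a bytes-like image of low RAM."""
    base = vector_address(interrupt_number)
    offset = int.from_bytes(memory[base:base + 2], "little")
    segment = int.from_bytes(memory[base + 2:base + 4], "little")
    return segment, offset

# Fake 1 KiB table with a made-up handler address for INT 0x21.
ivt = bytearray(256 * VECTOR_SIZE)
ivt[0x84:0x88] = bytes([0x60, 0x10, 0x00, 0x02])   # hypothetical handler at 0200:1060

print(hex(vector_address(0x21)))                   # 0x84
print("%04X:%04X" % read_vector(ivt, 0x21))        # 0200:1060
```

Hooking an interrupt amounts to saving one of these four-byte entries and replacing it with the address of your own handler.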

Disassembling 'Hello World'

Before digging more into the subject, let's look at the traditional 'Hello World' program written for DOS. All code posted here is assembled with NASM.

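; Hello.asm - Hello World for DOS (.COM file)
; Assemble with: nasm -f bin Hello.asm -o HELLO.COM

section .text
org 0x100                 ; COM files are loaded at offset 0x100 of their segment

_entry:
 mov ah, 9                ; INT 21h function 09h: print a '$'-terminated string
 mov dx, str_hello        ; DS:DX -> string (near pointer; DS is already set up)
 int 0x21                 ; call DOS
 ret                      ; near return to PSP:0000, which holds INT 20h (terminate)

section .data
str_hello: db "Hello World",'$'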

Pretty, right? Even those familiar with 32-bit x86 assembly programming may not be able to tell at first glance what this does. To prevent this from getting too long, I'm going to gloss over the specifics of how DOS loads programs, and simply explain what this does. For non-programmers, this may be confusing, but I'll try to explain it below.

The first part of the file has the code segment (marked 'section .text' in NASM) and our program's entry point. With COM files such as this, execution begins at the top of the file. As such, _entry is where we enter the program. We immediately execute two 'mov' instructions to load a value into the top half of AX (AH), and a near pointer to our string into DX. Ignore the 9 for now; we'll get to it in a moment. Afterwards, we trip an interrupt, with the number in hex (0x21) after the instruction being the interrupt we want to trip. DOS's functions are exposed as interrupts on 0x20 to 0x2F; 0x21 is roughly equivalent to stdio in C. 0x21 uses the value in AH to determine which subfunction we want, in this case 9, to write a string to the console. DOS expects DS:DX to point to a string terminated with $; it does not use null-terminated strings as you might expect. After we return from the interrupt, we simply exit the program by executing ret.
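To illustrate the string convention, here is a toy model in Python of what function 9 does with the pointer it is given. This is only a sketch of the behaviour described above, not how DOS is implemented; the load segment and the string offset are assumptions picked for the demo, and the function name is made up:

```python
def dos_print_string(memory, ds, dx):
    """Mimic INT 21h / AH=09h: return the text at DS:DX up to (not including) '$'."""
    start = ((ds << 4) + dx) & 0xFFFFF
    end = memory.index(b"$", start)      # DOS stops at the first '$', not at a NUL byte
    return memory[start:end].decode("ascii")

# Toy 1 MiB memory image standing in for real-mode RAM.
memory = bytearray(1 << 20)
segment = 0x1000                          # arbitrary load segment chosen for the demo
string_offset = 0x0108                    # assumed offset standing in for str_hello
image = b"Hello World$"
base = (segment << 4) + string_offset
memory[base:base + len(image)] = image

print(dos_print_string(memory, segment, string_offset))   # Hello World
```

One practical consequence of the convention is that function 9 cannot print a string that itself contains a dollar sign.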

Under DOS, there is no standard library with nicely named functions to help you out of the box (though many compilers, such as Watcom C, did ship with one). Instead, you have to load values into registers, and call the correct interrupt to make anything happen. Fortunately, lists of known interrupts are available to make the process less painful. Furthermore, DOS only provides filesystem and network operations. For anything else, you need to talk to the BIOS or hardware directly. The best way to think of DOS from a programming perspective is as an extension of the basic input/output functionality that IBM provided in ROM rather than as a full operating system.

We'll dig more into the specifics in future articles, but the takeaway here is that if you want to do anything in DOS, interrupts and reference tables are the only way to do so.

Conclusion

In this introductory article, we looked at the basics of how 16-bit real mode programming works and at the DOS programming model. While something of a dry read, it's a necessary foundation to understand the basic building blocks of what is to come. In the next article, we'll look more at the DOS API and terminate-and-stay-resident programs, as well as hooking interrupts.

 
This discussion has been archived. No new comments can be posted.
  • (Score: 2) by NCommander (2) <mcasadevall@soylentnews.org> on Tuesday August 30 2016, @02:11PM (#395279)

    That's essentially how UNDELETE worked. DOS used a special character in the FAT to mark a file as deleted. Thus if you walk the FAT, and things haven't been overwritten, undeletion is possible. Format basically worked the same way, which is why UNFORMAT was a thing.

    As for using the 8086, well, the 8088 in the XT is not fully forward compatible. DOS-compatible software for the XT will work on AT-compatible systems, but anything that talked to the BIOS (aka almost everything) would generally break on the XT->AT jump. I haven't actually decided if I want to try and make this run on a real AT (via emulation), but it might be a nifty challenge, and then do a follow-up showing it running on bare metal on some i7 running DOS 6.22 or something. I think I have an i7 with an NE2000-compatible NIC which should at least in theory work. Funny enough, the i7 was the first processor that simply said "eh, fuck it", and locks A20 to on, which means DOS 3.3 won't run on it due to lacking the wraparound. Later DOS versions should be fine though.

    --
    Still always moving
  • (Score: 2) by sjames (2882) on Tuesday August 30 2016, @08:42PM (#395427)

    The AT had an 80286. It also had a few odd workarounds to increase compatibility with the 8088 like an AND gate on the A20 address line so a few programs that depended on addresses wrapping at the 1MB mark would work. Extended memory worked by enabling the A20 line and using dirty segment tricks to access beyond 1MB while still in real mode.

    • (Score: 2) by NCommander (2) <mcasadevall@soylentnews.org> on Tuesday August 30 2016, @09:14PM (#395438)

      I must have been dead last night when I wrote the story and comments. For some reason I thought the AT had a max of 1 MiB, but Wikipedia says it had a max of 16 MiB. I think somewhere between my notes and backslash (admin console) AT and XT got crossed. It's hard to tell based on Google what a typical late 80s AT would have looked like, though having 1-2 MiB of RAM probably is in the realm of reasonable.

      Actually, looking at the Wikipedia page, an AT could probably have run Windows 3.1 Standard and DOS 5 if you put enough RAM into it. Maybe I'll show off 80286 protected mode if I can think of a reason to enter it; you can bounce back to real mode via the triple-fault check.

      --
      Still always moving
      • (Score: 2) by sjames (2882) on Tuesday August 30 2016, @09:43PM (#395454)

        IIRC Win 3.1 was a problem on the AT due to the crippled (compared to 386) protected mode. It was also dog slow in protected mode (and so the real mode segment trick to access extended memory).

        In practice, protected mode was avoided as much as possible until the '386 got it right, including the ability to return to real mode without a reset or triple fault.

      • (Score: 2) by dry (223) on Wednesday August 31 2016, @05:51AM (#395589)

        I think that the AT (286) could actually address a GB of virtual memory in protected mode. 32 bit OS/2 limited itself to a GB of address space (512MBs user) so the 16 bit API could address all memory.

        • (Score: 2) by NCommander (2) <mcasadevall@soylentnews.org> on Thursday September 01 2016, @01:57AM (#395985)

          The 80286 is not a 32-bit processor; it could address at most 24 bits of memory, for 16 MiB in total. Intel jumped to 32-bit with the 80386.

          --
          Still always moving
          • (Score: 2) by NCommander (2) <mcasadevall@soylentnews.org> on Thursday September 01 2016, @03:02AM (#396002)

            Sorry, I should have said it is not a 32-bit clean processor, as it didn't have a 32-bit address bus.

            --
            Still always moving
          • (Score: 2) by dry (223) on Thursday September 01 2016, @03:31AM (#396011)

            It could address 24 bits of physical memory. In protected mode, the segment selector was more versatile, allowing 1GB of virtual memory with each task seeing 16MB max. Play with the GDT and it is possible for one process to access the full GB, though that's not very practical on the 286.
            See e.g. http://nptel.ac.in/courses/Webcourse-contents/IIT-KANPUR/microcontrollers/micro/ui/Course_home4_32.htm
            To quote the relevant section under Memory Addressing in 80286

            Protected Virtual Addressing Mode (PVAM) - In this we have 1 GByte of virtual memory and 16 Mbyte of physical memory. The address is 24 bit. To enter PVAM mode, Processor Status Word (PSW) is loaded by the instruction LPSW.

            • (Score: 2) by NCommander (2) <mcasadevall@soylentnews.org> on Thursday September 01 2016, @04:16AM (#396028)

              Looks like you're technically correct. The best kind of correct.

              The (un)fortunate truth is that outside of the Intel programming manuals, which are a last resort due to how blasted dry they are, segmented protected mode is basically undocumented. It's close to unheard of that a period-specific 80286 would have even hit the 16 MiB limit of RAM, and not a single online resource I've seen talking about LGDT actually talks about setting up true segments. They basically set a ring 0/3 segment, and call it good. The LDT gets a footnote at best.

              Since the MMU is enabled in protected mode, and W^X is also a thing, you can't call into real mode code and assume it would work in standard protected mode, since DOS expected that any address could be RWX even if you limited the GDT to have a ring 0 segment where DOS would expect it. I'd love a chance to play with segmented protected mode in this article, but I can't think of a real-world way it could work; on very low memory systems, you don't have anything beyond conventional memory and thus the issue is moot. Newer systems might have RAM above 1 MiB, *but* entering protected mode would break standard DOS applications (and EMM386) unless I did some complicated magic to shunt down to real mode, plus making sure it played nice with anything using a DOS extender.

              --
              Still always moving