I've made no secret that I'd like to bring original content to SoylentNews, and I recently polled the community on their feelings about crowdfunding articles. The overall response was somewhat lukewarm, mostly over how the money would be divided and how authors would be paid. Taking that into account, I decided to write a series of articles for SN in an attempt to drive more subscriptions and readers to the site, and to scratch a personal itch for a retro-computing project. The question then became: what to write?
As part of a conversation on IRC, I wondered what a modern-day keylogger would have looked like running on DOS. In the world of 2016, it's no secret that various three-letter agencies engage in mass surveillance and cyberwarfare, and a keylogger would be part of any basic set of attack tools. The question is what such a tool would have looked like if it had been written during the 1980s. Back then, the world was a very different place from both a networking and a programming perspective.
For example, in 1988 (the year I was born), the IBM PC/XT and AT would have been relatively common fixtures, and the PS/2 had only recently been released. Most of the personal computing market ran some version of DOS, and networking (which was rare) frequently took the form of Token Ring or ARCNet equipment. Further up the stack, TCP/IP competed with IPX, NetBIOS, and several other protocols for dominance. From the programming side, coding for DOS is very different from coding for any modern platform, as you had to deal with Intel's segmented architecture and interact directly with both the BIOS and the hardware. As such, it's an interesting look at how technology has evolved since.
Now obviously, I don't want to release a ready-made attack tool to be abused by the masses, especially since DOS is still frequently used in embedded and industrial roles. As such, I'm going to target a non-IP-based protocol for logging, both to explore these technologies and to make the tool as useless as possible. To the extent possible, I will try to keep everything accessible to non-programmers, but this isn't intended as a tutorial for real mode programming. As such, I'm not going to go super in-depth in places, but will try to link relevant information. If anyone is confused, post a comment, and I'll answer questions or edit these articles as they go live.
More past the break ...
Back in 1984, IBM released the Personal Computer/AT which can be seen as the common ancestor of all modern PCs. Clone manufacturers copied the basic hardware and software interfaces which made the AT, and created the concept of PC-compatible software. Due to the sheer proliferation of both the AT and its clones, these interfaces became a de-facto standard which continues to this very day. As such, well-written software for the AT can generally be run on modern PCs with a minimum of hassle, and it is completely possible to run ancient versions of DOS and OS/2 on modern hardware due to backwards compatibility.
A typical business PC of the era likely looked something like this:
To put that in perspective, many of today's microcontrollers have on-par or better specifications than the original PC/AT. From a programming perspective, even taking into account resource limitations, coding for the PC/AT is drastically different from many modern systems due to the segmented memory model used by the 8086 and 80286. Before we dive into the nitty-gritty of a basic 'Hello World' program, we need to take a closer look at the programming model and memory architecture used by the 8086 which was a 16-bit processor.
If the AT is the common ancestor of all PC-compatibles, then the Intel 8086 is the processor equivalent. The 8086 was a 16-bit processor that operated at a top clock speed of 10 MHz, had a 20-bit address bus that supported up to 1 megabyte of RAM, and provided fourteen registers. Registers are essentially very fast storage locations physically located within the processor that are used to perform various operations. Four registers (AX, BX, CX, and DX) are general purpose, meaning they can be used for any operation. Eight (described below) are dedicated to working with segments, and the final two are the processor's current instruction pointer (IP) and state (FLAGS).
An important point in understanding the differences between modern programming environments and those used by early PCs is the difference between 16-bit and 32/64-bit programming. At the most fundamental level, the number of bits a processor has refers to the size of the numbers (or integers) it works with internally. As such, the largest possible unsigned number a 16-bit processor can directly work with is 2 to the power of 16 (minus 1), or 65,535. As the name suggests, 32-bit processors work with larger numbers, with the maximum being 2 to the power of 32 (minus 1), or 4,294,967,295. Thus, a 16-bit processor can only reference up to 64 KiB of memory at a given time, while a 32-bit processor can reference up to 4 GiB, and a 64-bit processor can reference up to 16 exbibytes of memory directly.
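To make the arithmetic concrete, here is a small sketch in Python (used purely for illustration; it is not something you would have run on an AT). The function names are my own, and it simply computes the largest unsigned value and the number of directly addressable bytes for a given width.

```python
# Illustrative only: how register/address width limits values and memory.
# The function names are hypothetical helpers, not part of any DOS or BIOS API.

def max_unsigned(bits: int) -> int:
    """Largest unsigned integer that fits in the given number of bits."""
    return (1 << bits) - 1


def addressable_bytes(bits: int) -> int:
    """Number of distinct byte addresses an address of this width can name."""
    return 1 << bits


if __name__ == "__main__":
    # 20 bits is included because that is the width of the 8086 address bus.
    for bits in (16, 20, 32, 64):
        print(f"{bits}-bit: max value {max_unsigned(bits):,}, "
              f"{addressable_bytes(bits):,} addressable bytes")
```

The 20-bit row is the interesting one: a 16-bit register cannot hold such an address on its own, which is the problem segmentation set out to solve.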
At this point, you may be asking yourselves, "if a 16-bit processor could only work with 64 KiB of RAM directly, how did the 8086 support up to 1 megabyte?" The answer comes from the segmented memory model. Instead of directly referencing a location in RAM, addresses were divided into two 16-bit parts, the selector (or segment) and the offset. Segments are 64 kilobyte sections of RAM. They could generally be considered the computing equivalent of a postal code, telling the processor where to look for data. The offset then told the processor where exactly within that segment the data it wanted was located. On the 8086, the selector represented the top 16 bits of an address: it was shifted left by four bits (that is, multiplied by 16), and the offset was then added to it to create a 20-bit address, giving 1 megabyte of addressable memory. Segments and offsets are held by the processor in special registers; in short you had the following:
As such, memory addresses on the 8086 were written in the form of segment:offset. For example, the memory address 0xFFFFF could be written as F000:FFFF. As a consequence, multiple segment:offset pairs could refer to the same byte of memory; the addresses F555:AAAF, F000:FFFF, and F800:7FFF all refer to the same location. The segmentation model also had important performance and operational characteristics to consider.
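The translation from segment:offset to a physical address can be sketched in a few lines of Python. This is an illustration of the arithmetic only; `linear_address` and `parse` are names I made up, and the 20-bit mask models the fact that the 8086 only had 20 address lines.

```python
# Illustrative sketch of 8086 real-mode address translation.
# linear_address() and parse() are hypothetical helper names.

def linear_address(segment: int, offset: int) -> int:
    """Shift the 16-bit segment left by 4 bits, add the 16-bit offset,
    and keep only the low 20 bits (the 8086 had 20 address lines)."""
    return ((segment << 4) + offset) & 0xFFFFF


def parse(pair: str) -> int:
    """Convert a 'SSSS:OOOO' hex string into a linear address."""
    seg, off = pair.split(":")
    return linear_address(int(seg, 16), int(off, 16))


if __name__ == "__main__":
    # The three aliases from the text all land on the same byte.
    for pair in ("F000:FFFF", "F555:AAAF", "F800:7FFF"):
        print(pair, "->", hex(parse(pair)))
```

Because the segment is only shifted by four bits, consecutive segment values start just 16 bytes apart, which is why so many different pairs can alias one location.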
The most important was that since data could be within the same segment or in a different segment, you had two different types of pointers to work with. Near pointers (which are just the 16-bit offset) deal with data within the same segment, and are very fast as no state information has to be changed to reference them. Far pointers pointed to data in a different segment and required multiple operations to work with: you not only had to load and store the two 16-bit components, you also had to change the segment registers to the correct values. In practice, that meant far pointers were extremely costly in terms of execution time. The performance hit was bad enough that it eventually led to one of the greatest (or worst) backward compatibility hacks of all time: the A20 gate, something which I could write a whole article on.
The segmented memory model also meant that any high-level programming language had to incorporate lower-level programming details. For example, while C compilers were available for the 8086 (in the form of Microsoft C), the C programming language had to be modified to work with the memory model. Instead of just having the standard C pointer types, you had to deal with near and far pointers, and with the layout of data and code within segments, to make the whole thing work. As a result, coding for pre-80386 processors required code specifically written for the 8086 and the 80286.
Furthermore, most of the functionality provided by the BIOS and DOS was only available in the form of interrupts. Interrupts are special signals used to tell the processor that something needs immediate attention; for example, typing a key on a keyboard generates an IRQ 1 interrupt to let DOS and applications know something happened. Interrupts can be generated in software (the 'int' instruction) or hardware. As interrupt handling can generally only be done in raw assembly, many DOS apps of the era were written (in whole or in part) in Intel assembly. This brings us to our next topic: the DOS programming model.
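To illustrate the size difference, here is a rough Python sketch of how the two pointer types sit in memory. The helper names are mine; the layout (16-bit offset first, then 16-bit segment, both little-endian) is how the 8086 stores a far pointer.

```python
# Illustrative only: byte layout of near and far pointers on the 8086.
# pack_near/pack_far/unpack_far are hypothetical helper names.
import struct
from typing import Tuple


def pack_near(offset: int) -> bytes:
    """A near pointer is just a 16-bit offset (2 bytes, little-endian)."""
    return struct.pack("<H", offset)


def pack_far(segment: int, offset: int) -> bytes:
    """A far pointer is 4 bytes: the offset word first, then the segment word."""
    return struct.pack("<HH", offset, segment)


def unpack_far(raw: bytes) -> Tuple[int, int]:
    """Return (segment, offset) from a 4-byte far pointer."""
    offset, segment = struct.unpack("<HH", raw)
    return segment, offset


if __name__ == "__main__":
    print(pack_near(0xFFFF).hex())         # 2 bytes
    print(pack_far(0xF000, 0xFFFF).hex())  # 4 bytes
```

Twice the storage is only part of the cost; the extra segment register load on every dereference is where most of the time went.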
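For the curious, here is a small Python sketch of where the processor looks when an interrupt fires. It relies on two facts not covered above: in real mode the interrupt vector table starts at address 0 with one 4-byte far pointer per interrupt number, and the PC's default setup delivers IRQ 0 through 7 as interrupts 0x08 through 0x0F. The function names are mine.

```python
# Illustrative only: locating real-mode interrupt vectors.
# ivt_entry_address() and irq_to_vector() are hypothetical helper names.

def ivt_entry_address(vector: int) -> int:
    """Linear address of the 4-byte far pointer for an interrupt number.
    The real-mode interrupt vector table starts at address 0."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt numbers are 0x00 to 0xFF")
    return vector * 4


def irq_to_vector(irq: int) -> int:
    """Map a first-controller hardware IRQ (0-7) to its interrupt number,
    assuming the PC's default mapping to interrupts 0x08-0x0F."""
    if not 0 <= irq <= 7:
        raise ValueError("this sketch only covers IRQ 0-7")
    return 0x08 + irq


if __name__ == "__main__":
    keyboard = irq_to_vector(1)
    print(f"keyboard: int {keyboard:#04x}, vector at {ivt_entry_address(keyboard):#06x}")
    print(f"DOS API:  int 0x21, vector at {ivt_entry_address(0x21):#06x}")
```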
Before digging more into the subject, let's look at the traditional 'Hello World' program written for DOS. All code posted here is assembled with NASM.
    ; Hello.asm - Hello World
    section .text
    org 0x100

    _entry:
        mov ah, 9
        mov dx, str_hello
        int 0x21
        ret

    section .data
    str_hello:
        db "Hello World",'$'
Pretty, right? Even those familiar with 32-bit x86 assembly programming may not understand at first glance what this does. To prevent this from getting too long, I'm going to gloss over the specifics of how DOS loads programs, and simply explain what this does. For non-programmers, this may be confusing, but I'll try to explain it below.
The first part of the file has the code segment (marked 'section .text' in NASM) and our program's entry point. With COM files such as this, execution begins at the top of the file. As such, _entry is where we enter the program. We immediately execute two 'mov' instructions to load a value into the top half of AX (AH), and a near pointer to our string into DX. Ignore the 9 for now; we'll get to it in a moment. Afterwards, we trip an interrupt, with the number in hex (0x21) after the instruction being the interrupt we want to trip. DOS's functions are exposed as interrupts on 0x20 to 0x2F; 0x21 is roughly equivalent to stdio in C. Interrupt 0x21 uses the value in AH to determine which subfunction we want; in this case, 9 writes a string to the console. DOS expects DX to point to a string terminated with $ (within the data segment, which in a COM file is the same segment as our code); it does not use null-terminated strings like you may expect. After we return from the interrupt, we simply exit the program by executing ret.
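If registers and $-terminated strings are unfamiliar, the following Python sketch models the two ideas in isolation. It is not how DOS is implemented; `set_high_byte` and `read_dollar_string` are names I invented, and the `memory` buffer is a stand-in for the program's segment.

```python
# Illustrative model only, not DOS itself.
# set_high_byte() and read_dollar_string() are hypothetical helper names.

def set_high_byte(ax: int, value: int) -> int:
    """Model 'mov ah, value': replace the upper 8 bits of the 16-bit AX
    register while leaving the lower 8 bits (AL) untouched."""
    return ((value & 0xFF) << 8) | (ax & 0x00FF)


def read_dollar_string(memory: bytes, start: int) -> str:
    """Model what subfunction 9 reads: bytes from 'start' up to, but not
    including, the first '$'. Raises ValueError if no '$' is found."""
    end = memory.index(b"$", start)
    return memory[start:end].decode("ascii")


if __name__ == "__main__":
    # Pretend our string lives 0x10 bytes into the segment.
    memory = b"\x00" * 0x10 + b"Hello World$"
    ax = set_high_byte(0x1234, 9)
    print(hex(ax))
    print(read_dollar_string(memory, 0x10))
```

One practical consequence of the $ terminator is that subfunction 9 cannot print a literal dollar sign; other subfunctions are needed for that.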
Under DOS, there is no standard library with nicely named functions to help you out of the box (though many compilers, such as Watcom C, did ship with one). Instead, you have to load values into registers, and call the correct interrupt to make anything happen. Fortunately, lists of known interrupts are available to make the process less painful. Furthermore, DOS only provides filesystem and network operations. For anything else, you need to talk to the BIOS or hardware directly. The best way to think of DOS from a programming perspective is as an extension of the basic input/output functionality that IBM provided in ROM, rather than as a full operating system.
We'll dig more into the specifics in future articles, but the takeaway here is that if you want to do anything in DOS, interrupts and reference tables are the only way to do so.
In this introductory article, we looked at the basics of how 16-bit real mode programming works and at the DOS programming model. While something of a dry read, it's a necessary foundation for understanding the basic building blocks of what is to come. In the next article, we'll look more at the DOS API and terminate-and-stay-resident programs, as well as hooking interrupts.
(Score: 4, Informative) by NCommander on Tuesday August 30 2016, @01:01PM
I didn't bring it up in this post, but video operations on DOS are an interesting beast since you have direct access to video memory. Depending on your hardware, you had the monochrome, CGA, EGA, and then VGA address space above conventional memory.
Assuming you wanted more than what ANSI.SYS could provide, you would have to directly poke that memory to make the magic happen. Sound and other similar stuff would require accessing the proper TSRs and making that magic happen. I don't miss DOS per se, but I do miss the flexibility it provided in a lot of ways. My biggest regret is that the 80286 protected mode flopped in the market though; the segmented memory model actually provides natural protection similar to the NX-bit (but better). A stack smash could only destroy the stack segment, and not the program as a whole, which drastically kills an entire range of attacks.
While C would still need a flat memory model ((E)CS=(E)DS at a minimum) to act like it should, it would have allowed other programming languages to afford a hell of a lot more security and prevent all sorts of various stupidity.
Still always moving
(Score: 2) by FatPhil on Tuesday August 30 2016, @01:52PM
Are you sure? I think all it really needs is sizeof(void*) = sizeof(void(*)()), so that a void* cast can be reversibly performed. There's no requirement to ever be able to call data or dereference code, so who cares if the segments are different?
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
(Score: 0) by Anonymous Coward on Tuesday August 30 2016, @02:15PM
You're probably right, at least most of the time, provided that the compiler handled the segments properly. A more significant issue was passing a stack address as an argument to a function, and then dereferencing it, which would fail miserably if SS != DS. And they usually weren't equal in 16-bit mode DOS programming.
(Score: 2) by NCommander on Tuesday August 30 2016, @02:47PM
Segmented model breaks pointer arithmetic, and makes a lot of things much harder in C. The compiler has to do some epic magic to make it work.
The canonical example is dealing with pointer comparison. Because multiple segment:offset pairs can point to the same logical place, you can't compare two pointers and know they're in the same place without fully evaluating them out to flat memory notation. For example, a pointer to F000:FFFF and F555:AAAF point to the same place, but if you compare them to each other, you would get not equal. The solution is using the third type of pointer, known as huge pointers which normalize pointers to the highest possible segment (which breaks segment aliasing). If a pointer is modified in any way, the pointer has to be recalculated to the huge model to make those comparisons work.
This also causes a lot of pain when dealing with nested arrays if they're in two different segments, because you have to have huge pointers to know if they point to the same location. Once again, you have to fix this at runtime because you don't know for sure where your data structures will land in memory. (In DOS, this wasn't a big deal since the compiler could assume you had all of conventional memory to play with, since conventional memory is always a 1:1 mapping.) 80286 protected mode threw that out the window since you now had an MMU and hardware-based task switching.
(Score: 2) by maxwell demon on Tuesday August 30 2016, @05:51PM
No, it doesn't, it just makes it more complicated to implement. And actually the big problem of the 80286 wasn't really the segmentation, but the fact that segments were only 64 KByte at a time when larger data structures were already reasonable. Also, in protected mode, it was perfectly possible (and reasonable) to make different segments not overlap (thus making a simple segment/offset comparison sufficient).
Note that the only operation C guarantees to work for pointers to unrelated objects is equality comparison. If you limit the maximal object/array size to the maximal segment size, for all other cases you just need to do arithmetic on the offset part. As I already wrote, the problem was that this was 64KB, which was no longer a reasonable limit at that time.
Note that in real mode, equality comparison could be done by just calculating the linear address on the fly. There's no need to normalize all pointers. Nowhere does C require that equal pointers have equal bit patterns.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by maxwell demon on Tuesday August 30 2016, @05:27PM
Actually the C standard doesn't guarantee casting between data pointers and function pointers. And DOS and the various memory models (SMALL, LARGE, HUGE) are probably the reason.
(Score: 2) by FatPhil on Tuesday August 30 2016, @10:50PM
(Score: 2) by maxwell demon on Wednesday August 31 2016, @10:33AM
But some of the DOS memory models had different sizes for code and data pointers. So in casting between the two you might lose the segment part.
(Score: 2) by FatPhil on Wednesday August 31 2016, @11:02AM
(Score: 2) by maxwell demon on Wednesday August 31 2016, @10:40PM
Well, maybe you were only writing SMALL programs. Others were writing MEDIUM or COMPACT programs.
And frankly, apart from some dynamic linking interfaces (which of course don't exist on DOS), I never saw any need to cast between a function pointer and a data pointer.
(Score: 2) by FatPhil on Thursday September 01 2016, @07:45AM
(Score: 2) by Fnord666 on Tuesday August 30 2016, @11:58PM
Assuming you wanted more than what ANSI.SYS could provide, you would have to directly poke that memory to make the magic happen. Sound and other similar stuff would require accessing the proper TSRs and making that magic happen. I don't miss DOS per se, but I do miss the flexibility it provided in a lot of ways.
Ah the good old days of double buffering and page flipping to reduce flicker. Wait, are there people on here who know what a command line compiler looks like?
(Score: 0) by Anonymous Coward on Wednesday August 31 2016, @12:07PM
i cut my teeth on turbo pascal in high school
i'm only a handful of years older than NC
(Score: 2) by NCommander on Thursday September 01 2016, @05:14AM
Command line compiler as in MASM, NASM, or early watcom?
I still sometimes call cl from the command line when I'm testing it, and I suspect most linux developers have invoked GCC by hand too :)