2005-03-26 https://www.os2museum.com/wp/dos-memory-management/
The memory management in DOS is simple, but that simplicity may be deceptive. There are several rather interesting pitfalls that programming documentation often does not mention.
DOS 1.x (1981) had no explicit memory management support. It was designed to run primarily on machines with 64K RAM or less, or not too much more (the original PC could not have more than 64K RAM on the system board, although RAM expansion boards did exist). A COM program could easily access (almost) 64K memory when loaded, and many programs didn't rely on even having that much. In fact the early PCs often only had 64K or 48K RAM installed. But the times were rapidly changing.
DOS 2.0 was developed to support the IBM PC/XT (introduced in March 1983), which came with 128K RAM standard, and models with 256K appeared soon enough. Even the older PCs could be upgraded with additional RAM, and DOS needed to have some mechanism to deal with that extra memory.
The DOS memory management was probably written sometime around summer 1982, and it meshed with the newly added process management functions (EXEC/EXIT/WAIT)—allocated memory is owned by the current process, and gets freed when that process terminates. Note that some versions of the memory manager source code (ALLOC.ASM) include a comment that says 'Created: ARR 30 March 1983'. That cannot possibly be true because by the end of March 1983, PC DOS 2.0 was already released, and included the memory management support. The DOS 2.0 memory management functions were already documented in the PC DOS 2.0 manual dated January 1983.
(Score: 2, Funny) by Anonymous Coward on Friday March 06, @07:22AM (5 children)
I forgot that had a meaning before web sites.
(Score: 5, Insightful) by VLM on Friday March 06, @01:32PM (3 children)
A little apples and oranges, the article is all about the back end and .com .exe are more the front end.
IIRC .com files follow the long programming tradition of slap a raw dump of memory in a file and you run it by loading it and jumping to a constant address in 64K mode, like it jumped to 0x0100 or 0x1000 or something like that.
.exe files were a whole little programming language where you tell the loader how to load up the stacks and segment registers and everything before jumping in.
You most certainly could run programs larger than 64K in .com files; it's just that you pretty much had to write a .exe loader of your own and put it in the .com file.
A similar analogy from the old days of linux was the conversion from old a.out files to modern elf files. IIRC elf format just removed all the restrictions and limits from a.out. Still has sections just elf doesn't limit how many and their names, whereas a.out came with a preload. Also you couldn't dynamic link a.out files they had static shared library addresses. Recompile it all if you want to upgrade a library unlike ELF. Of course this makes ELF slower and more complicated and much less secure.
With ELF you can do "funny" stuff like hijack the LD_PRELOAD to dynamically link with hostile code, ha ha very funny. Or you can write ELF headers designed to confuse static analysis software (like multiple overlapping sections that add up to a backdoor but individually are not hostile)
going back to a.out format for containerized software is not a bad idea... would remove entire categories of security holes and make things faster. In my infinite spare time....
(Score: 5, Interesting) by owl on Friday March 06, @06:54PM (2 children)
You didn't need an .exe loader either, provided you were ok with the ".com" method of "slap a raw dump of memory in a file and you run it by loading it and jumping to a constant address in 64K mode". You just needed those "dumps of memory" stored somewhere (one file plus an index, or many files) and just load the memory dumps at suitable unused locations and jump to the "starting address" for each. This was on an 8086/8088 CPU after all: once your code was executing, it could do anything it wanted, read any memory it chose to, write to anywhere in memory (although doing this to some addresses would create an instant crash situation). So your .com file could certainly load additional memory dumps into other RAM locations and then they could all jump among themselves in any way they liked.
The only 64k restriction on .com files was that DOS would only load 64k of memory on your behalf when it launched your .com, and it would only ever jump to the fixed ".com" starting address point.
(Score: 2) by VLM on Saturday March 07, @12:12AM (1 child)
Yeah I think we're agreeing with each other although I could have written a little more clearly that you didn't have to write "the" .exe you just needed something doing the equivalent of the .exe loader tasks to fit in 64K which isn't that hard.
(Score: 2) by owl on Saturday March 07, @02:10AM
Only the original .com file on disk had to fit into 64k. And only then because DOS's com loader only loaded it into a single 64k 8086 segment.
Once the code was executing, it could use all the memory installed in the machine. There were no limitations on the code that was being run once DOS jumped to the entry point of the .com file.
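To make the .com loading model concrete, here is a minimal sketch in Python (a simulation, not DOS code). It assumes the usual layout where the first 0x100 bytes of the segment hold the PSP and the file image is copied in right after it; the name `load_com` and the size check are my own simplification, and real DOS also needs a little room for the stack, which this ignores.

```python
PSP_SIZE = 0x100         # first 256 bytes of the segment are reserved for the PSP
SEGMENT_SIZE = 0x10000   # one 8086 segment is 64K


def load_com(image: bytes):
    """Copy a raw .com image into a simulated 64K segment.

    Returns (segment, entry_offset). Simplified: ignores the stack space
    that real DOS also requires inside the same segment.
    """
    if len(image) > SEGMENT_SIZE - PSP_SIZE:
        raise ValueError("image does not fit in a single 64K segment")
    segment = bytearray(SEGMENT_SIZE)
    # The file is a plain memory dump: no header, no relocation, just bytes.
    segment[PSP_SIZE:PSP_SIZE + len(image)] = image
    # Execution always starts at the same fixed offset.
    return segment, PSP_SIZE


segment, entry = load_com(b"\xC3")   # a one-byte program: RET
print(hex(entry), hex(segment[entry]))
```

Everything after that jump is up to the program, which is the point made above: the 64K limit applies to what the loader does, not to what the running code may touch.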
(Score: 2) by Tork on Friday March 06, @08:36PM
(Score: 2) by ledow on Friday March 06, @09:26AM
Didn't know that the MZ signature marked RAM blocks too, I thought that was just the prefix to .EXE files.
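For anyone curious what that looks like, here is a rough Python sketch of walking a chain of memory control blocks in a toy memory image. It assumes the commonly documented layout: a 16-byte header in front of each block, starting with 'M' (or 'Z' for the last block), followed by a 16-bit owner and a 16-bit size in paragraphs. The helper names `make_mcb` and `walk_mcb_chain` are hypothetical, and in the toy image the chain starts at segment 0, which is not where real DOS puts it.

```python
import struct

PARA = 16  # one paragraph = 16 bytes; an MCB header occupies exactly one


def make_mcb(kind: bytes, owner: int, size_paras: int) -> bytes:
    """Build a 16-byte header: type byte, owner word, size word, padding."""
    return struct.pack("<cHH11x", kind, owner, size_paras)


def walk_mcb_chain(memory: bytes, first_seg: int):
    """Follow the chain from first_seg and return (segment, owner, size) tuples."""
    seg = first_seg
    blocks = []
    while True:
        kind, owner, size = struct.unpack_from("<cHH", memory, seg * PARA)
        if kind not in (b"M", b"Z"):
            raise ValueError(f"corrupt chain at segment {seg:#06x}")
        blocks.append((seg, owner, size))
        if kind == b"Z":          # 'Z' marks the last block in the chain
            return blocks
        seg += size + 1           # skip the block plus its one-paragraph header


# Toy image: three blocks of 2, 3 and 4 paragraphs (owner 0 means free).
mem = bytearray()
mem += make_mcb(b"M", 0x0008, 2) + bytes(2 * PARA)
mem += make_mcb(b"M", 0x1234, 3) + bytes(3 * PARA)
mem += make_mcb(b"Z", 0x0000, 4) + bytes(4 * PARA)

for seg, owner, size in walk_mcb_chain(bytes(mem), 0):
    print(f"seg={seg:#06x} owner={owner:#06x} paragraphs={size}")
```

The walker treats any other type byte as a corrupt chain, since it has no length field it can trust at that point.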
(Score: 5, Interesting) by VLM on Friday March 06, @01:21PM (3 children)
I'd disagree having been there, the dos world was total WTF and unix virtual memory model abstracting all that nonsense away was a VERY nice upgrade.
Let's just say that Turbo Basic / Turbo Pascal / Turbo Intercal or something had an entire chapter on the pitfalls of all the possible weird configurations of possible compilation models. Why not run the most versatile model all the time? Size. Takes a lot of code to manipulate four segment registers in concert and if all you need is 64K to run "hello_world.c" that's way smaller, faster and easier to use.
For folks who know smaller data bus architectures but never had the "pleasure" of pre-32 bit or pre-64 bit Intel architectures, it was roughly like all memory access could be passed thru up to four (possibly identical, possibly different) segment offset registers which did a shift and add every time the address bus was hit, but differently for different instructions. So your classic PC, even a "640K is enough for anyone" machine, was a 16 bit system with at least four possible bank switching-like windows.
It was ... pretty weird. The idea of moving to a bigger system where your index registers were 32 bits and there's no bank switching nonsense was very luxurious. Wait wait you mean on a modern non-PC system if you want to read address 0x12345678 you just... frigging do it? No intermediate steps of adjusting segment registers? Whoa.
As you'd expect there were like 42 different variations on workarounds. 386 added two additional segment registers seemingly for the sheer hell of it, for example. (start debugger on a new 386 back in the day, display register values, wat the F is a FS register? never seen that before...)
As you'd imagine there were multiple incompatible stacks on TOP of the existing cpu segmentation described above, such as XMS vs EMS memory. Because, you know, why not. Hilariously you COULD have EMS memory on 8-bit machines in theory and I think I had a board that did that (second hand not purchased for $$$$ by me LOL) and XMS couldn't work on anything older than a 286.
PCs were, in some ways, more complicated back in the day than they are now.
So using totally made up numbers to make it simple, let's say you have a C pointer, pointing to somevar at memory model address 123456. That's actually a pointer to 3456 in 16 bit code, as long as the segment register is loaded with 1200. Now if you change the segment register to 1210 to do something else, then your pointer is still at "model" address 123456 but the 16 bit address is 2456 because 1210 shift shift shift plus 2456 adds up to 123456 same model. Good luck, IRL back in the day this pretty much sucked.
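With real numbers instead of made-up ones, the 8086 arithmetic can be sketched like this in Python: the segment is shifted left by 4 bits (one hex digit) and the 16-bit offset is added, with the result wrapping at 1 MB on an 8086. The function name `linear_address` is just for illustration.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode 8086 address: segment * 16 + offset, wrapped to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs that name the same byte of memory.
a = linear_address(0x1234, 0x0056)
b = linear_address(0x1239, 0x0006)
print(hex(a), hex(b), a == b)   # 0x12396 0x12396 True
```

So reloading a segment register changes which 16-bit offset reaches a given byte, which is why a pointer's offset was only meaningful together with the segment it was computed against.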
I was playing with this stuff circa 1990. IBM PC hardware sucked compared to real computers which led to lots of memes about "here's a quarter kid buy yourself a better computer". But the PC platform had a LOT of money polishing that turd, which I guess it still does. So it sucked but it was faster than everything that didn't suck. And arguably everything else was better. So you can see the desire to run a "real computer operating system like unix" around 1990 and abstract all the legacy msdos intel stuff away.
(Score: 2) by owl on Friday March 06, @07:16PM (2 children)
To be fair, at the time the i386 would have been a "design project" at Intel (sometime 1980 to 1985) there was still some debate going on in academia as to whether segmentation or paging was better for memory management purposes. And the designers, who could possibly have started designing the i386 before the IBM PC debut, likely had no idea that their new chip, with all its fancy protected mode architecture, would end up being used as a "fast 8086" for years after introduction. They might have thought that having two extra segment registers would prove better for the OS when using segmentation for memory management. Processes could have access to a larger number of simultaneous segments they could use, without needing the OS to 'juggle' segment registers for them (remember, in protected mode, ring 3 processes are not allowed to modify segment registers). So they likely had what to them seemed like a good reason to add two more segment registers.
What they would not have been able to accurately see from their "future looking crystal ball" is that by the time their new chip would start having OS'es actually make use of its protected mode features (circa 1992-1995 time frame, ten years after these designers would have been designing it) the world would have settled on "paging" being the one true memory management system and every OS using their '386 protected mode features would go on to effectively disable all the 'segmentation' features and rely on the paging unit exclusively.
(Score: 1, Interesting) by Anonymous Coward on Friday March 06, @08:45PM (1 child)
This is false. In protected mode, every segment register can be loaded during any privilege level. There are even dedicated instructions to load a segment register alongside an accompanying general register in a single instruction (LSS/LDS/LES/LFS/LGS).
(Score: 3, Insightful) by owl on Saturday March 07, @02:19AM
Ah, you are correct, at least partially. Code can't freely load segment registers like it can on an 8086, but provided the privilege levels mesh up properly, at least the data access segment registers can be reloaded in protected mode:
https://prodebug.sourceforge.net/pmtut.html [sourceforge.net]
"Programs can't load a sgement register with just any selector. When a data segment register (DS, ES, FS or GS) is loaded, the 386 checks the DPL against the program's CPL and the selector's RPL. The 386 first compares the CPL to the RPL. The largest one becomes the effective privilege level (EPL). If the DPL is greater than ot equal to the EPL, the 386 loads the sgment register; otherwise an error occurs."
"The SS register must be loaded with a segment whose DPL and CPL are equal. The 386 also checks to make sure a stack segment is readable, writeable and present."
(Score: 2) by mhajicek on Friday March 06, @02:45PM (4 children)
I recall my dad's original IBM PC having 640k RAM.
(Score: 2) by sneftel on Friday March 06, @03:08PM
The original IBM PC only shipped with up to 64k of RAM. Further expansion was possible via add-on cards.
(Score: 2) by bart9h on Friday March 06, @04:07PM (2 children)
Your dad's PC was not the first.
Mine had 256k, and was not the first either.
(Score: 2) by mhajicek on Saturday March 07, @01:49AM (1 child)
It was a 5150, purchased in '81.
(Score: 3, Insightful) by owl on Saturday March 07, @02:21AM
Which, when purchased new from IBM, had either 16K or 64K:
https://en.wikipedia.org/wiki/IBM_Personal_Computer#Hardware [wikipedia.org]
RAM 16 KB or 64 KB minimum (expandable to 640 KB)
So if it eventually had 640k, that would be because your Dad had added one or more expansion cards to expand it out to 640k.
(Score: 2) by turgid on Saturday March 07, @09:47AM (1 child)
Back in the day, Microsoft also had Xenix, "real Unix" for the 286. At one point, they announced that MS-DOS and Xenix would converge. Then OS/2 became the future. Then it was Windows.
(Score: 3, Interesting) by owl on Saturday March 07, @05:14PM
And they did, if ever so slightly. DOS 2.0 added a lot of features that were clearly derived from Unix systems, even if they had to be done via hacks in the DOS world. But that was also where the convergence ended, as DOS really did not continue to add many more Unixisms for the remainder of its lifetime.
(Score: 2) by jman on Saturday March 07, @01:14PM
This is all a little later than 2.0, but: QEMM; Stacker; Squeezing every bit out of that magic area above 640K.
Toward the end of DOS's days, had a 9-track tape drive hooked up to a 386-SX with a whopping 4 megs of ram. Used M$'s "Professional Development System" to compile basic code for sucking data off the tapes.
While the majority of what I was reading was 390 byte records from census data, every now and again we'd get a "stranger" tape & I'd have to figure out its format. Wrote a little app so I could "guess" record sizes and have a few of them display on the amber screen. When stuff lined up, I knew I was in.
Reading a tape (1600bpi, not 6250, we didn't spring for the high-end drive) took about 20 minutes. With IO at 390 bytes per read being the bottleneck, I had the clever idea of using that 4 megs, reading as much as the drive would give me and just parsing everything in memory.
Great, except it wouldn't actually read the amount it was supposed to be able to.
Finally got on the phone with someone at M$ (you could still do that back then), and figured out even though the machine I was on was 32 bit, the driver was 16. So, when I asked for X amount of data, it just couldn't supply it.
Good times!
Oh, 39K? That's what my C64 had left over after it booted. Amazing what you could cram into that tiny thing, and it did suck being stuck with a max of two character variable names. What was it again, sixty lines ago, ax%, az%? Ugh.