from the adventures-in-data-recovery-and-30-year-old-bugs dept.
Two of my favorite hobbies are retrocomputing projects and the slightly more arcane field of software archeology: the process of restoring and understanding lost or damaged pieces of history. Quite a while ago, I came across the excellent OS/2 Museum, run by Michal Necasek, which helps categorize many of the more obscure bits of PC history, including a series of articles about Xenix, Microsoft’s version of SVR2 UNIX.
What caught my attention were two articles talking about Xenix 386 2.2.3c, a virtually undocumented release that isn’t mentioned in much if any of the Santa Cruz Operation's (SCO, but see footnote) surviving literature of the time. Michal documented [1], [2] his efforts at the time, and ultimately concluded that the media was badly corrupted. Not knowing when to give up, I decided to give it a try and see if anything could be salvaged. As of this writing, and working with Michal, we’ve managed to achieve approximately 98% restoration of the product as it would have existed at the time.
I’m going to write up the rather long and interesting quest of rebuilding this piece of history. I apologize in advance about the images in this article, but we only recently got serial functionality working again, and even then, early boot and installation has to be done over the console.
* - SCO in this case refers to the original Santa Cruz Operation, and not the later SCO Group who bought the name and started the SCO/Linux lawsuits.
Historical Background
From a historical perspective, Xenix is interesting as it was one of the first (if not the first) operating systems to take advantage of Protected Mode on the iAPX 80286 without being hamstrung by lack of backwards compatibility. I’ve talked about the 286 before on SoylentNews, but to summarize, the 80286 was the first processor with Protected Mode. However, it didn’t support paging, and the switch from real mode (8086 compatibility) to protected mode was one way; there was no official way to return to real mode without restarting the processor, and neither DOS nor the BIOS could operate in Protected Mode. To my knowledge, Xenix was the only operating system to adopt the view that a system would enter protected mode and never return to 16-bit compatibility. As such, its implementation of protected mode is somewhat different from what most people are familiar with.
Instead, the 80286 was intended to allow running legacy DOS applications in real mode, while people would upgrade to new protected mode operating systems and software. The much loathed real mode segmentation system was revamped as well due to the new 32-bit register size, and it was now possible to have segments up to 16 MiB (a tremendous amount of memory at the time) in size, allowing applications to operate with a de-facto flat memory model.
Correction: Wow, I went wrong here. The 80286's protected mode allowed segments to reside in a 24-bit address space, but they were limited to 16 bits (64 KiB) in length. The 80386 changed the rules to allow larger segment sizes by using additional fields in the LDT and GDT descriptors to extend the base and limit, and to add a size modifier.
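To make the corrected numbers concrete, here is a small sketch of the arithmetic. The function names are my own and purely illustrative; the field widths assumed are the standard descriptor layouts (16-bit limit and 24-bit base on the 286, 20-bit limit plus a 4 KiB granularity flag and 32-bit base on the 386).

```python
# Segment-size arithmetic for 80286 vs. 80386 protected-mode descriptors.
# Field widths are the standard descriptor layouts; helper names are illustrative.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def max_segment_bytes(limit_bits, page_granular=False):
    """Largest segment a descriptor with a limit field of `limit_bits` can describe.

    With page granularity (the 386 "size modifier"), the limit counts
    4 KiB units instead of single bytes.
    """
    units = 1 << limit_bits
    return units * (4 * KIB if page_granular else 1)


def address_space_bytes(base_bits):
    """Size of the address space reachable through a base field of `base_bits`."""
    return 1 << base_bits


if __name__ == "__main__":
    # 80286: 16-bit limit, 24-bit base
    print("286 segment:", max_segment_bytes(16) // KIB, "KiB")
    print("286 address space:", address_space_bytes(24) // MIB, "MiB")
    # 80386: 20-bit limit with the granularity flag set, 32-bit base
    print("386 segment:", max_segment_bytes(20, page_granular=True) // GIB, "GiB")
    print("386 address space:", address_space_bytes(32) // GIB, "GiB")
```

The 16 MiB figure therefore describes where a 286 segment can live, not how big it can be.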
Additionally, Xenix was one of the most polished and featureful UNIX systems of its time. The system was the originator of virtual terminals and, out of the box, supported both UUCP networking and RS-232 serial based MicNet, and bridging between the two. MicNet appears to have been Microsoft's answer to AppleTalk as a very low cost networking solution, and allowed multiple systems to appear as one single UUCP node on the bang path. We'll explore both these features in later articles.
For software installation, Xenix's "custom" utility provided full-featured package management, installation and removal, and even allowed per-file selection, relatively on par with modern Linux package management. Besides the stock operating system, Xenix had official add-on packages for international support, K&R based C compilers for DOS and Xenix, and a text processing system based on AT&T's troff. Third party solutions provided STREAMS and TCP/IP support before these features were added in Xenix 2.3.
System administration utilities for the most part were interactive, and easy to use, allowing for quick and easy setup of networking, printers, and user administration, and the system could dual boot with DOS. Combined with the visual shell, it's likely one of the best experiences you could get on a UNIX system of the era, and in many ways, still holds up today, nearly 30 years later. Microsoft was pushing Xenix heavily, and for a time, it was intended as the true replacement to the 16-bit DOS. However, fate intervened.
In 1984, following the break-up of Bell System (https://en.wikipedia.org/wiki/Breakup_of_the_Bell_System) into the baby bells, AT&T decided to enter the computer market and directly sell UNIX System V. Microsoft decided that they didn’t want to compete against AT&T, and began to collaborate with IBM to create what would become known as OS/2. In 1987, Microsoft transferred ownership of Xenix to the Santa Cruz Operation, and SCO began porting the operating system to take advantage of the 80386 and creating Xenix 386.
The most commonly known release of Xenix 386 is 2.3, supported alongside the earlier Xenix 286 2.2 releases; SCO’s Support Level Supplements covered both at the same time. The SLS index only shows a single update for “Xenix 386 2.2.1-2.2.3” for UUCP, but an examination of that update shows that this appears to be a mislabeling, as the binaries it contained target the 286.
So, what exactly is this unusual 2.2.3c release then? To find that out, I needed to get the thing running.
Stumbling Towards Boot
The images floating around on the internet come in two forms: a set of TeleDisk TD0 images, and a group of raw 720 kilobyte images, suitable for use in a VM (or with dd). Much later in our recovery effort, we eventually determined that the TD0 images were the originals, and the raw images were later created from these.
Initially though, I just wanted to get it to start. The image files contained six operating system (known as N1-6) disks, “Basic Utilities” (B1-2) disks, “Extended Utilities” (X1-5) disks, three International Supplement disks, and a single games disk. An initial examination of the disks showed that N1 and N2 had a Xenix filesystem, and the rest were simply raw tar archives that I could extract with GNU tar (with some warnings). The vast majority of data looked intact, so I grabbed QEMU, and popped N1 in and booted it up.
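For readers who want to poke at similar media, the "raw tar archive on a floppy" idea can be sketched with Python's standard tarfile module. This is not the tooling I actually used (that was GNU tar); the helper names and the stand-in image built here are hypothetical, and archives from real period media may need a more forgiving reader than this.

```python
import io
import tarfile


def list_raw_tar_image(image_bytes):
    """Return member names from a disk image that is just a tar archive
    written straight onto the sectors, with no filesystem around it."""
    with tarfile.open(fileobj=io.BytesIO(image_bytes), mode="r:") as archive:
        return archive.getnames()


def build_demo_image(size=720 * 1024):
    """Build a stand-in 720 KiB image: a tiny tar archive padded with zeros,
    mimicking an archive that does not fill the whole diskette."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        payload = b"hello from a fake diskette\n"
        info = tarfile.TarInfo(name="demo/hello.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue().ljust(size, b"\0")


if __name__ == "__main__":
    image = build_demo_image()
    print(len(image), list_raw_tar_image(image))
```

The reader stops at the end-of-archive marker, so the zero padding out to the full diskette size is ignored.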
Unfortunately, the system would hang almost immediately after. Some testing revealed that the same issue existed on Bochs. PCjs got a bit further, but the kernel panicked nearly immediately. Somewhat surprisingly, though, VirtualBox not only booted, it got to the first step of the installer.
Some time later, I did discover the failure here, but I’ll save that story for another article :). */evil*
With the first hurdle passed, it wasn’t long before another problem reared its ugly head (more later). Unfortunately, shortly after that, the system would hang trying to partition /dev/hd0.
Some trial and error showed that if I started the system up without any IDE drives, I could successfully get through to the partitioning screen. As I knew Michal had gotten farther in his resurrection attempt, I dropped him an email, began to dig into both the boot hang and the IDE driver, and got a debugging build of VirtualBox set up. As we exchanged emails, I learned Michal had not only found the IDE issue, he had also managed to extract a full set of debugging symbols and offsets, and had some tips on using the VirtualBox debugger.
I’ll let him explain in his own words:
Hi Michael,
Here’s my analysis. The wd1010 driver in this version of Xenix is just plain wrong, and they were just lucky that it worked.
The problem is unquestionably with the INITIALIZE PARAMETERS command. The command is automatically executed by the _wdio routine if it finds that it hasn’t been done yet. All the code is in _wdio. It writes all the registers except for the command register. Then it potentially executes a loop which writes the command register and immediately reads the status register. If the error bit is set, the command is written again and the loop repeats until the error register is not set.
What happens in VirtualBox is that reading the status register clears the interrupt triggered by INITIALIZE PARAMETERS. That is the correct behavior, because reading the status register is *supposed* to clear the interrupt. Now at this point the CPU runs with interrupts enabled, but the disk interrupt is masked because the driver executed _spl5 further up the call stack in _wdstrategy. The interrupt is cleared from the device and from the controller, and the OS never receives it.
But the OS relies on the interrupt. It’s supposed to execute _wdintr, notice that INITIALIZE PARAMETERS was executed, set up a RECALIBRATE command into _wdjob and call _wdio again to continue with I/O. Once the interrupt from RECALIBRATE is processed, _wdjob is set up with a read or a write command, _wdio is executed, and the actual I/O happens.
Because the interrupt is cleared too soon, the state machine breaks down and the OS just sits there totally idle because it has nothing to do.
It appears that in old drives, INITIALIZE PARAMETERS [took] some non-negligible time to execute and reading the status register right after writing the command did not clear the interrupt because the command hadn’t yet set it. But then it is wrong to read the status register because if the command is going to fail, it’s probably going to take some time to fail, too.
This would be solved by making INITIALIZE PARAMETERS take a millisecond or two to complete. It is probably much easier to patch Xenix to do what it should have been doing all along, i.e. reading the alternate status register (3F6h instead of 1F7h) which does not clear interrupts.
A 30-Year Old Bug
For those less versed in ATA/IDE interfaces, let me translate this into more basic English. On x86 compatible machines, access to the hard drive is controlled via a dedicated hard-drive controller and managed via the port I/O interface on the processor (using in/out opcodes). ATA commands are written to these registers. In this case, Xenix is sending the INITIALIZE PARAMETERS command which brings the drive out of reset, and sets up the addressing mode.
The designers of the ATA specification designed it in such a way that I/O operations can be asynchronous; the CPU sends a command, and then goes to do something else. When the hard drive is ready for more, it raises an interrupt, telling the processor to send another command. This interrupt is cleared by reading from the primary status register at 0x1F7. This behavior is by design and has been a part of the ATA specification since day one. In some cases however, one may simply want to poll the drive to know its status without changing interrupt statuses. For this purpose, an alternate status register at 0x3F6 is provided.
Xenix uses lazy initialization; that is to say that a device isn’t initialized until it’s used. The wd driver is never executed until something accesses /dev/hd0, which is why it hangs at partitioning and not during IPL. When fdisk starts, the wd driver attempts to initialize the drive, and immediately reads the status register to check for any possible error codes. Afterwards, it waits for the IDE controller to generate an interrupt letting it know the drive is ready. By reading the status register first, Xenix clears the interrupt it would have received from the INITIALIZE PARAMETERS command, and ends up waiting forever for an interrupt that never arrives. As such, the hang is caused by a legitimate bug in Xenix's IDE implementation and can occur on real hardware.
It’s hard to say if this was actually a problem in 1987; however, older releases of Xenix were known to be incredibly picky about the hardware they would work on, and prevailing logic on USENET was that older releases of Xenix would flat out break on any processor faster than 50 MHz, partially due to bugs like this. However, Xenix 2.3 (which was released not long after this version) rewrote the wd driver to not suffer from this race condition, so it likely was as much a problem then as it is now. As Michal noted, it's possible to read the status register without clearing the interrupt, and get the behavior Xenix wants. One quick hex edit later, and I now get this.
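The race is easier to see in a toy model than in prose. The sketch below is not the Xenix wd driver or real ATA emulation; `ToyDrive` and `driver_sees_interrupt` are names I made up, and the model collapses the interrupt masking described in Michal's email into a single "pending" flag that the driver checks only after its status read.

```python
class ToyDrive:
    """Toy drive: one pending-interrupt flag and two ways to read status."""

    def __init__(self, command_delay_ticks):
        # 0 models an emulator that completes the command instantly;
        # a larger value models an old drive that takes a while.
        self.command_delay_ticks = command_delay_ticks
        self.ticks_until_done = None  # None means no command in flight
        self.irq_pending = False

    def write_command(self):
        self.ticks_until_done = self.command_delay_ticks
        self._maybe_complete()

    def tick(self):
        if self.ticks_until_done is not None and self.ticks_until_done > 0:
            self.ticks_until_done -= 1
        self._maybe_complete()

    def _maybe_complete(self):
        if self.ticks_until_done == 0:
            self.irq_pending = True
            self.ticks_until_done = None

    def read_status(self):
        """Primary status register (0x1F7): reading it acknowledges the interrupt."""
        self.irq_pending = False
        return 0

    def read_alt_status(self):
        """Alternate status register (0x3F6): same value, no side effect."""
        return 0


def driver_sees_interrupt(drive, use_alt_status, wait_ticks=10):
    """Issue a command, peek at status right away, then wait for the interrupt."""
    drive.write_command()
    if use_alt_status:
        drive.read_alt_status()
    else:
        drive.read_status()
    for _ in range(wait_ticks):
        if drive.irq_pending:
            return True
        drive.tick()
    return drive.irq_pending


if __name__ == "__main__":
    print("instant drive, primary status:", driver_sees_interrupt(ToyDrive(0), False))
    print("instant drive, alternate status:", driver_sees_interrupt(ToyDrive(0), True))
    print("slow drive, primary status:", driver_sees_interrupt(ToyDrive(3), False))
```

Only the first case loses the interrupt: the command finishes before the status read, so the read acknowledges it and the driver waits for something that has already come and gone.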
Success! Due to the fact that it uses CHS (Cylinder, Head, and Sector) addressing and bypasses the BIOS, Xenix tops out at a maximum drive size of 504 MiB. After a few basic questions, I’m prompted to remove N2, and reboot.
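As an aside, the 504 MiB figure falls out of simple arithmetic if you assume the classic PC geometry ceiling of 1024 cylinders, 16 heads, and 63 sectors per track with 512-byte sectors. Those limits are my assumption about where the number comes from, not something read out of the Xenix driver.

```python
SECTOR_BYTES = 512

# Assumed classic CHS ceiling; not taken from the Xenix sources.
MAX_CYLINDERS = 1024
MAX_HEADS = 16
MAX_SECTORS_PER_TRACK = 63


def chs_capacity_bytes(cylinders, heads, sectors_per_track, sector_bytes=SECTOR_BYTES):
    """Total capacity addressable with the given CHS geometry."""
    return cylinders * heads * sectors_per_track * sector_bytes


if __name__ == "__main__":
    capacity = chs_capacity_bytes(MAX_CYLINDERS, MAX_HEADS, MAX_SECTORS_PER_TRACK)
    print(capacity, "bytes =", capacity // (1024 * 1024), "MiB")  # 528482304 bytes = 504 MiB
```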
N1 goes back in as per the instructions, I cross my fingers, push Enter and …
It hangs. Crud.
In our next installment, we'll go into trying to manually start the operating system when the only commands we have are tar, mount, dd, and sh, along with the Xenix manifest files, and thereby crash head first into Xenix's copy protection.
(Score: 1, Interesting) by Anonymous Coward on Monday March 06 2017, @06:16PM (3 children)
Once upon a time, I took a class (actually 2, I think one was for basic and the other advanced) to become a certified UNIX SA. When the "Windows vs every other OS" wars started to heat up a little later, I always thought it was really ironic that I had a certificate with Bill Gate's signature on it (not handwritten but still...) for anything related to UNIX.
Wish I could find that thing to flesh out the details (messy divorce a long time ago meant I lost a lot of stuff like that.) I think SCO (like you said the "real SCO", not the lawyer team) were the ones who taught it.
(Score: 4, Informative) by NCommander on Monday March 06 2017, @06:26PM
It depends on the era, and the specific version of Xenix. Xenix in and of itself was essentially a Microsoft product, and then licensed to various companies to port to their machines. For example, DOS itself was a Microsoft product, but the source code was licensed to IBM to create PC-DOS. SCO did the vast majority of Xenix ports, but there was also an IBM version of Xenix.
As far as I can find, there's no specific list of copyrights listed in the binaries, and I don't have documentation for this specific version, so it's hard to tell if this version of Xenix is after Microsoft sold it or not.
Still always moving
(Score: -1, Offtopic) by Anonymous Coward on Monday March 06 2017, @06:37PM (1 child)
Behold the results of people writing (and then saying):
Bill Gates' signature
rather than:
Bill Gates's signature
(Score: 2) by TheRaven on Tuesday March 07 2017, @09:21AM
sudo mod me up
(Score: 5, Interesting) by Azuma Hazuki on Monday March 06 2017, @06:32PM (27 children)
This is so cool! THIS is what we need more of here.
I am "that girl" your mother warned you about...
(Score: 0, Disagree) by Anonymous Coward on Monday March 06 2017, @06:39PM (15 children)
Not only does this seem like a massive waste of time, but your comment is in no way deserving of the "Interesting" modifier.
(Score: 5, Insightful) by Unixnut on Monday March 06 2017, @06:58PM (11 children)
> Not only does this seem like a massive waste of time, but your comment is in no way deserving of the "Interesting" modifier.
In what way is this a waste of time? This is computing history. No different to people who try to preserve ancient artifacts, literature, or (more recently) classic cars for example. Just because the items in this case are virtual, doesn't mean it doesn't hold value to preserve them.
Ok, so I doubt a copy of Xenix will ever result in a multi-million pound sale at Christie's auction houses, but that doesn't mean it is a waste. We are building a history of computing here, not unlike ESR's "The Jargon File" which is held online. It is nice for newer generations to see where all the systems they use day to day originated from, especially as I would not be surprised to find that computing will be part of humanity for the foreseeable future.
I knew of Xenix, I remember seeing the disks when I was a kid (and no idea what it was at the time), and it is nice to see someone get it up and running. Plus it is cool and nerdy what they did. I mean, fiddling with Hex values to fix a 30+ year old bug, that is pretty cool.
While I personally would not do this (just don't have the time), I love reading about this kind of stuff. Not unlike the folks who will hand assemble CPUs from transistors or ICs, either old architectures (like the PDPs) or their own ISAs. It is hard core geekery, and I second the original poster's point. We could do with more such stuff on Soylent (indeed this is what Slashdot originally had so much of, which first drew me in to the site).
(Score: -1, Troll) by Anonymous Coward on Monday March 06 2017, @07:11PM (8 children)
I disagree, so therefore I'm a troll; it is my opinion that SN does not need more posts like this, so I'm a troll.
Mod me down, despite the fact that my comment contains as much value as the "Interesting" OP.
(Score: 3, Informative) by weeds on Monday March 06 2017, @07:34PM (3 children)
You'll get modded down because you are a whiney AC complaining about mods instead of talking about the article.
Get money out of politics! [mayday.us]
(Score: -1, Troll) by Anonymous Coward on Monday March 06 2017, @07:40PM (2 children)
That's what's hilarious. Can you feel the pulsating beat of the Hivemind?
(Score: 2) by weeds on Monday March 06 2017, @07:49PM (1 child)
Now you are just trolling. Insult the readers, insult the modders. bye bye.
Get money out of politics! [mayday.us]
(Score: -1, Troll) by Anonymous Coward on Monday March 06 2017, @07:52PM
It's the new way to say "I don't like your face."
(Score: 3, Insightful) by dry on Tuesday March 07 2017, @02:32AM
If you don't like the article, don't click it. I'm not a gamer and find the gaming articles boring. I simply stay away from them rather than posting that they're a waste of time. If you don't find this interesting, stay away. Meanwhile those of us who are interested can read.
And for those of us who like stuff like this, the OS/2 Museum is an excellent site, covering much more than OS/2 though the OS/2 coverage is good as well.
(Score: 2) by NotSanguine on Tuesday March 07 2017, @08:36AM (2 children)
Don't like the topic, don't read about it. That was easy, no?
Is there something you'd like to see here?
Go to the submissions page [soylentnews.org] and submit something you'd like to see and discuss.
Let's talk about what you want to talk about, friend.
Have a wonderful day!
No, no, you're not thinking; you're just being logical. --Niels Bohr
(Score: 3, Insightful) by Joe Desertrat on Tuesday March 07 2017, @10:22AM (1 child)
It's probably the same guy who complains this is supposed to be a tech site when a non-tech submission comes through.
(Score: 2) by NotSanguine on Tuesday March 07 2017, @01:38PM
It's probably the same guy who complains this is supposed to be a tech site when a non-tech submission comes through.
Perhaps you're right. Although this one is more technical than most postings here (and thank goodness, we could use more of these), so perhaps not.
Regardless, it doesn't take much time or energy to encourage folks to submit the stuff they want to see. If this person just needs a kind word to get involved, then it's totally worth it. If he/she really is trolling, nothing much was lost and they are revealed as a poster boy for GIFT [penny-arcade.com].
No, no, you're not thinking; you're just being logical. --Niels Bohr
(Score: 2) by edIII on Monday March 06 2017, @07:24PM
Same here. I didn't even know of Xenix until I got up this morning and started reading SN. This is very cool stuff.
Technically, lunchtime is at any moment. It's just a wave function.
(Score: 1, Funny) by Anonymous Coward on Tuesday March 07 2017, @09:12AM
by Unixnut (5779) Neutral on 2017.03.07 3:58 (#475742)
Username checks out.
:D
(Score: 5, Insightful) by weeds on Monday March 06 2017, @07:32PM (2 children)
At the end of the day, all hobbies are a "waste of time." There isn't any way to justify spending hours playing cards, building airplane models, or reading science fiction. It's a hobby. He likes doing it. Turns out some other people think it is interesting too. You are welcome to your opinion and certainly should voice it. That doesn't make you a troll. If you don't like the mod someone gets, make the huge and complicated leap to come off AC and mod it yourself.
Get money out of politics! [mayday.us]
(Score: 2) by FatPhil on Thursday March 16 2017, @08:40AM (1 child)
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
(Score: 2) by weeds on Friday March 17 2017, @06:24PM
Can't argue with that...
Get money out of politics! [mayday.us]
(Score: 3, Insightful) by NCommander on Monday March 06 2017, @06:44PM (10 children)
Appreciated; this has been an interesting personal project, and probably something right up SN's alley, trolls be damned.
Still always moving
(Score: -1, Offtopic) by Anonymous Coward on Monday March 06 2017, @07:15PM (3 children)
I'm just a phantom of your imagination. I'm a bug in the backend code.
(Score: 1) by khallow on Monday March 06 2017, @07:30PM (2 children)
I guess I'm not part of SN
A persecution complex looking for a persecution.
(Score: 0) by Anonymous Coward on Monday March 06 2017, @07:44PM (1 child)
I don't think I need to look for persecution; it has found me.
(Score: 1) by khallow on Tuesday March 07 2017, @03:59PM
Downmodded and called a Troll. Persecution?
No. It's built into the system to downmod trolls, which you are.
I don't think I need to look for persecution; it has found me.
Persecution you can easily avoid by not posting stupid shit to SoylentNews. Here, downmodding anonymous trolling is completely irrelevant: you could make the "persecution" stop merely by changing to less stupid behavior and there's not even a little consequence to the action (downmodding doesn't affect in the least your ability to post stupid shit on SN).
(Score: 1, Interesting) by Anonymous Coward on Monday March 06 2017, @09:13PM (5 children)
i concur; you've inspired me to consider warming up the Wang workstation I have. i have the original diskettes with wang os 1.1, it can't even run dos 2.1 basic programs... or do a wide list of a directory structure (such as dir/w)
(Score: 2) by NCommander on Monday March 06 2017, @10:19PM (4 children)
I've heard of Wang but I'm not really familiar with it beside the name. It would be nice if we can get any disks imaged. Easiest way to do it is probably code a copy of KERMIT in BASIC and copy it out sector by sector over serial. Not a lot of fun, but I've done that to get data out of CPM before using PIP.
Still always moving
(Score: 3, Funny) by driverless on Tuesday March 07 2017, @03:28AM (3 children)
I've heard of Wang but I'm not really familiar with it beside the name.
The main thing you need to know is that Wang cares. Say it out loud several times a day to remind yourself.
(Score: 2) by NCommander on Tuesday March 07 2017, @03:44AM (2 children)
Where the hell is the +?/-? Groan moderation when you need it ...
Still always moving
(Score: 3, Informative) by driverless on Tuesday March 07 2017, @04:00AM (1 child)
It's actually a real thing [vantage.ie], or at least a very widespread urban legend.
(Score: 0) by Anonymous Coward on Wednesday March 08 2017, @02:55PM
There's an apocryphal story about Wang releasing a computer called "King"; I'm not motivated enough to see if that's more than the usual "funny foreigner" story but it appeals to my juvenile sense of humour.
(Score: 4, Funny) by VLM on Monday March 06 2017, @06:47PM (2 children)
It hangs. Crud.
I think there's a patch to systemd that fixes that
(Score: 2) by NCommander on Monday March 06 2017, @06:52PM (1 child)
I see what you did there. In this case, it was actually due to a corrupted binary. We lucked out in a lot of ways trying to get it back together.
Still always moving
(Score: 2) by nobu_the_bard on Monday March 06 2017, @07:06PM
Aaaah spoilers!! :)
(Score: 2) by SomeGuy on Monday March 06 2017, @06:50PM (8 children)
Awesome stuff! Thanks for trying this one out. I had no idea the SCO Xenix 386 2.2.3 archives out there had such issues. Too often crufty damaged stuff gets archived and nobody ever looks at it, so originals disappear and no one thinks about it.
Also too often the handful of collectors out there that may have originals of such disks are too busy with life to redump or verify such software, or worse yet are afraid to because of copyright reasons.
And as you found out, many archaic bugs can prevent software from running on *exactly* the hardware it was designed for - which becomes more difficult to acquire every day.
It actually does have disk-based copy protection?
The comments about the duplicate sectors on the teledisk images were making me think the disks were physically damaged. Teledisk was designed with the ability to read some forms of floppy copy protection, including duplicate sectors on the same track (applications will re-read the same sector and see that the content changes). But on some occasions it can mis-interpret things if there is damage.
(Score: 3, Interesting) by NCommander on Monday March 06 2017, @07:04PM (7 children)
There is copy protection, but not disk based, or at least, not as far as we can tell. My next steps in the process were attempting to install around the crashing init (at this point I hadn't realized it was corrupted, and assumed we hit another emulation glitch; it had been a theme up until this point). A few of the images are uncorrupted, and on those, we have the expected 9 sectors per track, 1440 sectors, etc. The images are low density (720 KiB) 3.5in floppies.
The installer on the other hand explicitly checks if it's installing from 5.25 medium, and does not directly check for 3.5 medium. Basically it checks if a few files are present on /dev/hd00 on the first reboot, and uses that as a sanity check to know if it needs N1 re-inserted to copy the kernel off its filesystem. This is specifically referred to as 135 and 96ds medium. There's some support for media detection in the Xenix kernel (a new feature from the 286 versions), but it's slightly hit or miss; it only works reliably on /dev/fd0|/dev/rinstall. I had a misfire when trying to use /dev/fd1 in the pre-boot environment which drove me up a wall for a while.
What's utterly bizarre with these teledisk images is that a few of them have *more* than 1440 sectors total, and have the same ID numbers in the teledisk images. i.e., Track 7, Sector 2 would exist twice in the teledisk images. Michal actually tried to write them back to real hardware and succeeded, though obviously didn't get working disks. All the CRCs are present and check out. Notably, none of the other teledisk images that I'm aware of have floppy copy protection, so our best theory is somehow, TeleDisk must have freaked out during imaging. Notably, the corruption happens in *roughly* the same location in the disks that are missing bits, vs. random locations. I was going to write up a fair bit on TeleDisk in the next article.
Still always moving
(Score: 2) by SomeGuy on Monday March 06 2017, @07:37PM (6 children)
I've seen Teledisk flake out in similar ways reading bad disks before.
I'd be interested in hearing what bits you found were missing and what you filled them in with. If this makes disks that are reasonably close to "fixed" I can get the ones on Winworld added to or updated. I'll take a look at the images out there myself when I get a chance.
Ideally if anyone out there still has original media, a redump with a Kryoflux and/or SuperCard Pro would be invaluable. Even when there is damage, these will read every last bit possible rather than throwing out whole sectors. Sometimes just running a flux image through a different decoder will have success, and other times a tad of bit shifting and guessing a few bits can fix things.
Also, were you just working from documentation on the internet? With many earlier programs like this getting proper document scans is essential.
(Score: 3, Interesting) by NCommander on Monday March 06 2017, @09:06PM
Xenix 286/386 is surprisingly user friendly; for basic system operation and usage, I never once had to crack open the manual. Package installation, printing, UUCP, and MicNet are entirely driven by curses-like interfaces, as was the full install program. It actually beats out quite a few modern distros in this regard. As far as user-friendliness and management go, it pretty much destroys DOS and OS/2 of the era, though it lacks online help, which is unfortunate. I don't know how it compares to NetWare of the era.
I'll be showing this off in later posts. I did, however, need to read the vast majority of the Xenix 286 documentation to work out some of the stuff we had to reconstruct, as well as how to do a manual install because of floppy disk issues (next article). As of writing though, we have the installer working again, and the system can be installed and used.
From the TeleDisk images, by comparing where the duplicated sectors were and what we were missing (I wrote a tool to find the missing sectors in the raw images and convert the TD mapping to linear), we managed to put together I1, I2, N3, B1, and one of the X disks. The X disks are really unusual, and I don't want to spoil it, but we got really lucky in this regard; same with N2's broken init binary. We also had to explore Xenix's copy protection to restore the media to "functional reconstruction" in places, as one of the binaries was problematic in this regard. I'll discuss this more in the next article. I reconstructed N1's broken manifest by hand, as well as a broken install script in I3. The International Supplement is extremely rare; neither of us could find another Xenix release with it, nor could Tenox (who has scans of all the documentation).
Worse, the missing sector was at the end of the script and cut off mid-question, so I couldn't even figure out what it was asking. I actually had to go several releases into the future, to SCO OpenWare's documentation, to get a reasonable guess at what it was doing and compare it to the remaining code fragments. I think the only thing we didn't recover is a single sector on N5 which is part of adb. That might still be recoverable, but I haven't tried yet.
Still always moving
(Score: 2) by NCommander on Tuesday March 07 2017, @08:08AM (4 children)
As a follow-up to this, can you post your experiences with TeleDisk? It might help Michal and me figure out some of what went wrong here.
Still always moving
(Score: 2) by SomeGuy on Tuesday March 07 2017, @03:58PM (3 children)
I'd suggest you ask about this over on the Vintage Computing forum: http://www.vcfed.org/forum/ [vcfed.org]; there are lots of people with experience with this over there. The original author, Chuck Guzis (Chuck(G), AKA Sydex), posts over there regularly, although he doesn't like to answer questions about it. If I recall correctly, he sold off the software to someone else and then sort of disowned it, for various other reasons too.
Anyway, I believe it operates by using FDC "diagnostic" reads (the same way many of the more complicated copy protection schemes read secret data), which permits viewing the raw MFM or FM decoded data as the FDC sees it. It then analyzes this data manually, rather than letting the FDC do the translation. The upshot is this permits faster analysis of sector layout (you don't have to manually probe for sectors 0-255 in all possible sizes or guess what the interleave is) and can read in some copy protection formats.
The catch though is that a standard FDC can not WRITE back data in the same way as the diagnostic read so you have to go through its standard sector formatting, although a few tricks can be thrown in to duplicate some of the less complicated protections.
The issue you are seeing stems from the fact that A: Tracks on a disk are infinite loops, and B: Teledisk is designed to find and store sectors with duplicate IDs on the same tracks as used by some copy protection. When there is damage, Teledisk (or really your FDC) may read more than one revolution, but since there is missing data, it thinks it is all on one track.
Most people use Dave Dunfield's ImageDisk these days for archiving using a PC and native FDC, it is a similar program but does not support quite all of the copy protection tricks that TeleDisk did, as ImageDisk is designed mainly for archiving non-DOS (Such as TRS-80, TI-99/4A, or CP/M-80) disks on an IBM PC.
However, all of this pales in comparison to the use of the SuperCard Pro, Kryoflux, or even the older Central Point TransCopy/Deluxe Option Board. These read and write at the flux level, and therefore can read and write ANYTHING, including "damage". Since lobotomized "modern" machines lack a real floppy disk controller these days, the SuperCard Pro and Kryoflux are the only real way forward for working with such disks.
(Score: 2) by NCommander on Tuesday March 07 2017, @04:52PM (2 children)
I passed that information along to Michal, thanks. W/ permission, I'll likely quote this in a future article when I go into the TeleDisk format. I have the next article written up, and it ran long so I didn't get into the data recovery part of it yet. Most of the next article deals with manual installation and seeing just how badly the system was damaged.
LGR reviews covered the Kryoflux, and I have got to say, I really want one, even though I don't have any floppies any more. It's an extremely cool product and I'm glad it exists to make sure things like this don't get lost to the mists of time.
Still always moving
(Score: 2) by SomeGuy on Tuesday March 07 2017, @06:05PM (1 child)
Sure. Also, there is a writeup comparing different disk archivers that you might find interesting here: https://winworldpc.com/winboards/viewtopic.php?t=7877 [winworldpc.com]
Well, then get some! :)
Really, we need more people archiving this sort of stuff. I can't buy and archive every floppy that comes up on eBay myself.
Also, if you don't mind saying, where did you find the Teledisk images? Looking around off hand all I see are the IMG versions.
(Score: 2) by NCommander on Tuesday March 07 2017, @08:25PM
They can be located with Google searching; I'd rather not post links publicly, but OPS1.TD0 was the filename. I have the SHA hashes somewhere.
I've lived out of a backpack for more than a few years and got rid of almost everything I used to own, and I try not to keep too many things, so at least at the moment I'm not in much of a position to buy old things off eBay.
Still always moving
(Score: 0) by Anonymous Coward on Monday March 06 2017, @06:54PM
Of course we all know the SCO Group of scumbag lawyers, but the Real SCO is under the clutches of Oracle [infogalactic.com]
(Score: 1, Interesting) by Anonymous Coward on Monday March 06 2017, @06:55PM (2 children)
Wow.
Is this the first post with images in it?
I've never seen this on SoylentNews before.
(Score: 2) by NCommander on Monday March 06 2017, @07:07PM (1 child)
I've posted them on other original content before. As I noted, I couldn't use a serial terminal during installation so if I was going to take any representations, it was either retype screens by hand, or post images. Posting images won.
We don't regularly use the feature, and it was hosed for a while in rehash.
(I have since writing this actually figured out how to get the system to use serial consoles, but I don't think it would work during installation, and I don't have a set of termcap files that let it work properly with xterm windows; it *really* wants a legitimate vt100 terminal, and I get weird escapes and such with screen and minicom on Linux, and PuTTY on Windows)
Still always moving
(Score: 0) by Anonymous Coward on Wednesday March 08 2017, @03:38AM
For future reference: use the full-system debugger to dump 0x800 bytes from physical address 0xb8000 and then discard every other byte. This should get you text from CGA adapters. It normally works for VGA too, though slightly changing the address is possible.
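The "discard every other byte" step works because text-mode video memory stores each cell as a character byte followed by an attribute byte. A minimal Python sketch of the decoding, assuming the dump has already been saved as raw bytes (the function name `cga_text` is made up for this example, and the length is left to the caller; a full 80x25 page is 4000 bytes):

```python
def cga_text(dump, columns=80):
    """Turn a raw text-mode dump (char, attr, char, attr, ...) into screen lines."""
    chars = dump[0::2]              # even offsets hold the character codes
    text = chars.decode("cp437")    # the PC's built-in character set
    # Slice into rows and drop the trailing padding on each one.
    return [text[i:i + columns].rstrip() for i in range(0, len(text), columns)]
```

Usage would be something like `print("\n".join(cga_text(open("dump.bin", "rb").read())))`, where `dump.bin` is a hypothetical file written out by the debugger.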
(Score: 0) by Anonymous Coward on Monday March 06 2017, @07:00PM (7 children)
Intel wrote an RTOS called RMX. It was probably running on the 80286 before Xenix, seeing as it came from Intel. You can still buy a copy. It runs the London Underground real-time train control. Lots of binaries are still 16-bit. The kernel obviously switched first; perhaps this is the case with Xenix.
https://en.wikipedia.org/wiki/RMX_(operating_system) [wikipedia.org]
One can run 16-bit binaries on 32-bit hardware. You can run more of them, or have more filesystem cache, if the kernel is aware of the 32-bit hardware. The first enhancement would likely be to set the granularity bit in the segment descriptors for code and data. This allows segments above the old 16-megabyte limit. After that you can allow larger segment limits, allow 32-bit code, and/or enable the page tables.
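For readers who haven't stared at descriptor tables lately, here is a small Python sketch of how an 8-byte 80386 segment descriptor decodes, including the effect of the granularity bit on the limit. The function name `decode_descriptor` is an assumption for illustration; the field layout is the standard 386 one, in which the final two bytes (reserved on the 286) carry the upper limit bits, the flags, and base bits 31:24.

```python
import struct


def decode_descriptor(raw):
    """Decode base, effective limit and flags from an 8-byte 80386 descriptor."""
    limit_lo, base_lo, base_mid, access, flags_limit, base_hi = struct.unpack(
        "<HHBBBB", raw
    )
    base = base_lo | (base_mid << 16) | (base_hi << 24)
    limit = limit_lo | ((flags_limit & 0x0F) << 16)   # 20-bit raw limit
    granular = bool(flags_limit & 0x80)               # G bit: 4 KiB units
    big = bool(flags_limit & 0x40)                    # D/B bit: 32-bit default
    if granular:
        limit = (limit << 12) | 0xFFF                 # scale to bytes
    return {"base": base, "limit": limit, "granular": granular,
            "big": big, "access": access}
```

A 286-style descriptor with the last word zeroed decodes to a byte-granular segment of at most 64 KiB with a 24-bit base, while setting the granularity bit with a raw limit of 0xFFFFF yields the familiar 4 GiB flat segment.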
(Score: 2) by NCommander on Monday March 06 2017, @07:15PM (6 children)
16-bit is a bit complicated in this situation since, in the case of Xenix, we're dealing with the 286's version of protected mode, and not the 386's version of it. Xenix 386 2.2.3 *does* set up paging (CR0 bit 31, PG, is set and a page table is installed; confirmed by debugger), but as far as I can tell, it's unused. It also does some checks for known 80386 B0 errata such as the IMUL stepping bug.
Real mode can essentially be considered a 20-bit architecture because an address is a 16-bit segment, shifted left by four bits, plus a 16-bit offset, for 20 bits total, which maps directly to a physical memory address. 286 protected mode, on the other hand, is a 24-bit architecture (16 MiB), due to the limitations of the GDT and LDT; this is broadly similar to the m68k situation on Macs, where the processor had 32-bit registers but only 24 address lines, which created an ugly world where you could have 32-bit incompatibilities if software did stupid crap like tagged pointers.
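The real-mode arithmetic is simple enough to sketch in a few lines of Python. The function name `real_mode_address` and the `a20_enabled` switch are illustrative assumptions; the switch models whether address line 20 is allowed through or whether addresses wrap at 1 MiB as they do on an 8086.

```python
def real_mode_address(segment, offset, a20_enabled=False):
    """Compute the physical address for a real-mode segment:offset pair."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    # With A20 masked, bit 20 is dropped and the address wraps to low memory.
    return linear if a20_enabled else linear & 0xFFFFF
```

For example, 0040:0000 lands at 0x400 and B800:0000 at 0xB8000, while FFFF:0010 either wraps to 0 or reaches 0x100000 depending on A20.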
It *is* actually possible to run real-mode code unmodified in protected mode, or at least under segmented protected mode, as long as the real-mode binary follows specific rules. There's an entire section in the iAPX 286 Programmer's Reference discussing running legacy real-mode code in protected mode; OS/2 could do the same thing: 16-bit code running in protected mode. It's not, however, possible to use DOS or BIOS calls under this. As such, Xenix 386 maintained backwards compatibility with its PC/XT, AT, and 286 versions.
Still always moving
(Score: 0) by Anonymous Coward on Tuesday March 07 2017, @06:33AM (5 children)
There are reasons to set up paging, even if it mostly isn't used.
For example, RMX for the 80386 sets up paging, but the paging only gets used if you run flat-model 32-bit executables. If you just run segmented executables (either 16-bit or 32-bit) the paging doesn't really get used.
Xenix could be similar: you happen to only have 16-bit segmented executables, but the capability exists to run other types.
Paging can also be used to extend an OS past 16 megabytes of memory without fixing code that would break if the granularity bit were set in the segment descriptors. The OS might internally depend on being able to position segments such that they are not 4096-byte aligned. In fact, this is probable. Many OSes of this era do something truly horrifying: they call into normal 16-bit BIOS functions from 16-bit protected mode. This was sort of workable, given that there were few BIOS vendors (just IBM and Compaq) and one could disassemble the BIOS to see if it would be safe to call. Access to the BIOS data area at 0040:0000 would involve segment 0x40 with a base of 0x400. Use of the NULL segment didn't happen much in those BIOSes, generally happening before entry to protected mode. Use of A20 wrap-around could be handled via faults, via paging (80386 and better), or via actually messing with A20.
(Score: 2) by NCommander on Tuesday March 07 2017, @06:53AM (4 children)
Xenix actually is one of the few operating systems that ran into a forward compatibility issue; Xenix 286 will not run on a 386 due to the fact that it loads bad values into the last two words of the LDT which are normally unused in that processor. That causes bad juju on the 386 which uses that for the increased segment size parameters.
There are a few specific 80386 binaries on the disks, but none appear to use paging capabilities as far as I can tell. I haven't done an in-depth analysis to see if it does actually page those binaries or simply takes advantage of the larger segment size. I don't have a set of development tools that match this version of Xenix, but the Xenix 2.3 386 tools do work on this release, so I could probably try compiling a paged binary and see if it works.
Xenix (several versions) also overrides the low memory segment on startup at 0x40, blowing away the EBDA while it's at it. This is a hint as to why it refuses to boot on specific emulators and on some machines depending on the BIOS. I haven't checked if it does BIOS calls to gather data before switching to protected mode, or if it does some sort of segment magic to do them once leaving real mode. Notably, at least this version of Xenix does NOT set up a V86 task in the TSS to talk to the BIOS (2.3 might).
The iAPX 286 programmer's manual actually has an entire section talking about running real mode code within protected mode and how it can safely be done. I wouldn't be surprised if BIOSes of the era were somewhat designed with this use case in mind, even though it was never formally standardized.
Still always moving
(Score: 0) by Anonymous Coward on Tuesday March 07 2017, @08:33AM (3 children)
By "horrible", I mean it. OSes of that era would call the BIOS without setting the V86 bit. They'd just... load some segment and other registers then do a far call, or they'd do a software int, or they'd do an iret. There might even be data segments that are NOT set up as in real mode, and the BIOS is expected to work with them. You might wonder how this interacts with DMA, for example if you ask the BIOS to do floppy IO while the segments are not real-mode compatible. One solution is to make BIOS calls in ring 3 and virtualize the DMA controller by catching the "in" and "out" instructions as faults.
Free emulators have gone through a BIOS change. Most now use seabios, which forked off of the old Bochs BIOS to convert the code to be compatible with gcc and gas. The old Bochs BIOS used bcc and as86, same as the old Linux boot sector. Seabios is a modern PCI BIOS that requires 32-bit instruction support. Seabios requires PCI video, requires that 0xc0000 to 0xfffff be RAM (instead of ROM or ISA MMIO), and seems to have limited support for being called from 16-bit protected mode. The old Bochs BIOS has none of those problems and fits in less space. It can support ISA MMIO for stuff like multi-port serial, SCSI, and telephone voice interfaces. This is not to say that the old Bochs BIOS is problem-free. It does not maintain the "alt" key modifier bit in the EBDA. Neither free BIOS supports the Alt-SysRq hooks for OS switching.
BTW, looking to hire people who can understand this kind of stuff
(Score: 1) by kvaltyr on Tuesday March 07 2017, @02:35PM (1 child)
Any further info? Obviously not NCommander here and I'm not super familiar with the specific vagaries of DOS / early x86 processors but I am a competent reverse engineer and have experience with 32/64 bit x86 assembly and some embedded stuff. Would be happy doing anything that isn't web development.
(Score: 0) by Anonymous Coward on Tuesday March 07 2017, @07:24PM
That experience is about perfect. The work varies quite a bit, involving many kinds of CPU and OS from the ancient to the very latest, but is nearly always low-level. We do lots of reverse engineering. We write emulators. We use assembly code. It's USA only, no H1B or greencard. You get extreme flex-time without any expectation of overtime. Skill requirements are kind of vague because the work varies; we hire enough people to cover everything.
So, hmmm, I have a gmail account that I guess is already exposed to spammers. I'm acahalan. Some sort of resume would be good, but we can chat about things first if you like; many people leave valuable low-level things off of their resumes because the web shops don't appreciate it properly.
(Score: 2) by NCommander on Tuesday March 07 2017, @03:56PM
Oh, I'm aware of the type of abuse old OSes did. I'm just saying that Xenix 386, being designed for the 386, could have used Virtual 8086 mode to talk to the BIOS if it wanted to. It didn't. VirtualBox uses OpenWatcom to build its BIOS (which is a fork of SeaBios with some duct tape to let it work with some older OSes; I'd probably use wcc16 over bcc just because I get support for slightly more modern C standards).
I'd be interested in hearing what kind of work you're offering; my email is listed publicly, just put a subject that makes sure I don't think it's spam (put Xenix or something). I worked professionally on TianoCore on AArch64, and I consider myself at least semi-proficient in real mode x86, though I never worked with it on a professional basis.
Notably, now that I'm actually awake, you could theoretically have made a BIOS fully compliant with the protected mode of the era, either by requiring the base operating system to set aside specific GDT selector(s) for it, or by making your entry points only work off CS and fit with near pointers. As a last resort, the BIOS could always have saved GDTR (SGDT), changed the protected mode table via LGDT, and restored it on the way out.
For modern emulation/virtualization, you could always handle some bits of "magic" by simply catching the calls to/from the BIOS and, if in protected mode, doing magic to make sure the OS gets what it wants.
Still always moving
(Score: 1) by lwr on Monday March 06 2017, @07:47PM (3 children)
One of my first jobs in the mid 1990's was porting a Xenix accounting package to a Windows based replacement. Thank you for the trip down memory lane! It's good to see people with the same interest in this type of history!
(Score: 2) by NCommander on Monday March 06 2017, @08:00PM (2 children)
Let me guess, it was a FoxPro application? :)
Seems Xenix was hugely popular with those. I'm writing up part two now and will likely post it later this week.
Still always moving
(Score: 1) by lwr on Tuesday March 07 2017, @02:24AM (1 child)
How did you guess?
Can't wait to read it!
Keep up the great work!
(Score: 2) by NCommander on Tuesday March 07 2017, @02:46AM
The original FoxPro was available for DOS, Win 3.1, Mac, and UNIX, and UNIX in this case meant either Xenix or SCO UNIX, depending on vintage. Of those platforms, Xenix is the only one you'd really want to run a multiuser server application on due to preemptive multitasking, especially since in that era, getting Macs to talk to non-Macs required a significant amount of teeth pulling, AAUI adapters, MacTCP, or AppleTalk for Windows.
A lot of people both then and now never consider the value of having the source code to their line of business applications, so Xenix 2.3.4 is a popular virtualization target for people that can't afford to have them rebuilt. Microsoft Word and ExSCO UnixWare retained Xenix compatibility for years until they finally got sick of paying royalties to Microsoft and dropped it. Some people have managed to get those antiques working via iBCS on Linux, NetBSD and FreeBSD, though support is kinda hit or miss at best.
Still always moving
(Score: 3, Interesting) by NotSanguine on Tuesday March 07 2017, @09:02AM (1 child)
Xenix? That sure takes me back. And the discussions of Real vs. Protected modes are pure gold!
Xenix was also great for crunching numbers with large datasets. It allowed us to ditch PDP/11s and (both timeshared and bought used) Vaxen for much lower cost hardware.
Although there are times when I miss ED/EDT [sourceforge.net] (not! But if you do, you can get it here [sourceforge.net]).
Those days were quite heady, with lots of hardware-specific knowledge required to do many things on the x86.
This is a great project, I look forward to hearing more about getting Xenix up and running, and the issues you had to resolve to do so.
Out of curiosity, since you were running on a VM, was the VM Debugger [virtualbox.org] up to the task, or did you wish you had a software ICE [wikipedia.org]?
No, no, you're not thinking; you're just being logical. --Niels Bohr
(Score: 2) by NCommander on Tuesday March 07 2017, @04:02PM
VirtualBox's debugger was all around excellent, and it was perfect since I could take the symbol table we had for the Xenix kernel and load it (after converting it to a format that VB could understand), though it took me a good while to get used to it. It's the only modern debugger that I'm aware of that actually knows what to do w/ itself when dealing with multiple protected segments, though I had to do a full debugging build of VirtualBox as I needed to add and remove various log statements to figure out what sorta insanity it was doing at a given moment.
Interestingly, despite using segmentation and multiple LDTs, it doesn't use the TSS features of the x86. I suppose that makes sense since they probably kept the basic UNIX scheduler instead of going through the trouble of rewriting it to work with the HW TSS.
Still always moving
(Score: 2) by fishybell on Wednesday March 08 2017, @05:02PM (1 child)
Reading this article after the second one was very interesting. I don't remember it coming up the first time, but this is fantastic.
(Score: 2) by NCommander on Sunday March 12 2017, @02:09AM
Next article goes live on Monday with an interesting plot twist.
Still always moving