SoylentNews is people

posted by NCommander on Wednesday March 08 2017, @04:15PM
from the dd-works-as-a-poor-mans-cp dept.

When we last left off, with the help of the excellent Michal Necasek of the OS/2 Museum, we had gotten the damaged Xenix 2.2.3c past the first hurdle of installation, and directly into a post-reboot crash, the cause of which (at the time) I suspected was another emulation failure.


Needless to say, I needed to get past this. By this point, I had been examining the raw images as best I could, figuring out how the installer comes together. After a few experiments, I managed to determine a few basic facts about how Xenix is installed when booting from N1/N2:

  • Coming out of reset, the Xenix kernel loads from N1, prompts for the "filesystem floppy", and starts /etc/init or /etc/inir
    • Init prints the "Entering System Maintenance Mode" line.
    • Inir is used for running fsck, if necessary. Afterwards, it starts init.
  • Initially, the system starts /usr/lib/mkdev/hd which does the following
    • Format the Xenix partition, create slices, and mount the hard drive under /mnt
    • Compile the keyboard configuration files so the language selection sticks past reboot
    • Create a few device nodes manually
    • Cpio is used to copy a list of files from N2 to the root partition.
    • /profile.hd is copied to /mnt/.profile
    • Install a boot sector to the MBR of the hard drive and configure the bootloader for HDD booting
    • The hd script does the initial language selection, followed by fdisk
  • After the script completes, /.profile is run as the system tries to bring up the root prompt. It tells the user that N1 should be re-inserted after the boot prompt, then reboots the system.

Now that I knew what the installer was trying to do, it was time to get down and dirty with it.

Have /bin/sh, Will Travel

With a relatively complete understanding of the initial installation steps, I decided to create a boot floppy. By finding the initial strings for language selection, I was able to find where in the boot image the installer starts and, using a hex editor, force it to pop open a dedicated shell. With that in place, I finally had a chance to explore the system somewhat. I learned a few interesting details while digging through this. There are references to 96 and 135 tpi media such as the following.

# We want to make the hard disk bootable in the 96 and 135 tpi
# installations so that we don't need to re-insert N1 to re-boot

TPI stands for "tracks per inch" and is a very old-style way of referring to different types of floppy disk media. In this specific case, 96 TPI refers to low- (or double-) density 720 KiB 3.5-inch floppy disks, and 135 TPI refers to high-density 1.44 MiB floppies. This suggests that this version of Xenix was available on multiple types of media. This comment would help me immensely in trying to perform a manual install. As it turned out, much to my annoyance, the N2 file system was extremely lean overall. By using "echo *" as a poor-man's ls, I was able to get a list of what I did and didn't have; the /bin directory was rather ... empty.

[Image: Lean system]
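For readers who want to try the same string hunt on their own disk images, here is a minimal Python sketch of the idea: search a raw image for a known prompt string and report which sector it lands in. The 512-byte sector size and the prompt text in the demo are assumptions for illustration, not values taken from the Xenix media.

```python
from typing import List, Tuple

SECTOR_SIZE = 512  # assumed sector size of the raw floppy image


def find_string_offsets(image: bytes, needle: bytes) -> List[Tuple[int, int, int]]:
    """Return (byte offset, sector number, offset within sector) for each match."""
    hits = []
    start = image.find(needle)
    while start != -1:
        sector, within = divmod(start, SECTOR_SIZE)
        hits.append((start, sector, within))
        start = image.find(needle, start + 1)
    return hits


if __name__ == "__main__":
    # Synthetic stand-in for a boot image; the prompt text is made up.
    fake_image = bytearray(4 * SECTOR_SIZE)
    prompt = b"Select a language"
    fake_image[1030:1030 + len(prompt)] = prompt
    for offset, sector, within in find_string_offsets(bytes(fake_image), prompt):
        print(f"offset {offset:#x}: sector {sector}, byte {within}")
```

Once the offset is known, a hex editor can be pointed at that sector directly instead of paging through the whole image.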

I also found I had /etc/mount and /etc/mknod which helped, but not much overall. Deciding to charge ahead, I ran through the normal partitioning and formatting steps, and then rebooted again with N1, and my modified N2 boot floppy. As I got my hands dirty, I also began to unpack and explore the other disks. As I mentioned before, aside from the first two disks, all the other ones were simply tar archives written as raw files. Or more specifically:

$ file *.img
Basic Utilities 1.img: tar archive
Basic Utilities 2.img: tar archive
Extended Utilities 1.img: tar archive
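Old tar archives carry no magic string, so one common heuristic for recognising them is to validate the checksum stored in the first 512-byte header. The sketch below shows that check in Python; it is an illustration of the classic header layout (name at offset 0, size at 124, checksum at 148), not a description of how file(1) itself decides. The make_header helper exists only to produce demo data.

```python
from typing import Tuple

BLOCK = 512  # tar headers occupy one 512-byte block


def looks_like_tar(header: bytes) -> bool:
    """Check the header checksum: sum of all bytes, checksum field read as spaces."""
    if len(header) < BLOCK:
        return False
    block = header[:BLOCK]
    stored = block[148:156].strip(b" \0")
    if not stored:
        return False
    try:
        expected = int(stored.decode("ascii"), 8)
    except ValueError:
        return False
    actual = sum(block[:148]) + 8 * ord(" ") + sum(block[156:])
    return actual == expected


def first_member(header: bytes) -> Tuple[str, int]:
    """Return the name and size (bytes) recorded in a header block."""
    name = header[:100].split(b"\0", 1)[0].decode("ascii", "replace")
    size_field = header[124:136].strip(b" \0")
    size = int(size_field.decode("ascii"), 8) if size_field else 0
    return name, size


def make_header(name: str, size: int = 0) -> bytes:
    """Build a minimal old-style header, for demonstration only."""
    block = bytearray(BLOCK)
    encoded = name.encode("ascii")
    block[0:len(encoded)] = encoded
    block[100:108] = b"0000644\0"
    block[124:136] = b"%011o\0" % size
    block[148:156] = b" " * 8          # checksum field counts as spaces
    block[148:156] = b"%06o\0 " % sum(block)
    return bytes(block)


if __name__ == "__main__":
    header = make_header("label")
    print(looks_like_tar(header), first_member(header))
```

Reading only the first block this way is enough to see the zero-length label file at the start of a disk without unpacking anything.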

Each disk begins with a specific header with an empty file which identifies the disk number, product set, and machine set:


As one can plainly see, the B/X disks have a slightly different version, and identify themselves as n86, or generic x86. Furthermore, the N disks are the only ones that have "80386" binaries as defined by their headers. On top of that, investigating N1 I found a master manifest file that lists all the files on all the base installation disks, as well as special files, and mknod numbers. Bingo. Almost all the pieces I needed.

A quick check of the manifest file listings against the contents of each disk confirmed that, despite the differing version numbers, the disks belonged together; that is, these are the disks that correspond to Xenix 386 2.2.3c.

Dealing With Floppy Disk Controllers, Media Controllers, and More

My initial experiments taught me a few things about Xenix, chief of which was that it very much didn't like having its root filesystem floppy removed. If I removed N2 from A: at any point, Bad Things™ would happen not long after. As such, if I wanted to successfully bypass the installer and extract things into a working system, I needed to figure out how to talk to the second floppy drive.

On UNIX systems, for those less familiar with them, disk operations are handled by special files in the /dev directory, such as /dev/hd0 for the first hard drive, or /dev/fd0 for the first floppy drive, and so on. In contrast to more modern Linux systems using udev, these nodes exist as a set of static "dummy" files, created via the mknod command. mknod takes four arguments: the file to create, whether the device is block or character based, and the major and minor numbers (separated by a blank) that associate it with a driver in the kernel. Combined with the manifest file, it should have been trivial to create /dev/fd1 if it weren't for two simple issues.

  • It didn't work correctly
  • Xenix and read-only root filesystems really hate each other

As far as I can tell, having a read-only root filesystem is a hack that is essentially in place for two things: checking the file system and installation. Under Xenix, when / is mounted read-only, write operations appear to succeed: for a brief moment you'll see a file in place and can even interact with it, and then it vanishes. Hindsight being 20/20, I could have simply forced / to be mounted read-write, but at the time, the thought didn't occur to me.

Needless to say, this caused all kinds of fun. I eventually realized I could simply mount the root partition at /mnt, and create the device nodes I needed at /mnt/dev, and they would stick around. First hurdle passed!

The floppy issue was a bit more difficult to work out. During installation, the scripts read from the /dev/rinstall device. The manifest also listed a /dev/rinstall1 node, but that one generated errors too. The manifest listed several variations:

FD48 b666 bin/bin 3 ./dev/fd1 2/5
FD96 b666 bin/bin 1 ./dev/fd196ds9 2/37
FD96 b666 bin/bin 2 ./dev/fd196ds15 2/53
FD96 b666 bin/bin 1 ./dev/fd196ds18 2/61
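As a rough illustration of how manifest lines like these map onto mknod invocations, here is a small Python sketch. The field interpretation is an assumption inferred from the lines above (package tag, type letter plus octal mode, owner/group, a count whose meaning is not established here, path, major/minor); it is not taken from any Xenix documentation, and the /mnt prefix simply mirrors the workaround of creating nodes under the mounted hard disk.

```python
from typing import NamedTuple


class DeviceEntry(NamedTuple):
    package: str   # e.g. "FD96" (assumed to be a package/set tag)
    kind: str      # "b" for block, "c" for character
    mode: int      # permission bits, parsed as octal
    owner: str
    group: str
    path: str
    major: int
    minor: int


def parse_device_line(line: str) -> DeviceEntry:
    """Split one manifest line into its assumed fields."""
    package, perm, ownership, _count, path, numbers = line.split()
    owner, group = ownership.split("/")
    major, minor = numbers.split("/")
    return DeviceEntry(package, perm[0], int(perm[1:], 8),
                       owner, group, path, int(major), int(minor))


def mknod_command(entry: DeviceEntry, root: str = "/mnt") -> str:
    """Build the mknod command that would create the node under root."""
    relative = entry.path[2:] if entry.path.startswith("./") else entry.path.lstrip("/")
    target = root.rstrip("/") + "/" + relative
    return f"mknod {target} {entry.kind} {entry.major} {entry.minor}"


if __name__ == "__main__":
    sample = "FD96 b666 bin/bin 1 ./dev/fd196ds9 2/37"
    print(mknod_command(parse_device_line(sample)))
```

The sketch only prints the command; ownership and mode would still need to be applied separately on the target system.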

In practice, the only node that would work correctly was /dev/(r)fd196ds9, which probably means nothing to most people. Broken down, it's a mode selection for fd1 (B:): 96 refers to tracks per inch, ds means double sided, and 9 is the number of sectors per track. In other words, it is the geometry for low/double-density 3.5-inch floppies. With the correct setting divined, tar could now read the disks:
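To make the naming scheme concrete, the following Python sketch decodes a node name and works out the capacity it implies. It rests on stated assumptions: the trailing number is taken as sectors per track, 96 tpi media are taken to have 80 tracks per side (48 tpi: 40), "ss" is assumed to be the single-sided counterpart of "ds", and sectors are 512 bytes.

```python
import re
from typing import Dict

SECTOR_BYTES = 512
TRACKS_PER_SIDE = {48: 40, 96: 80}  # assumed tracks per side for each density

NODE_PATTERN = re.compile(r"r?fd(\d)(\d+)(ss|ds)(\d+)")


def decode_node(name: str) -> Dict[str, int]:
    """Decode names such as 'fd196ds9' into drive, geometry and byte capacity."""
    match = NODE_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised node name: {name}")
    drive, tpi, sides, sectors = match.groups()
    if int(tpi) not in TRACKS_PER_SIDE:
        raise ValueError(f"no track count known for {tpi} tpi")
    geometry = {
        "drive": int(drive),
        "tpi": int(tpi),
        "sides": 2 if sides == "ds" else 1,
        "sectors_per_track": int(sectors),
    }
    geometry["bytes"] = (TRACKS_PER_SIDE[geometry["tpi"]] * geometry["sides"]
                         * geometry["sectors_per_track"] * SECTOR_BYTES)
    return geometry


if __name__ == "__main__":
    for node in ("fd196ds9", "fd196ds15", "fd196ds18"):
        info = decode_node(node)
        print(node, info["bytes"], "bytes")
```

Under these assumptions, fd196ds9 comes out at 80 x 2 x 9 x 512 = 737,280 bytes, which is exactly 720 KiB.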

[Image: file listing]

Feeding the disks through tar, and manually executing several of the installation steps gave me a reasonable approximation of what the installed system should look like. Testing many of the utilities confirmed my original suspicion that the vast majority of the data was intact. Furthermore, I managed to extract /usr/bin/chroot from the Extended Utilities disk.

To make a long story short, I successfully extracted all the base installation disks, and began to work out the necessary steps to boot from the root file system. The system was extremely unstable in this state, with several utilities causing immediate kernel panics on launch (most annoyingly, vi did this, forcing me to use ed for almost all file editing). After several attempts, using N1 as a boot floppy, and pointing the root argument to the HD, I got very close to a successful boot.

[Image: almost booted]

The important line to see here is *** cron started ***, which is one of the final steps listed in /etc/rc before bringing up the login prompt, and a very encouraging sign that I could eventually get this all working. At this point, I had also learned of the existence of the /tmp/init.* files, special shell scripts run during installation. Through these, I learned of the setperms command, which reads the master manifest files on N1 and other disks, and does final tweaking and configuration. I also learned that I needed to do a brand operation on /etc/getty to decrypt the file, and install a serial number in it. With chroot in hand, and fingers crossed, I ran setperms with each manifest, rebooted, and ...

30 Years of Copy Protection

[Image: almost booting]

Well isn't that an interesting problem? That's the type of message you'd expect if someone detonated a fork bomb on your system.

Another examination of the installation scripts revealed the problem. During installation, three files are personalized with the "brand" utility. In the case of /etc/getty and /usr/sys/lib/libmdep.a, the files are decrypted with a secret derived from the serial number and activation key. The brand utility is also used to write those values into the kernel binary image. This would also foreshadow the issues we ran into once we began trying to restore the media to near-mint condition.

As I found out while debugging, Xenix handles the validation of serial numbers differently depending on how it's started. Being essential boot code, the kernel cannot be encrypted. As such, the kernel has a runtime check to make sure it has correct information. When started from the hard drive, the kernel reports "Invalid Serial Number" if it gets a mismatched set of keys and subtly degrades behavior.

However, in my case, my frankensteined system was loading its kernel from the floppy drive. In this case, Xenix suppresses the serial check message, but doesn't prevent the tripwire from being activated.

The tripwire in question drastically lowers the number of processes that can be run. As it turns out, the limit is reached when the system is brought up in multiuser mode. As I found out (much) later, this behavior is actually documented as a footnote in one of the Xenix 286 manuals. As such, I copied the kernel from N1 to the hard drive, personalized it with brand, and after a reboot ....

[Image: And booted]


With some more fiddling, I was able to run most of the post installation scripts, and even load the package manager, though it had some corruption issues.


Right about this time, Michal got back to me, having found the reason the system hangs after reboot: N2 was missing two sectors in /etc/init. I was somewhat in disbelief, so I pulled out dosformat, made a DOS compatible disk, and copied out /etc/init from the booted system.


Sure enough ...

[Image: swiss cheese init]

Ugh. So my frankensteined system was booting with half of its init binary missing. Awesome. At this point though, I had noticed something interesting on the international supplement: an /etc/init8 binary, one that had the same file size as the file on N2. When I compared them side by side...

[Image: comparison of N2/I1 init binaries]

Well isn't that interesting! A comparison of file sizes shows they're identical in length, with similar (though not identical) modification dates. As far as I can tell, the only modification appears to be the time-stamp further in the binary. On a hunch, I compared the tail ends of the missing sectors, and they matched. So I simply copied the missing blocks from init8 to init, and then started a fresh new VM. After feeding floppies, this time, instead of the dreaded Z, I got something new.
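The graft itself is simple enough to sketch in Python. This is an illustration under assumptions, not the exact procedure used: it treats a "missing" sector as a 512-byte run of a single filler byte (zero by default), assumes the damage is aligned to 512-byte offsets within the file, and requires donor and damaged files to be the same length.

```python
from typing import Iterable, List

SECTOR = 512  # assumed sector size


def missing_sectors(data: bytes, filler: int = 0x00) -> List[int]:
    """Indices of whole sectors consisting entirely of the filler byte."""
    blank = bytes([filler]) * SECTOR
    return [i // SECTOR for i in range(0, len(data), SECTOR)
            if data[i:i + SECTOR] == blank]


def differing_sectors(a: bytes, b: bytes) -> List[int]:
    """Indices of sectors whose contents differ between two equal-length files."""
    if len(a) != len(b):
        raise ValueError("files must be the same length")
    return [i // SECTOR for i in range(0, len(a), SECTOR)
            if a[i:i + SECTOR] != b[i:i + SECTOR]]


def graft(damaged: bytes, donor: bytes, sectors: Iterable[int]) -> bytes:
    """Copy the listed sectors from donor into a copy of damaged."""
    if len(damaged) != len(donor):
        raise ValueError("files must be the same length")
    patched = bytearray(damaged)
    for n in sectors:
        start = n * SECTOR
        patched[start:start + SECTOR] = donor[start:start + SECTOR]
    return bytes(patched)


if __name__ == "__main__":
    # Synthetic data: a three-sector "donor" and a copy with sector 1 blanked.
    donor = bytes(range(256)) * 6
    damaged = donor[:SECTOR] + bytes(SECTOR) + donor[2 * SECTOR:]
    holes = missing_sectors(damaged)
    print("holes:", holes, "repaired:", graft(damaged, donor, holes) == donor)
```

A legitimately all-zero sector would be flagged as missing too, so comparing against the donor with differing_sectors first is a sensible sanity check before patching anything.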

[Image: Insert Serial Number to Continue]

It would die shortly afterwards, but now I was on a mission to see if I could restore the media to a working state. I had already proved to myself that enough data existed to at least make a restoration attempt viable. However, to rebuild the media, I needed to characterize the existing damage and find a way to rebuild or replace the missing sectors.

Next time, we dig into the world of teledisk, data reconstruction, and our first steps towards restoring the media.

~ NCommander

  • (Score: 3, Interesting) by NCommander on Thursday March 09 2017, @08:19AM (2 children)

    Well, Novell actually did do pretty much what you described; except they did it with OS/2. I've always been tempted to do *something* with NetWare as a retrocomputing project, but it's just not that interesting unless I go dig out some token ring, or coax ethernet and go all in well ... *eh*. From what I know about DCL, it makes a heck of a lot more sense if you're familiar with mainframe systems of the era since as far as I know, it's essentially a mini-computer variant of JCL. The syntax is wonky only if you're expecting UNIX or DOS semantics, but isn't horrid after you get past that initial learning curve. At least if I do OpenVMS, I can talk about DECNet, and maybe even setup the old PATHWORKS stuff on Windows 3.1 to talk to a DEC system.

    I'm not convinced that a port to Linux or a UNIX based platform would have saved NetWare; a dedicated "network" operating system made a lot less sense in an era where pre-emptive multitasking was a thing, and a server could do more than one dedicated operation. IPX was difficult to route in networks that needed to also talk to TCP/IP since you needed switches that could handle both protocols if you did any layer 3 routing, and PDC and later AD basically did everything NDS did, and could do them both with NetBIOS or over TCP/IP which drastically simplified things. AppleTalk was the only real survivor of the LAN battles, and that was mostly because up until Mac OS X, talking to CIFS file shares was a nightmare out of the box; it was usually far easier to install AppleTalk for Windows 2000 than get classic Mac OS to play ball with Windows file sharing in a domain situation.

    On VMS, it should have taken over the niche currently held onto by IBM pSeries and some of the higher end AIX stuff, and competed well on multi-user shared VPSes in the era before virtualization. If it wasn't for the fact that Itanium hardware is both hideously expensive, and has horrid performance on most generic workloads, I would have probably recommended a VMS solution even in this day to some customers where it was very well suited for (high reliability/availability requirements, combined with extremely "robust" documentation). The current x86_64 platform port of OpenVMS though might put some life back into the system and I might re-evaluate as a choice for folks where standard Linux/*NIX isn't a great fit.

    Unfortunately, DEC went bust, Compaq gutted the corpse, and killed it, Alpha, and Tru64 when it jumped into bed with Intel on the Itantic. Then the POWER house of cards collapsed, and essentially killed any real competition to Intel in almost every market except the low end, and the very high end.

    Still always moving
  • (Score: 2) by NotSanguine on Friday March 10 2017, @12:20AM (1 child)

    I didn't say DCL's syntax was *wonky*, I said DCL was *clunky*. As is JCL. In fact, I pointed out that it was internally consistent and quite powerful. Much more so than just about anything else at the time. Even now it stands up pretty well.

    I think you misunderstood my point of view. I can certainly see applications for VMS today, and it provided (and still does, in some places) quite a lot of value.

    Initially, I just wanted to share what is, now, an amusing anecdote from the era of Xenix, not shit on DEC or VMS.

    DEC tried (with Ultrix) to get into the p-series/AIX space, but Sun, SGI and even IBM ate their lunch pretty effectively. The standardization and economies of scale for ISA/EISA (and then PCI) killed those guys too. Intel was helped along by players like Apple (moving to the x86 platform) and Microsoft (dropping support for alternative architectures).

    I kind of wish there was a more varied ecosystem for microprocessors. I think we'd get a lot more innovation that way.

    I'd forgotten about Netware for OS/2. Given what happened to OS/2, it probably wouldn't have made much of a difference. My thought WRT porting NCP/IPX to Unix had more to do with creating an environment where development could be done on the same platform as production. With the implication that Novell moved to a model which didn't require NLMs, and used standard IPC and sockets to integrate functionality -- making NCP (which could have run on top of IP too) a much better competitor to (Blecch!) LANMAN.

    IIRC, in the early to mid 90s, most routers had both IP and IPX support standard, so I don't think that would have been much of an issue.

    I'd also point out that if it weren't for Phil Karn (KA9Q), Russell Nelson and the PC/TCP packet driver spec, Microsoft might still be using NetBIOS/NetBeui.

    Actually, implementing KA9Q would be a great retro project. I haven't messed with it since 1991, but I think I'll download it and see what I can do.

    No, no, you're not thinking; you're just being logical. --Niels Bohr
    • (Score: 2) by NCommander on Friday March 10 2017, @02:20AM

      If I ever upgrade my ham ticket to General class, I fully intend to do AX.25 (which KA9Q also supported) over HF, and then port UUCP to work over that type of link for downloading USENET groups over ham, would be nice if I'm backpacking in Africa and have a portable rig I could use to always get connectivity. I used a fairly ghetto UUCP setup to download mail when I lived in China because the worksite, and if I did it now, I'd probably also setup netnews and download a fair bit of GMANE RSS feeds and possibly a group or two, though a lot of USENET seems dead these days; I dusted off my eternal-september account but I can't find any groups that have a pulse.

      Still always moving