SoylentNews is people

Meta
posted by NCommander on Tuesday July 14 2015, @04:00PM
from the replacing-2000s-tech-with-early-80s-tech dept.

Most system administrators working with a large number of machines will be at least passingly familiar with LDAP, or its Microsoft incarnation, Active Directory. Like most organizations, we used LDAP to organize shell account information for SN's backend servers, and spent the last year and a half cursing because of it. As such, we've recently replaced LDAP with a much older technology known as Hesiod, which is a DNS-based system for storing user accounts and other similar information. Given Hesiod's unique history (and relative obscurity), I thought it would be interesting to write a review and detailed history of this relic, as well as go into more depth on why we migrated.

In this novel:

  • Why We Dumped LDAP
  • Project Athena
  • Overview of Hesiod
  • Drawbacks
  • In Closing

Read past the break for a look at this piece of living history.

Why We Dumped LDAP

One of the golden rules of system administration is "if it ain't broke, don't fix it". Given that LDAP is generally considered critical infrastructure for sites that depend on it, it's worth spending a few moments explaining why we replaced it. Our LDAP backend was powered by OpenLDAP, which is generally the de facto standard for LDAP servers on Linux. In our experience though, OpenLDAP is extremely difficult to configure because it stores its configuration within the LDAP tree itself (under cn=config), which makes it incredibly difficult to examine the current state or to recover from a misconfiguration. In practice, I found it necessary to dump the entire LDAP configuration with slapcat, modify the raw LDIF files, reimport them with slapadd, and then pray. Painful, but manageable since, in practice, the overall server configuration shouldn't change frequently.

Unfortunately, every aspect of OpenLDAP has proven to be painful to administer. In keeping with the idea that none of our critical infrastructure should have single points of failure, we established replica servers from our master, and configured client systems to look at the replicas in case the master server takes a dive (or is restarting). While a noble idea, we found that frequently, without warning or cause, replication would either get out of sync or simply stop working altogether, with no useful error messages being logged by slapd. Furthermore, even when failover worked, systems would start to lag as nss_ldap kept trying to query the master for 5-10 seconds before switching to the slave for each and every query. As a whole, the entire setup was incredibly brittle.

While many of these issues could be laid at the feet of OpenLDAP (rather than LDAP itself as a protocol), other issues compounded to make life miserable. While there are other LDAP implementations such as 389 Directory Server, the simple fact of the matter is that due to schema differences, no two LDAP implementations are directly compatible with each other; one can't simply copy the data out of OpenLDAP and import it directly into 389. The issue is further compounded if one is using extended schemas (as we were, to store SSH public keys). As such, when slapd started to hang without warning, and without any clear indication as to why, the pain got to the point of looking for a replacement rather than continuing with what we were using.

As it turns out, there are relatively few alternatives to LDAP in general, and even fewer supported by most Linux distributions. Out of the box, most Linux distributions can support LDAP, NIS, and Hesiod. Although NIS is still well supported by most Linux distributions, it suffers from security issues, and from many of the same issues with regard to replication and failover. As such, I pushed to replace LDAP with Hesiod, which was originally designed as part of Project Athena.

Project Athena

Hesiod was one of the many systems to originate from Project Athena, a joint project launched by MIT, DEC, and IBM in the early 80s to create a campus-wide distributed computing system, which ran until 1991. Designed to work across multiple operating systems and architectures, the original implementation of Athena laid out the following goals:

  • Develop computer-based learning tools that are usable in multiple educational environments
  • Establish a base of knowledge for future decisions about educational computing
  • Create a computational environment supporting multiple hardware types
  • Encourage the sharing of ideas, code, data, and experience across MIT

As such, work coming from Project Athena was released as free and open source software, and provided major cornerstones of the desktop and networking environments in common use today, such as the X Window System and Kerberos.

As of 2015, more than 30 years after Athena was started, its underlying technology is still in use at MIT today, in the form of Debathena.

Overview of Hesiod

Moving on from the history to the technology itself: as indicated above, Hesiod is based on DNS, and its data takes the form of TXT records (the TXT record type itself was designed for Hesiod, as was the HS class). A sample Hesiod record for a user account looks like this:

mcasadevall.passwd      IN TXT          "mcasadevall:*:2500:2500:Michael Casadevall:/home/mcasadevall:/bin/bash"
2500.uid                IN CNAME        mcasadevall.passwd
mcasadevall.grplist     IN TXT          "sysops:2501:dev_team:2503:prod_access:2504"

For those familiar with /etc/passwd, the format is obvious enough. Out of the box, Hesiod supports distributing users and groups, printcap records (for use with LPRng), mount tables, and service locator records. With minor effort, we were also able to get it to support SSH public keys. Since Hesiod is based on DNS, data can be replicated via normal zone transfers, as well as updated via dynamic DNS updates. Since DNS zones cannot be enumerated in normal operation, CNAME records are required to allow lookups by numeric ID to succeed.
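To make the record layout concrete, here is a minimal Python sketch that turns one /etc/passwd-style line into the three kinds of records shown above. It is an illustration only, not the tooling we actually use; the function names `hesiod_records` and `parse_grplist` are made up for this example, and the output is relative to whatever zone the records are loaded into.

```python
def hesiod_records(passwd_line, groups):
    """Build Hesiod zone-file lines for a single account.

    passwd_line: one /etc/passwd-style entry (seven colon-separated fields).
    groups: list of (group_name, gid) pairs the user belongs to.
    """
    fields = passwd_line.strip().split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated passwd fields")
    user, uid = fields[0], fields[2]
    # Blank out the password field: anything published in DNS is world-readable.
    fields[1] = "*"
    entry = ":".join(fields)
    grplist = ":".join(f"{name}:{gid}" for name, gid in groups)
    return [
        f'{user}.passwd IN TXT "{entry}"',
        # DNS cannot be searched by value, so the uid gets its own name.
        f"{uid}.uid IN CNAME {user}.passwd",
        f'{user}.grplist IN TXT "{grplist}"',
    ]


def parse_grplist(txt):
    """Turn 'name:gid:name:gid' back into [(name, gid), ...]."""
    parts = txt.split(":")
    if len(parts) % 2:
        raise ValueError("grplist must hold name:gid pairs")
    return [(parts[i], int(parts[i + 1])) for i in range(0, len(parts), 2)]


if __name__ == "__main__":
    line = "mcasadevall:x:2500:2500:Michael Casadevall:/home/mcasadevall:/bin/bash"
    for record in hesiod_records(line, [("sysops", 2501), ("dev_team", 2503)]):
        print(record)
```

The CNAME line is the important design point: a resolver can only ask for a name, so every key you want to look up by (username, uid) needs its own record.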

New types of records can be created by simply adding a new TXT record. For instance, for each user, we encode their SSH public keys as a (username).ssh TXT record. The standard hesinfo utility can query and access these records, making it easy to script:

mcasadevall@lithium:~$ hesinfo mcasadevall ssh
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA4T3rFl8HondKnGq3+OEAoXzhsZL3YyzRIMCFQeD6aLLHCoVGAwUs3cg7bqUVshGb3udz5Wl/C4ym1aF5Uk5xaZWr2ByKZG6ZPFQb2MZbOG+Lcd5A14gSS2+Hw6+LIoMM8u6CJvIjbTHVI2wbz/ClINDEcJC0bh+YpuaKWyt2iExHATq153ST3dih+sDDK8bq6bFMKM8sdJHl9soKGo7V7i6jIn8E84XmcdTq8Gm2gt6VhOIb/wtr1ix7nxzZ7qCxAQr//FhJ8yVsmHx7wRwkndS7muPfVlVd5jBYPN74AvNicGrQsaPtbkAIwlxOrL92BsS6xtb+sO2iJYHK/EJMoQ== mcasadevall@blacksteel
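As a rough idea of what such scripting could look like, the Python sketch below wraps the `hesinfo <user> ssh` call shown above and prints whatever keys come back. It is one plausible approach, not a description of our production setup; the helper names (`parse_keys`, `ssh_keys_for`) and the username pattern are assumptions made for the example. A script of this shape could, in principle, be pointed at by sshd's AuthorizedKeysCommand option.

```python
import re
import subprocess
import sys

# Conservative username pattern (an assumption; adjust to local policy).
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_-]*")


def parse_keys(hesinfo_output):
    """Keep only lines that look like OpenSSH public keys."""
    keys = []
    for line in hesinfo_output.splitlines():
        line = line.strip()
        if line.startswith(("ssh-", "ecdsa-", "sk-")):
            keys.append(line)
    return keys


def ssh_keys_for(user):
    """Ask hesinfo for <user>.ssh and return the keys, or [] on any failure."""
    if not USERNAME_RE.fullmatch(user):
        return []  # refuse odd input before it reaches a command line
    try:
        result = subprocess.run(
            ["hesinfo", user, "ssh"],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []  # hesinfo missing, or DNS is not answering
    if result.returncode != 0:
        return []
    return parse_keys(result.stdout)


if __name__ == "__main__":
    for key in ssh_keys_for(sys.argv[1] if len(sys.argv) > 1 else ""):
        print(key)
```

Failing closed (returning no keys) on any error means a DNS outage locks key-based logins out rather than letting anything unexpected in.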

As such, Hesiod is easy to extend. It provides both command line applications and the libhesiod API to query and expand the information Hesiod is able to deliver, and it can be deployed in any environment where a sysadmin can control DNS records. As of writing, a set of utilities to integrate and easily manage Hesiod on Amazon's DNS service (known as Route 53) exists in the form of Hesiod53.
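For readers wondering how a query such as "mcasadevall ssh" becomes a DNS lookup: as I understand the hesiod.conf convention, the client joins the name and type with a dot and then appends a configured left-hand side (lhs) and right-hand side (rhs). The Python sketch below mimics that composition. Treat it as an approximation of the behaviour rather than a port of libhesiod; the function name is made up, and the `.ns` and `.example.org` defaults are placeholders, not our real zone.

```python
def hesiod_dns_name(name, record_type, lhs=".ns", rhs=".example.org"):
    """Compose the DNS name a Hesiod client would look up.

    lhs and rhs mirror the settings of the same names in hesiod.conf;
    the defaults here are placeholders for illustration only.
    """
    def dotted(part):
        # Accept "ns" or ".ns"; an empty part contributes nothing.
        if part and not part.startswith("."):
            return "." + part
        return part

    return f"{name}.{record_type}{dotted(lhs)}{dotted(rhs)}"


if __name__ == "__main__":
    print(hesiod_dns_name("mcasadevall", "ssh"))
    print(hesiod_dns_name("2500", "uid", lhs="", rhs="example.org"))
```

Because the result is an ordinary DNS name, any tool that can fetch a TXT record can read Hesiod data, which is much of its appeal.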

Drawbacks to Hesiod

Hesiod inherits several drawbacks from being based upon DNS. Primarily, it can be affected by various cache poisoning attacks, or by hijacking of upstream DNS servers. These weaknesses can be mitigated by implementing DNSCurve or client-side validation of DNSSEC records (standard DNSSEC does not authenticate the "last mile" for DNS queries). Like NIS, if password hashes are stored in Hesiod, they're world-readable, and vulnerable to offline analysis; for this reason, Hesiod should be deployed alongside Kerberos (and pam_krb5) for secure authentication of users and services. At SN, we've been using Kerberos since day 1 for server-to-server communication (and single sign-on for sysadmins), so this was trivial for us. Other organizations may have more difficulty.

Furthermore, under normal circumstances, DNS records cannot be enumerated, and nss_hesiod will not provide any records if an application queries for a full list of users (for example, getent passwd on a shell will only return system local users). This may break some utilities that depend on getting a full list of users, though in over a month of testing on our development system (lithium), we weren't able to find any sort of breakage.
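The difference between looking a user up and listing all users can be checked from Python's standard pwd module, which goes through the same NSS machinery. The sketch below reports which names resolve individually but never show up in the enumeration; on a Hesiod client that would be every Hesiod-only account. The function name and the sample usernames are mine, chosen for the example.

```python
import pwd


def lookup_only_users(names, getpwnam=pwd.getpwnam, getpwall=pwd.getpwall):
    """Return the names that resolve one at a time but are not enumerated.

    getpwnam/getpwall default to the real NSS-backed calls and can be
    swapped out for testing.
    """
    enumerated = {entry.pw_name for entry in getpwall()}
    found = []
    for name in names:
        try:
            getpwnam(name)
        except KeyError:
            continue  # unknown to every NSS source
        if name not in enumerated:
            found.append(name)
    return found


if __name__ == "__main__":
    # On a Hesiod client, accounts served only from DNS would be listed here.
    print(lookup_only_users(["root", "mcasadevall"]))
```

Any utility that builds its user list from the enumeration alone will simply not see those accounts, which is the kind of breakage to test for.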

In addition, although this problem is not inherent to Hesiod, at least on Linux systems, attempts to query users not in /etc/passwd can hang early boot for several minutes. The same issue manifests itself with nss_ldap and SSSD. As of writing, we have not determined a satisfactory workaround for the problem, but as our core services are redundant and support automatic failover, a 5-10 minute restart time isn't a serious issue for us.

Finally, although most UNIX and UNIX-likes support Hesiod, there's no support for it on Windows or Mac OS X.

In Closing

Due to its ease of use, we expect that Hesiod will drastically reduce the pain of system administration, and remove a service that has proven to be both problematic and overly complex. While I don't expect a major upswing in Hesiod usage, in practice it works very well in cloud environments, and for those who find the use of LDAP painful, I highly recommend evaluating it, as long as you are mindful of its limitations.

I hope you all enjoyed this look at a rather obscure but interesting piece of history, and if people are interested, I can be tempted to write more articles of this nature.

~ NCommander

 
This discussion has been archived. No new comments can be posted.
  • (Score: 4, Insightful) by VLM (445) on Tuesday July 14 2015, @04:47PM (#208984)

    "Out of the box, most Linux distributions can support LDAP, NIS, and Hesiod."

    And files. And unlike in 1993, /etc/passwd etc. can be distributed by puppet and friends or by various deviant methods involving git repos and copying files. Always available, never more than 15 minutes out of date, no SPOF, pretty straightforward configuration, reintegration of two islands isn't terribly complicated, hard to beat files...

    I remember a conversation with a cluster admin a while ago, pretty much the same complaints about LDAP we all have, they needed and had an extensive puppet system for automation (or whatever they used, maybe cfengine) and they found the least painful way to handle user data was to let the config management system distribute the golden config files to the whole computational cluster.

    It's not like installing LDAP or anything else would let them remove their automation system, and a small handful of files was a rounding error to their config system, and they had no LDAP-ian needs (like being able to make changes almost instantly rather than just waiting for the next 15 minute puppet run).

    My favorite openldap complaint is that one of the possible setups logs seemingly endless syslog errors about indexes not existing, and the LDAP code required to create those weird indexes to shut off the spamflow looks like it's summoning Cthulhu or a lesser minion. An LDAP server that speaks LDAP but whose backend UI looks and smells like a SQL console would be interesting. Or JSON. Or pretty much anything but LDAP language. Sure, talk LDAP language on the wire, but not to the admin victims.

    NIS is fun. I haven't used that since the 90s. All I remember is its server was reliable but when a workstation lost connection to the server it would spam its logs and the screen with endless errors about being unable to talk to ypbind or something like that.

    Another interesting way to look at it is WRT network speeds and automation level and scaling of CPU power and memory and storage and various criteria: what made sense for a multinational corporation in '00 might not work anywhere else in 2015. Bye bye ldap, I'll miss you about as much as I miss the DAYTIME or ECHO port (remember those?)

  • (Score: 3, Informative) by NCommander (2) <michael@casadevall.pro> on Tuesday July 14 2015, @04:52PM (#208987)

    I actually seriously considered this as an alternative, and it's been proposed as well when we were looking at nuking LDAP. While I'm not a *huge* fan of copying /etc/{passwd,shadow,group} from a central source, I'd admit it would work for our setup. The biggest problem with doing so is that we're not using the same distro on all boxes (we have a single CentOS box), and some boxes have local accounts for services they're running, which means all of these would have to be consolidated, and groups would have to be redefined so we could filter them via access.conf. In some places, such as mysql, ID numbers on local boxes won't match, so it would require locating all files for mysql on all boxes that run it or ndbd, and then chowning them. Not a huge job, but just another step further of complication.

    In addition, if we ended up installing a package on the fly, and it installs a local user, then the master files have to be updated. I like systems that allow us to have both global and local users.

    --
    Still always moving
    • (Score: 2) by goodie (1877) on Tuesday July 14 2015, @05:00PM (#208992)

      True that. That's why puppet is often used for environments where dozens of cloned boxes are used. In heterogeneous environments, this becomes more complicated and involves some (non-trivial) manual fiddling.

      • (Score: 2) by NCommander (2) <michael@casadevall.pro> on Tuesday July 14 2015, @05:10PM (#208996)

        I did experiment with puppet ages ago, way back at golive, but I wrote it off as too much work for not enough gain. We don't add new hardware very frequently, and aside from installing KRB5 and Hesiod and the necessary DNS entries, setup isn't overly complex (the most painful part is doing reverse-DNS via IPv6. The record format is just plain ugly, and Kerberos's default setup likes to complain if rDNS is MIA).

        --
        Still always moving
    • (Score: 3, Informative) by ajw (3140) on Tuesday July 14 2015, @05:19PM (#208998)

      Instead of directly using files, you can use nss_db and distribute the global passwd/group/etc files and create db files from that. This leaves the local flat files untouched and has no problems working on different distros. I have been building the necessary flat files from an LDAP server and distributing those to machines for a very long time to avoid actually using LDAP directly and also avoiding the problems of trying to sync local users for installed packages.

      Also, access.conf does have a host component. I use a single access.conf on all of my machines.

      • (Score: 2) by NCommander (2) <michael@casadevall.pro> on Tuesday July 14 2015, @05:24PM (#209002)

        That's a rather novel use of nss_db; most of the use the db backend gets these days is to make sure remote clients don't take a dive if they're isolated from LDAP. I didn't even consider that option.

        Ironically, nss_db does NOT work with hesiod because it can't enumerate. If I had any idea where the upstream for Hesiod was these days (I'm guessing it's at MIT, but Google hasn't found it), I'd propose extensions. I may very well still do so, and write some tools to make it more practical.

        --
        Still always moving
    • (Score: 2) by VLM (445) on Tuesday July 14 2015, @05:24PM (#209001)

      Admittedly, the admin with fifty identical rendering boxes in a cluster never had to deal with heterogeneous configurations, he had it easy.

      A couple of ideas: I have some puppet manifests that look like

      source => [ "puppet:///modules/wtf/default.$fqdn", "puppet:///modules/wtf/default",],

      And it does what it looks like and the "extra" commas (from memory) aren't actually extra. There are other variables to key off. And class support in Puppet of course. I have puppet classes for db servers, front end servers, dev servers, etc.

      I'll admit to having written some (non relevant to this discussion) code that does hideous things like

      cat header.wtf body.wtf footer.wtf > completefile.wtf

      This would help if there are no overlap issues. So if redhat wants man to be uid 9 and debian wants man to be uid 6, who cares as long as "real users" 1000 and up are the same systemwide. Unless... overlap will be a huge hassle. Maybe whoknows-ix has man as uid 1001, which would be just awesome in the sarcastic sense.

      Configuring all the id numbers the same across a domain does make life a bit easier for good ole NFS, if you ever use that. No more files suddenly owned by "1003" in directory listings or whatever. chown wtf:wtf wtf on certain machines depends on whether wtf is a user there. Too much hassle.

      My disaster recovery is my puppetmaster git repo. Bare metal to working and "custom" configured in less than 10 minutes. Been there done that. Just saying that permitting local changes sounds like a painful system to admin. If it isn't in puppet and I'm not actively putting out a file it doesn't officially exist.

      I've had trouble getting puppet and cfengine working across multiple versions on multiple releases of multiple OS. It's not fun. Leads to swear words and implementing the 1% of puppet features that I actually use in gpg/ssh/bash/git/cron, at least at home where I've only got maybe 10-20 boxes. If you think about it, puppet and friends are basically a reimplementation of that along with a megaton of feature lists for stuff I won't use. So when puppet pissed me off too much at home it got replaced with a short shell script that does little more than install/upgrade package, copy config file, repeat. I don't want or need a bad NIH reimplementation of ssh or gpg or git or bash, I'll just run natively... at least at home where being the only guy who understands what I'm doing is no big deal (there must be 100K users of puppet, however NIH icky it gets)

      A concept I've been fooling around with is, given how cheap disk space is, do I really want heterogeneous servers? I mean I wouldn't run mysql or bind for the heck of it unless I was actually using it, for security reasons, but if I configured everything to be capable of doing everything, then it could be handy in certain weird troubleshooting situations for any individual machine to temporarily "act as if" it's a complete DEV system. I've never tried this concept but it's strangely appealing. It's not likely you'll run out of disk space anytime soon so configure everything to be everything, just don't turn on mysql unless it's acting as if it's a database server.

      • (Score: 2) by NCommander (2) <michael@casadevall.pro> on Tuesday July 14 2015, @06:36PM (#209033)

        Puppet won't really help in our setup because aside from basic kerberos and hesiod setup, every node except the two frontend ones is running different server stuff. Originally we had (near) identical DB backends, but when we consolidated nodes to save money, a lot of stuff was moved around and added. I'd love to add more hardware so that isn't the case, but I can't justify the expense. Puppet makes a lot of sense if you have (near) identical machines running the same software like a giant hadoop cluster. If you don't have that, I find it adds too much overhead to really be worthwhile.

        Our disaster recovery plan involves rsyncing filesystems from oxygen (our offsite backup) back to their proper node, restoring the database in case ndbd corrupts it, and crossing our fingers.

        --
        Still always moving
  • (Score: 2) by TheRaven (270) on Wednesday July 15 2015, @11:32AM (#209310)

    "I remember a conversation with a cluster admin a while ago, pretty much the same complaints about LDAP we all have, they needed and had an extensive puppet system for automation (or whatever they used, maybe cfengine) and they found the least painful way to handle user data was to let the config management system distribute the golden config files to the whole computational cluster."

    I can see that working for a lot of things, but for password files the end user is going to want to change them periodically on different machines and expect that to be sync'd. How did they handle this? Were they doing full syncs on update, and if so then how did they handle conflicts?

    --
    sudo mod me up
    • (Score: 3, Informative) by VLM (445) on Wednesday July 15 2015, @02:46PM (#209396)

      Ah OK. Traditionally password files and LDAP and all that were used mostly for login passwords. You know, for keyboard interactive telnet and ssh sessions. Like 90% of the world's login password is "Password" and we need to store that someplace.

      Nowadays it's all about the kerberos to do actual hand typed password stuff, and between machines it's all about ssh shared keys so no password is involved. If my id_rsa.pub is in your authorized_keys I can log in, no password required.

      So ironically the passwd file is all about all the demographic data that is everything but passwords. Like the numeric filesystem UID for VLM is 1003 here, or VLM user enjoys /usr/local/bin/bash more so than /bin/dash as his login shell, or /etc/group says vlm is part of wheel group and sudoers file says wheel group members get to have their way, etc. Hand typed passwords all live in /etc/shadow since the 90s anyways which is another long story.

      The other question you had was update sync and frequency. Well, that's pretty much what puppet and friends live to do, all day long, like their primary purpose is someone changes a file on the puppetmaster and within 15 minutes every machine has that new file, where the new file may very well be /etc/passwd or /etc/group or whatever. There are other less ... forcible ways to have puppet modify passwd and group files than just overwriting files, but KISS principle, here's the golden copy of the password file now copy it everywhere.

      You'd be surprised how rarely someone changes their homedir or login shell so it's not like you have to update /etc/passwd very often. Literal hand typed passwords are changed in kerberos. "between two machines" logins are all ssh key based and have nothing to do with /etc/passwd shadow or group beyond like "where is vlm's homedir so I can look at his .ssh/authorized_keys to see if "he" can log in here?" This is an interesting way to break into a non-SSL secured LDAP distributed system BTW (bad guy MITM the LDAP and says uh, sure, just today vlm's homedir is /autofs/badguy/naughty which happens to have bad guy's ssh key info not mine). Another hilarious one is a "special" LDAP response by an attacker that says today VLM's UID is going to be zero, yeah that guy.. So secure your ldap!

      Stereotypical puppet config manager style is you overwrite local changes. That's kind of the point of centralized configuration management. It's considered "good form" or "polite" to put some kind of comment at the top of any file distributed by puppet, something like "# This file is distributed by puppet", and if you can chmod it to be read only to kind of drive home the point, then do so. You'll still end up with people experimenting and then wondering why all their manual changes revert at the quarter hour mark.

      I suppose it depends on cluster purpose. One cluster with one user doing one parallel thing (rendering frames of 3d movies or whatever) doesn't have the issues that some kind of institutional general purpose 1000s of different users educational cluster would have.