SoylentNews is people

posted by NCommander on Tuesday July 14 2015, @04:00PM
from the replacing-2000s-tech-with-early-80s-tech dept.

Most system administrators working with a large number of machines will be at least passingly familiar with LDAP, or its Microsoft incarnation, Active Directory. Like most organizations, we used LDAP to organize shell account information for SN's backend servers, and spent the last year and a half cursing because of it. As such, we've recently replaced LDAP with a much older technology known as Hesiod, a DNS-based system for storing user accounts and similar information. Given Hesiod's unique history (and relative obscurity), I thought it would be interesting to write a review and detailed history of this relic, as well as go into more depth on why we migrated.

In this novel:

  • Why We Dumped LDAP
  • Project Athena
  • Overview of Hesiod
  • Drawbacks to Hesiod
  • In Closing

Read past the break for a look at this piece of living history.

Why We Dumped LDAP

One of the golden rules of system administration is "if it ain't broke, don't fix it". Given that LDAP is generally considered critical infrastructure for sites that depend on it, it's worth spending a few moments explaining why we replaced it. Our LDAP backend was powered by OpenLDAP, which is the de facto standard for LDAP servers on Linux. In our experience though, OpenLDAP is extremely difficult to configure because it stores its configuration within the LDAP tree itself (under cn=config), which makes it incredibly difficult to examine its current state or to recover from a misconfiguration. In practice, I found it necessary to dump the entire LDAP configuration with slapcat, modify the raw LDIF files, reimport with slapadd, and then pray. Painful, but manageable since, in practice, the overall server configuration shouldn't change frequently.

Unfortunately, every aspect of OpenLDAP has proven to be painful to administer. In keeping with the idea that none of our critical infrastructure should have single points of failure, we established replica servers from our master, and configured client systems to look at the replicas in case the master server takes a dive (or is restarting). While a noble idea, we found that frequently, without warning or cause, replication would either get out of sync or simply stop working altogether, with no useful error messages being logged by slapd. Furthermore, when failover worked, systems would start to lag as nss_ldap kept trying to query the master for 5-10 seconds before switching to the slave for each and every query. As a whole, the entire setup was incredibly brittle.

While many of these issues could be laid at OpenLDAP's feet (vs. LDAP itself as a protocol), other issues compounded to make life miserable. While there are other LDAP implementations such as 389 Directory Server, the simple fact of the matter is that due to schema differences, no two LDAP implementations are directly compatible with each other; one can't simply copy the data out of OpenLDAP and import it directly into 389. The issue is further compounded if one is using extended schemas (as we were to store SSH public keys). As such, when slapd started to hang without warning, and without clear indication as to why, the pain got to the point where we looked for a replacement rather than keep going with what we were using.

As it turns out, there are relatively few alternatives to LDAP in general, and even fewer supported by most Linux distributions. Out of the box, most Linux distributions can support LDAP, NIS, and Hesiod. Although NIS is still well supported by most Linux distributions, it suffers from security issues, and from many of the same issues with regard to replication and failover. As such, I pushed to replace LDAP with Hesiod, which was originally designed as part of Project Athena.

Project Athena

Hesiod was one of the many systems to originate from Project Athena, a joint project launched by MIT, DEC, and IBM in the early 80s to build a distributed computing environment across a campus; it ran until 1991. Designed to work across multiple operating systems and architectures, the original implementation of Athena laid out the following goals:

  • To develop computer-based learning tools that are usable in multiple educational environments
  • Establish a base of knowledge for future decisions about educational computing.
  • Create a computational environment supporting multiple hardware types
  • Encourage the sharing of ideas, code, data, and experience across MIT

As such, work coming from Project Athena was released as free and open-source software, and it provided a major cornerstone of desktop and networking technologies that are still in common use today, such as the X Window System and Kerberos.

As of 2015, 32 years after Athena was started, its underlying technology is still in use at MIT today, in the form of Debathena.

Overview of Hesiod

Moving from the history to the technology itself: as indicated above, Hesiod is based on DNS, and stores its data in TXT records (the TXT record type itself was designed for Hesiod, as was the HS class). A sample set of Hesiod records for a user account looks like this:

mcasadevall.passwd      IN TXT          "mcasadevall:*:2500:2500:Michael Casadevall:/home/mcasadevall:/bin/bash"
2500.uid                IN CNAME        mcasadevall.passwd
mcasadevall.grplist     IN TXT          "sysops:2501:dev_team:2503:prod_access:2504"

For those familiar with /etc/passwd, the format is obvious enough. Out of the box, Hesiod supports distributing users and groups, printcap records (for use with LPRng), mount tables, and service locator records. With minor effort, we were also able to get it to support SSH public keys. Since Hesiod is based on DNS, data can be replicated via normal zone transfers and updated via dynamic DNS updates. Because DNS records cannot be enumerated in normal operation, CNAME records (such as 2500.uid above) are required for lookups by numeric ID to succeed.
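As a rough illustration of the record layout above, here is a minimal Python sketch that turns one /etc/passwd-style line into the three kinds of records shown in the sample. The function name and the `groups` argument are hypothetical (the article does not describe how SN generates its zone data), and the password field is replaced with `*` on the assumption that authentication is handled elsewhere.

```python
def hesiod_records(passwd_line, groups):
    """Build Hesiod zone lines for one user (illustrative sketch only).

    passwd_line: a single /etc/passwd-style entry.
    groups: list of (group_name, gid) pairs the user belongs to
            (hypothetical input, not taken from the article).
    """
    fields = passwd_line.strip().split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    name, uid = fields[0], fields[2]
    fields[1] = "*"  # never publish a hash; DNS data is world-readable
    grplist = ":".join("%s:%d" % (group, gid) for group, gid in groups)
    return [
        '%s.passwd\tIN TXT\t"%s"' % (name, ":".join(fields)),
        # DNS cannot be searched by value, so the UID gets its own name
        "%s.uid\tIN CNAME\t%s.passwd" % (uid, name),
        '%s.grplist\tIN TXT\t"%s"' % (name, grplist),
    ]


if __name__ == "__main__":
    line = "mcasadevall:x:2500:2500:Michael Casadevall:/home/mcasadevall:/bin/bash"
    for record in hesiod_records(line, [("sysops", 2501), ("dev_team", 2503)]):
        print(record)
```

The CNAME line is the important design point: without it, a lookup that starts from UID 2500 has no name to query.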

New types of records can be created by simply adding a new TXT record. For instance, for each user, we encode their SSH public keys as a (username).ssh TXT record. The standard hesinfo utility can query and access these records, making it easy to script:

mcasadevall@lithium:~$ hesinfo mcasadevall ssh
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA4T3rFl8HondKnGq3+OEAoXzhsZL3YyzRIMCFQeD6aLLHCoVGAwUs3cg7bqUVshGb3udz5Wl/C4ym1aF5Uk5xaZWr2ByKZG6ZPFQb2MZbOG+Lcd5A14gSS2+Hw6+LIoMM8u6CJvIjbTHVI2wbz/ClINDEcJC0bh+YpuaKWyt2iExHATq153ST3dih+sDDK8bq6bFMKM8sdJHl9soKGo7V7i6jIn8E84XmcdTq8Gm2gt6VhOIb/wtr1ix7nxzZ7qCxAQr//FhJ8yVsmHx7wRwkndS7muPfVlVd5jBYPN74AvNicGrQsaPtbkAIwlxOrL92BsS6xtb+sO2iJYHK/EJMoQ== mcasadevall@blacksteel
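To show what "easy to script" might look like, the following Python sketch wraps the same `hesinfo <user> ssh` query shown above. The function name and the injectable `run` parameter are hypothetical, and the sketch assumes that hesinfo prints one record per line and exits non-zero when a lookup fails; check both against your installed version.

```python
import subprocess


def ssh_keys_for(user, run=subprocess.run):
    """Return the SSH public keys published in Hesiod for `user`.

    Shells out to `hesinfo <user> ssh`. The `run` parameter exists only
    so the external command can be stubbed out in tests.
    """
    # Refuse anything that does not look like a plain account name.
    if not user.replace("_", "").replace("-", "").isalnum():
        raise ValueError("suspicious user name: %r" % user)
    result = run(["hesinfo", user, "ssh"],
                 stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                 universal_newlines=True)
    if result.returncode != 0:
        return []  # assumption: a failed lookup means "no keys"
    return [line for line in result.stdout.splitlines() if line.strip()]
```

One plausible use is a small wrapper script referenced from sshd's AuthorizedKeysCommand option, though that is an assumption rather than a description of SN's actual setup.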

As such, Hesiod is easy to extend. It provides both command-line tools and the libhesiod API for querying and extending the information it delivers, and it can be deployed in any environment where a sysadmin can control DNS records. As of writing, a set of utilities to integrate and easily manage Hesiod on Amazon's DNS service (Route 53) exists in the form of Hesiod53.

Drawbacks to Hesiod

Hesiod inherits several drawbacks from being based upon DNS. Primarily, it can be affected by various cache poisoning attacks, or by the hijacking of upstream DNS servers. These weaknesses can be mitigated by deploying DNSCurve or client-side validation of DNSSEC records (standard DNSSEC does not authenticate the "last mile" for DNS queries). Like NIS, if password hashes are stored in Hesiod, they're world-readable and vulnerable to offline analysis; for this reason, Hesiod should be deployed alongside Kerberos (and pam_krb5) for secure authentication of users and services. At SN, we've been using Kerberos since day 1 for server-to-server communication (and single sign-on for sysadmins), so this was trivial for us. Other organizations may have more difficulty.
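Since anything published in a passwd record is world-readable, it can be worth checking zone data before it goes live. The Python sketch below flags any passwd TXT value whose password field is not a placeholder. The function name and the set of accepted placeholders are assumptions made for illustration, not part of Hesiod itself.

```python
# Values treated as "no hash here" (an assumption for this sketch).
PLACEHOLDERS = {"*", "x", "!", "!!"}


def password_field_is_safe(passwd_txt):
    """Return True if a passwd-style TXT value carries only a placeholder
    in its password field, False if it may expose a hash (or is empty)."""
    fields = passwd_txt.split(":")
    if len(fields) != 7:
        raise ValueError("not a passwd-style record: %r" % passwd_txt)
    return fields[1] in PLACEHOLDERS
```

An empty password field is deliberately reported as unsafe, since in /etc/passwd semantics it can mean "no password required".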

Furthermore, under normal circumstances, DNS records cannot be enumerated, and nss_hesiod will not provide any records if an application asks for a full list of users (for example, running getent passwd in a shell will only return system-local users). This may break utilities that depend on getting a full list of users, though in over a month of testing on our development system (lithium), we weren't able to find any breakage.
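The difference between direct lookups and enumeration can be observed from Python's standard `pwd` module, which on Linux goes through the system's normal account lookup path. This is a minimal sketch with a hypothetical helper name; on a host configured as described above, a Hesiod-only account would be expected to be found by direct lookup but not by enumeration.

```python
import pwd


def lookup_vs_enumerate(username):
    """Return (found_by_direct_lookup, found_by_enumeration)."""
    try:
        pwd.getpwnam(username)  # direct lookup by name
        direct = True
    except KeyError:
        direct = False
    # Enumeration walks the whole account list, which a backend that
    # cannot be enumerated will not contribute to.
    enumerated = any(entry.pw_name == username for entry in pwd.getpwall())
    return direct, enumerated
```

A utility that builds its user list from enumeration alone would miss such accounts, which is the kind of breakage to test for before migrating.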

Additionally, although this problem is not inherent to Hesiod, at least on Linux systems, attempts to query users not in /etc/passwd can hang early boot for several minutes. The same issue manifests itself with nss_ldap and SSSD. As of writing, we have not determined a satisfactory workaround for the problem, but as our core services are redundant and support automatic failover, a 5-10 minute restart time isn't a serious issue for us.

Finally, although most UNIX and UNIX-likes support Hesiod, there's no support for it on Windows or Mac OS X.

In Closing

Due to its ease of use, we expect that Hesiod will drastically reduce the pain of system administration, and remove a service that has proven to be both problematic and overly complex. While I don't expect a major upswing in Hesiod usage, in practice it works very well in cloud environments, and for those who find LDAP painful, I highly recommend evaluating it, as long as one is mindful of its limitations.

I hope you all enjoyed this look at this rather obscure but interesting piece of history, and if people are interested, I can be tempted to write more articles of this nature.

~ NCommander

Related Stories

RFC: Crowdfunding Articles 44 comments

So, during the last site update article, a discussion came up talking about how those who work and write for this site should get paid for said work. I've always wanted to get us to the point where we could cut a check to the contributors of SoylentNews, but as it stands, subscriptions more or less let us keep the lights on and that's about it.

As I was writing and responding to one specific thread, part of me started to wonder if there would be enough interest to try and crowdfund articles on specific topics. In general, meta articles in which we talk about deploying HSTS or our use of Hesiod tend to generate a lot of interest. So, I wanted to try and see if there was an opportunity to both generate interesting content, and help get some funds back to those who donate their time to keep the lights on.

One idea that immediately comes to mind that I could write is deploying DNSSEC in the real world, and an active example of how it can help mitigate hijack attacks against misconfigured domains. Alternatively, on a retro-computing angle, I could cook something in 16-bit real mode assembly that can load an article from I could also do a series on doing (mostly) bare metal work; i.e., loading an article from PXE boot or UEFI.

However, before I get in too deep into building this idea, I want to see how the community feels about it. My initial thought is that the funds raised for a given article would dictate how long it would be, and the revenue would be split between the author and the staff, with the staff section being divided at the end of the year as evenly as possible. The program would be open to any SN contributor. If the community is both interested and willing, I'll organize a staff meeting and we'll do a trial run to see if the idea is viable. If it flies, then we'll build out the system to be a semi-regular feature of the site.

As always, leave your comments below, and we'll all be reading ...

~ NCommander

Community Reviews: My Time as an ICANN Fellow 12 comments

Disclaimer: This post does not reflect the views or policies of SoylentNews Public Benefit Corporation (SN PBC), its staff, or my role as president. The opinions and statements within are my own, Michael Casadevall, and neither I nor SN PBC were financially compensated for this post.

There are times in life where you simply don't know where you will end up. For me, a chance encounter in Puerto Rico lead to a rather interesting series of events. I have spent the previous week (October 20th-26th) attending the ICANN 63rd International Public Meeting. For those who aren't familiar with the Internet Corporation for Assigned Names and Numbers (ICANN), it is essentially the not-for-profit organization that administrates the Internet root zone which forms the linchpin of the modern internet, and allows domain names such as to exist.

As a fellow, I have been working to help advance policy from the perspective of Internet end-users, as well as improving access to the Internet in the form of Internationalized Domain Names. For those less familiar with the technical underpinnings of the Internet, I'll also talk a bit about DNS, and more of the work I am currently in the process of handling at ICANN.

In This Issue

  • DNS - What is it?
  • The Internet Root Zone, Top Level Domains, and Second Level Domains
  • What Is ICANN?
  • The Fellowship Program
  • New Generic Top Level Domains
  • String Contention and Name Collisions
  • Internationalized Domain Names
  • In Closing
  • If You Want To Get Involved
  • Acknowledgements

Read more past the fold ...

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by tibman on Tuesday July 14 2015, @04:27PM

    by tibman (134) Subscriber Badge on Tuesday July 14 2015, @04:27PM (#208973)

    Dupe and I read this on slashdot three days ago, geez guys : P

    It's great that NCommander and friends continue to not only improve the visible portions of SN but the background stuff too. Far too many projects leave it at "good enough" and switch to another project entirely. But it seems like this kind of polishing is what makes something last a decade past its competitors. You can also survive periods of no maintenance should anything bizarre happen (like the company pays for a trip to Burning Man with all this venture capital falling on them).

    SN won't survive on lurkers alone. Write comments.
    • (Score: 3, Insightful) by NCommander on Tuesday July 14 2015, @04:38PM

      by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @04:38PM (#208977) Homepage Journal

      I honestly can't tell if this is a backhand insult or not on the "good enough" line.

      Generally speaking, we do what we can to improve SN as much as we can, either by redoing parts of the infrastructure that are causing us pain, or experimenting with new or different types of articles. I like to be forefront on stuff we're doing on this site, though on this occasion, I was more interested to write about Hesiod than documenting the change for the public. We do quite a bit of stuff on the backend that goes without reporting such as moving services box to box, and other routine sysops work.

      Still always moving
      • (Score: 4, Informative) by tibman on Tuesday July 14 2015, @04:54PM

        by tibman (134) Subscriber Badge on Tuesday July 14 2015, @04:54PM (#208988)

        No insult in there, I apologize for being unclear : ) I was saying it is good that you continue to improve the back-end.

        SN won't survive on lurkers alone. Write comments.
        • (Score: 0) by Anonymous Coward on Wednesday July 15 2015, @02:51AM

          by Anonymous Coward on Wednesday July 15 2015, @02:51AM (#209182)

          That was what I got up front - I re-read it looking for any kind of negativity about the project in your post, couldn't see any.

  • (Score: 2) by Nerdfest on Tuesday July 14 2015, @04:39PM

    by Nerdfest (80) on Tuesday July 14 2015, @04:39PM (#208979)

    This is great, as I wasn't even aware this existed. Having found LDAP a pain to set up, I may also give this a try.

  • (Score: 4, Insightful) by VLM on Tuesday July 14 2015, @04:47PM

    by VLM (445) Subscriber Badge on Tuesday July 14 2015, @04:47PM (#208984)

    Out of the box, most Linux distributions can support LDAP, NIS, and Hesiod.

    And files. And unlike 1993 /etc/passwd etc can be distributed by puppet and friends or by various deviant methods involving git repos and copying files. Always available, never more than 15 minutes out of date, no SPOF, pretty straightforward configuration, reintegration of two islands isn't terribly complicated, hard to beat files...

    I remember a conversation with cluster admin awhile ago, pretty much the same complaints about LDAP we all have, they needed and had an extensive puppet system for automation (or whatever they used, maybe cfengine) and they found the least painful way to handle user data was to let the config management system distribute the golden config files to the whole computational cluster.

    Its not like installing LDAP or anything else would let them remove their automation system, and a small handful of files was a rounding error to their config system, and they had no LDAP-ian needs (like able to almost instantly make changes rather than just waiting for the next 15 minute puppet run).

    My favorite openldap complaint is one of the possible setups logs seemingly endless syslog errors about indexes not existing and the LDAP code required to create those weird indexes to shut off the spamflow looks like its summoning Cthulhu or a lesser minion. A LDAP server that speaks LDAP but the backend UI looks and smells like a SQL console would be interesting. Or JSON. Or pretty much anything but LDAP language. Sure talk LDAP language on the wire, but not to the admin victims.

    NIS is fun. I haven't used that since the 90s. All I remember is its server was reliable but when a workstation lost connection to the server it would spam its logs and the screen with endless errors about being unable to talk to ypbind or something like that.

    Another interesting way to look at is is WRT network speeds and automation level and scaling of CPU power and memory and storage and various criteria what made sense for a multinational corporation in '00 might not work anywhere else in 2015. Bye bye ldap I'll miss you about as much as I miss the DAYTIME or ECHO port (remember those?)

    • (Score: 3, Informative) by NCommander on Tuesday July 14 2015, @04:52PM

      by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @04:52PM (#208987) Homepage Journal

      I actually seriously considered this as an alternative, and it's been proposed as well when we were looking at nuking LDAP. While I'm not a *huge* fan of copying /etc/{passwd,shadow,group} from a central source, I'd admit it would work for our setup. The biggest problem with doing so is that we're not using the same distro on all boxes (we have a single CentOS box), and some boxes have local accounts for services they're running, which means all of these would have to be consolidated, and groups would have to be redefined so we could filter them via access.conf. In some places, such as mysql, ID numbers on local boxes won't match, so it would require locating all files for mysql on all boxes that run it or ndbd, and then chowning them. Not a huge job, but just another step further of complication.

      In addition, if we ended up installing a package on the fly, and it installs a local user, then the master files have to be updated. I like systems that allow us to have both global and local users.

      Still always moving
      • (Score: 2) by goodie on Tuesday July 14 2015, @05:00PM

        by goodie (1877) on Tuesday July 14 2015, @05:00PM (#208992) Journal

        True that. That's why puppet is often used for environments where dozens of cloned boxes are used. In heterogeneous environments, this becomes more complicated and involves some (non trivial) manual fiddling.

        • (Score: 2) by NCommander on Tuesday July 14 2015, @05:10PM

          by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @05:10PM (#208996) Homepage Journal

          I did experiment with puppet ages ago, way back at golive, but I wrote it off as too much work for not enough gain. We don't add new hardware very frequently, and aside from installing KRB5 and Hesiod and the necessary DNS entries, setup isn't overly complex (the most painful part is doing reverse-DNS via IPv6. The record format is just plain ugly, and Kerberos's default setup likes to complain if rDNS is MIA).

          Still always moving
      • (Score: 3, Informative) by ajw on Tuesday July 14 2015, @05:19PM

        by ajw (3140) on Tuesday July 14 2015, @05:19PM (#208998)

        Instead of directly using files, you can use nss_db and distribute the global passwd/group/etc files and create db files from that. This leaves the local flat files untouched and has no problems working on different distros. I have been building the necessary flat files from an LDAP server and distributing those to machines for a very long time to avoid actually using LDAP directly and also avoiding the problems of trying to sync local users for installed packages.

        Also, access.conf does have a host component. I use a single access.conf on all of my machines.

        • (Score: 2) by NCommander on Tuesday July 14 2015, @05:24PM

          by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @05:24PM (#209002) Homepage Journal

          That's a rather novel use of nss_db; most use of the db backend gets these days is mostly to make sure remote clients don't take a dive if they're isolated from LDAP. I didn't even consider that option.

          Ironically, nss_db does NOT work with Hesiod because it can't enumerate. If I had any idea where the upstream for Hesiod was these days (I'm guessing it's at MIT, but Google hasn't found it), I'd propose extensions. I may very well still do so, and write some tools to make it more practical.

          Still always moving
      • (Score: 2) by VLM on Tuesday July 14 2015, @05:24PM

        by VLM (445) Subscriber Badge on Tuesday July 14 2015, @05:24PM (#209001)

        Admittedly, the admin with fifty identical rendering boxes in a cluster never had to deal with heterogeneous configurations, he had it easy.

        A couple ideas is I have some puppet manifests that look like

        source => [ "puppet:///modules/wtf/default.$fqdn", "puppet:///modules/wtf/default",],

        And it does what it looks like and the "extra" commas (from memory) aren't actually extra. There are other variables to key off. And class support in Puppet of course. I have puppet classes for db servers, front end servers, dev servers, etc.

        I'll admit to having written some (non relevant to this discussion) code that does hideous things like

        cat >

        This would help if there are no overlap issues. So if redhat wants man to be uid 9 and debian wants man to be uid 6, who cares as long as "real users" 1000 and up are the same systemwide. Unless... overlap will be a huge hassle. Maybe whoknows-ix has man as uid 1001, what would be just awesome in the sarcastic sense.

        Configuring all the id numbers the same across a domain does make life a bit easier for good ole NFS, if you ever use that. No more files suddenly owned by "1003" in directory listings or whatever. chown wtf:wtf wtf on certain machines depends on if wtf is a user there. Too much hassle.

        My disaster recovery is my puppetmaster git repo. Bare metal to working and "custom" configured in less than 10 minutes. Been there done that. Just saying that permitting local changes sounds like a painful system to admin. If it isn't in puppet and I'm not actively putting out a file it doesn't officially exist.

        I've had trouble getting puppet and cfengine working across multiple versions on multiple releases of multiple OS. Its not fun. Leads to swear words and implementing the 1% of puppet features that I actually use in gpg/ssh/bash/git/cron, at least at home where I only got maybe 10-20 boxes. If you think about it, puppet and friends are basically a reimplementation of that along with a megaton of feature lists for stuff I won't use. So when puppet pissed me off too much at home it got replaced with a short shell script that does little more than install/upgrade package copy config file repeat. I don't want or need a bad NIH reimplementation of ssh or gpg or git or bash, I'll just run natively... at least at home where being the only guy who understands what I'm doing it no big deal (there must be 100K users of puppet, however NIH icky it gets)

        A concept I've been fooling around with is given how cheap disk space is, do I really want hetrogenous servers? I mean I wouldn't run mysql or bind for the heck of it unless I was actually using it for security reasons, but if I configured everything to be capable of doing everything, then it could be handy in certain weird troubleshooting situations for any individual machine to temporarily "act as if" they're a complete DEV system. I've never tried this concept but its strangely appealing. Its not likely you'll run out of disk space anytime soon so configure everything to be everything, just don't turn on mysql unless its acting as if its a database server.

        • (Score: 2) by NCommander on Tuesday July 14 2015, @06:36PM

          by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @06:36PM (#209033) Homepage Journal

          Puppet won't really help in our setup because aside from basic Kerberos and Hesiod setup, every node except the two frontend ones is running different server stuff. Originally we had (near) identical DB backends, but when we consolidated nodes to save money, a lot of stuff was moved around and added. I'd love to add more hardware so that isn't the case, but I can't justify the expense. Puppet makes a lot of sense if you have (near) identical machines running the same software like a giant hadoop cluster. If you don't have that, I find it adds too much overhead to really be worthwhile.

          Our disaster recovery plan involves rsyncing filesystems from oxygen (our offsite backup) back to their proper node, restoring the database in case ndbd corrupts it, and crossing our fingers.

          Still always moving
    • (Score: 2) by TheRaven on Wednesday July 15 2015, @11:32AM

      by TheRaven (270) on Wednesday July 15 2015, @11:32AM (#209310) Journal

      I remember a conversation with cluster admin awhile ago, pretty much the same complaints about LDAP we all have, they needed and had an extensive puppet system for automation (or whatever they used, maybe cfengine) and they found the least painful way to handle user data was to let the config management system distribute the golden config files to the whole computational cluster.

      I can see that working for a lot of things, but for password files the end user is going to want to change them periodically on different machines and expect that to be sync'd. How did they handle this? Were they doing full syncs on update, and if so then how did they handle conflicts?

      sudo mod me up
      • (Score: 3, Informative) by VLM on Wednesday July 15 2015, @02:46PM

        by VLM (445) Subscriber Badge on Wednesday July 15 2015, @02:46PM (#209396)

        Ah OK. Traditionally password files and LDAP and all that were used mostly for login passwords. You know, for keyboard interactive telnet and ssh sessions. Like 90% of the world's login password is "Password" and we need to store that someplace.

        Now a days its all about the kerberos to do actual hand typed password stuff, and between machines its all about ssh shared keys so no password is involved. If my is in your authorized_keys I can log in no password required.

        So ironically the passwd file is all about all the demographic data that is everything but passwords. Like the numeric filesystem UID for VLM is 1003 here, or VLM user enjoys /usr/local/bin/bash more so than /bin/dash as his login shell, or /etc/group says vlm is part of wheel group and sudoers file says wheel group members get to have their way, etc. Hand typed passwords all live in /etc/shadow since the 90s anyways which is another long story.

        The other question you had was update sync and frequency. Well that's pretty much what puppet and friends live to do, all day long, like their primary purpose is someone changes a file on the puppetmaster and within 15 minutes every machine has that new file, where the new file may very well be /etc/passwd or /etc/group or whatever. There are other less ... forcible ways to have puppet modify passwd and group files than just overwriting files, but KISS principle, here's the golden copy of the password file now copy it everywhere.

        You'd be surprised how rarely someone changes their homedir or login shell so its not like you have to update /etc/passwd very often. Literal hand typed passwords are changed in kerberos. "between two machines" logins are all ssh key based and have nothing to do with /etc/passwd shadow or group beyond like "where is vlm's homedir so I can look at his .ssh/authorized_keys to see if "he" can log in here?" This is an interesting way to break into a non-SSL secured LDAP distributed system BTW (bad guy MITM the LDAP and says uh, sure, just today vlm's homedir is /autofs/badguy/naughty which happens to have bad guy's ssh key info not mine). Another hilarious one is a "special" LDAP response by an attacker that says today VLM's UID is going to be zero, yeah that guy.. So secure your ldap!

        Stereotypical puppet config manager style is you overwrite local changes. That's kind of the point of centralized configuration management. Its considered "good form" or "polite" to put some kind of comment at the top of any file distributed by puppet something like "# This file is distributed by puppet" and if you can chmod it to be read only to kind of drive home the point, then do so. You'll still end up with people experimenting and then wondering why all their manual changes revert at the quarter hour mark.

        I suppose it depends on cluster purpose. One cluster with one user doing one parallel thing (rendering frames of 3d movies or whatever) doesn't have the issues that some kind of institutional general purpose 1000s of different users educational cluster would have.

  • (Score: 2) by VLM on Tuesday July 14 2015, @05:45PM

    by VLM (445) Subscriber Badge on Tuesday July 14 2015, @05:45PM (#209010)

    For instance, for each user, we encode their SSH public keys as a (username).ssh TXT record.

    I'm surprised there's no standard for user keys like there is for host fingerprints (the ole SSHFP record in DNS) I always considered that a big failure of RFC 4255, why can't we publish in dns that has an of "buncha hex" and call it good? I mean not something custom but an internet RFC standard. Then any box that wants to let log into via ssh key authentication merely need pull the key from dns and run a dnssec verify to verify its clean. Assuming people in the know don't know dnssec is powned and are refusing the release this functionality but can't say due to NSL. Maybe tin foil hat too tight on head today. Or maybe not.

    And going the opposite direction a PITA about SSHFP records was stuff like kinda non-standard by DNS standards for search paths, so you could sometimes resolve a name to an ip addrs but ssh would fail to resolve the sshfp lookup (although supposedly that's all water under the bridge and long since fixed, supposedly). Like ssh wtf would succeed in resolving to an ip addrs if your /etc/resolv.conf has a search line, but there was some funkiness that really old openssh would try to pull the sshfp record for ".wtf" because it ignored search lines or whatever. Anyway Hesiod seems to handle those mapping issues automagically so it could have saved the ssh host key fingerprint in dns people some effort.

    SSHFP for host keys is cool, when it works. Just saying there should be an RFC internet standard to advertise public user keys too. Hell, let DNS advertise GPG keys too while we're at it.
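
For comparison, the host-key side that RFC 4255 does standardize is simple to generate. The following is a minimal sketch, assuming an OpenSSH-style public key line as input; the function name `sshfp_record` is made up for illustration. It hashes the decoded key blob with SHA-256 (fingerprint type 2) and looks up the algorithm number for the key type.

```python
import base64
import hashlib

# SSHFP algorithm numbers as assigned in RFC 4255, RFC 6594 and RFC 7479.
ALGORITHMS = {
    "ssh-rsa": 1,
    "ssh-dss": 2,
    "ecdsa-sha2-nistp256": 3,
    "ecdsa-sha2-nistp384": 3,
    "ecdsa-sha2-nistp521": 3,
    "ssh-ed25519": 4,
}

def sshfp_record(hostname, pubkey_line):
    """Build an SSHFP record line (SHA-256, fingerprint type 2) for one key."""
    # A public key line looks like: "<key-type> <base64-blob> [comment]"
    key_type, b64_blob = pubkey_line.split()[:2]
    blob = base64.b64decode(b64_blob)
    fingerprint = hashlib.sha256(blob).hexdigest()
    return "%s\tIN SSHFP\t%d 2 %s" % (hostname, ALGORITHMS[key_type], fingerprint)
```

On a real host, `ssh-keygen -r hostname` prints equivalent records without any scripting.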

    • (Score: 3, Interesting) by NCommander on Tuesday July 14 2015, @06:33PM

      by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @06:33PM (#209031) Homepage Journal

      The problem with SSHFP is you're depending on an insecure medium to know you haven't gotten MITMed. That's why it's disabled out of the box with OpenSSH. I've debated publishing SSHFP keys for our servers but until I get around to signing both, and our internal li694-22 domain, it's not going to add a lot of security. Incidentally, adding SSH keys to DNS is kinda a pain, since BIND requires a funky syntax for stuff that's longer than 255 characters. I hacked up a small python script (attached) that takes a username, and a key, and spits out the proper lines for BIND.


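      import sys

      def chunks(l, n):
          # Split l into pieces of at most n characters
          # (a single TXT string tops out at 255).
          n = max(1, n)
          return [l[i:i + n] for i in range(0, len(l), n)]

      # argv[1] is the username, argv[2] is the (quoted) public key
      password_chunks = chunks(sys.argv[2], 255)

      # One record, several quoted strings: user.ssh IN TXT ( "..." "..." )
      print(sys.argv[1] + ".ssh\tIN TXT\t(", end=" ")

      for i in password_chunks:
          print("\"" + i + "\"", end=" ")

      print(")")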

      Based off a script I found on StackExchange (string manipulation drives me mad and I'm lazy).
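
On the consuming side, the 255-character pieces have to be glued back together, since a TXT record with several strings comes back from a resolver as a list of strings. The following is a minimal sketch of the round trip; `split_txt` and `join_txt` are illustrative names, not part of any Hesiod tooling, and the key in the demo is a dummy value. It assumes ASCII input, so characters and bytes count the same.

```python
def split_txt(value, size=255):
    """Split a long value into TXT-sized strings (at most `size` characters each)."""
    size = max(1, size)
    return [value[i:i + size] for i in range(0, len(value), size)]

def join_txt(strings):
    """Reassemble the original value from the strings of one TXT record."""
    return "".join(strings)

if __name__ == "__main__":
    # Dummy key material, long enough to need more than one TXT string.
    key = "ssh-rsa " + "A" * 372 + " user@example"
    parts = split_txt(key)
    print([len(p) for p in parts])
    print(join_txt(parts) == key)
```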

      Still always moving
      • (Score: 2) by VLM on Tuesday July 14 2015, @08:18PM

        by VLM (445) Subscriber Badge on Tuesday July 14 2015, @08:18PM (#209069)

        it's not going to add a lot of security

        Of course, in a world where 99% of admins blindly outta habit hit 'y' any time they see a prompt for unknown host key...

        Assuming you trust ssh and git, I have a little script, so simple it doesn't bear copying and pasting, that stores my ssh host and user public keys in a repo. git clone the repo and get a mind dump of everything I trust in sorted directories of individual files, and it helpfully adds itself to the repo collection as a new member, which I manually add at the next git commit / push. Something along the lines of cat ~/somethin/onepath/* > ~/.ssh/authorized_keys and cat ~/somethin/anotherpath/* > ~/.ssh/known_hosts around the end of that script. Also runs a git status and git pull to see/warn if anything changes. It's a little more complicated than that, but not much. Anyway, that's one way to keep a "permanent record" of trusted SSH keys and notice if anything suddenly changes.

        I'm not crazy enough to git repo the private keys, just the public ones...
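
The concatenation step described above might look something like the following. This is only a sketch of the idea, not VLM's actual script: the directory layout and the function name `build_keyfile` are assumptions. It joins every file in a directory in sorted order, and reports whether the result differs from what was already on disk, which covers the "notice if anything suddenly changes" part.

```python
import os

def build_keyfile(source_dir, target_path):
    """Concatenate every file in source_dir (sorted by name) into target_path.

    Returns True if the target's contents changed, so the caller can warn.
    """
    pieces = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if os.path.isfile(path):
            with open(path) as handle:
                text = handle.read()
            # Make sure each entry ends with a newline so keys never run together.
            pieces.append(text if text.endswith("\n") else text + "\n")
    new_contents = "".join(pieces)

    old_contents = None
    if os.path.exists(target_path):
        with open(target_path) as handle:
            old_contents = handle.read()

    if new_contents != old_contents:
        with open(target_path, "w") as handle:
            handle.write(new_contents)
        return True
    return False
```

A call such as `build_keyfile("/path/to/repo/users", "/path/to/authorized_keys")` (both paths hypothetical) would then stand in for the `cat ... > ...` line.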

        I also enforce and distribute certain rules in my ssh configs this way. Google "secure secure shell" or whatever it's called for unusual amounts of advice WRT encryption algo suggestions etc. Some things inspired by that are part of my ssh updating script.

        I wonder if anyone has any better ideas for key distribution and monitoring and control. In my infinite spare time I've tried to think of a general system to do it.

        • (Score: 2) by NCommander on Tuesday July 14 2015, @08:39PM

          by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @08:39PM (#209076) Homepage Journal

          If you like pain, it's possible to use a GPG authentication key (type A vs. type SC, which is the normal one for signing/encryption) for SSH, and then be able to pull the key with standard GPG tools, as well as revoke it if necessary. It's been a long time since I fiddled with it since it required both gpg-agent and ssh-agent, and a fair number of arcane hacks to work. I'm not even sure I remember how I made it work.

          Anyway, if you're using Kerberos, aside from the host key, the system keytab has to have a valid principal in it, else the KRB5 server won't issue a TGT, which adds a second line of defense.

          Still always moving
  • (Score: 2) by tempest on Tuesday July 14 2015, @06:03PM

    by tempest (3050) on Tuesday July 14 2015, @06:03PM (#209012)

    I cringe any time OpenLDAP is brought up, although I've only tested it far enough to know I want nothing to do with it. Hesiod is one of those things I stumbled across in man pages, but never gave it much thought. I'm really going to give this a look. However, leveraging DNS this way sounds like a huge exposure to injection attacks without DNSCurve. That sounds like the solution is ready to go, but uptake of DNSCurve has been fairly low. For me I guess it's an issue with being so heavily invested with Unbound networking wise, but this has my interest piqued to give it a spin tinkering.

    • (Score: 2) by NCommander on Tuesday July 14 2015, @06:30PM

      by NCommander (2) Subscriber Badge <> on Tuesday July 14 2015, @06:30PM (#209027) Homepage Journal

      DNSCurve has the advantage that you don't need to sign the entire domain; just install curve as the frontend forwarding server (or servers), and make sure each client downstream has the resolver. We haven't deployed this, though I'm considering doing so, mostly because the only way you could inject false DNS data is to MITM one of our servers in the Linode data center. I don't know if DNSCurve properly handles DNS classes; some people may want to scope the Hesiod data and put it in the HS class vs the IN class.

      The alternative is DNSSECing the entire thing, and using locally validating resolvers. More pain, more standards conforming. I'm likely just going to sign the entire li694-22 domain which we use internally for node-to-node communication. Prevents me from having to fuck with the DNS setup on helium and boron.

      Still always moving
    • (Score: 2) by zeigerpuppy on Tuesday July 14 2015, @09:10PM

      by zeigerpuppy (1298) on Tuesday July 14 2015, @09:10PM (#209099)

      Thanks for the article, always nice to hear stories from the production mill (or maybe brick conveyor is more appropriate). The networks I administer are too small to make a centralized user management system necessary, but it's really nice to know the alternatives (esp. because I just thought I was clueless last time I played with OpenLDAP).

      • (Score: 2) by Yog-Yogguth on Saturday July 18 2015, @11:16AM

        by Yog-Yogguth (1862) Subscriber Badge on Saturday July 18 2015, @11:16AM (#210746) Journal

        Same here, loved reading the review and comments.

        Bite harder Ouroboros, bite! linux USB CD secure desktop IRC *crypt tor (not endorsements (XKeyScore))
  • (Score: 1) by ptman on Wednesday July 15 2015, @01:21PM

    by ptman (5676) on Wednesday July 15 2015, @01:21PM (#209358)

    I like LDAP. Mostly OpenLDAP, but I guess I would try FreeIPA for the next deployment. The reason is that lots of software can authenticate against LDAP. And by extending LDAP I've avoided adding lots of different databases that need to be kept in sync.

    • (Score: 2) by NCommander on Thursday July 16 2015, @04:07PM

      by NCommander (2) Subscriber Badge <> on Thursday July 16 2015, @04:07PM (#210005) Homepage Journal

      It's true most things can authenticate against LDAP directly; sudo-ldap immediately popped into my mind. However, a *lot* of things on UNIX-likes use either PAM or the name service switch to provide authentication, which means that whatever the backend is, it will "just work". As it stands, for a lot of things, I prefer Kerberos for handling password/authentication information because it can transparently follow you across a network. Sign in once, then SSH to another machine, and your authentication tokens automagically follow you.
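
The "just work" part is easy to see from any program that uses the standard C library lookups. As an illustration, Python's `pwd` module calls through to getpwnam(), so the answer comes from whatever sources /etc/nsswitch.conf lists for passwd, and the calling code never needs to know which backend answered. The helper name `lookup_user` is made up for this sketch.

```python
import pwd

def lookup_user(name):
    """Return (uid, home, shell) for a user via the system name service, or None."""
    try:
        entry = pwd.getpwnam(name)
    except KeyError:
        # No configured source (files, LDAP, Hesiod, ...) knows this user.
        return None
    return entry.pw_uid, entry.pw_dir, entry.pw_shell
```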

      Still always moving