from the from-the-crack-team-of-flying-monkeys dept.
Our nodes are named after elements of the periodic table, starting with hydrogen and working upward, roughly in the order they were brought online. With two exceptions, we're standardized on Ubuntu 12.04 Precise Pangolin. Nodes dedicated to running slashcode are Linode 4096s; everything else is a Linode 2048, thanks to Linode's recent free upgrade.
Where possible, all services (with the exception of MySQL) are highly available and can survive any node suddenly flaking out. This includes our internal DNS, LDAP, Kerberos, gluster, web frontends, and slashd*. Our goal is a 100% HA configuration, so we can take nodes offline or upgrade systems without any interruption in service, though we're still somewhat short of that (mostly due to limitations in MySQL).
User management and single sign-on are handled by a combination of Kerberos and LDAP, with users' SSH keys stored in the LDAP backend and a bit of voodoo to load them dynamically whenever staff wish to access a machine. Service accounts (e.g., slash or icinga) use Kerberos keytabs for passwordless authentication, which lets us centrally revoke and replace any compromised keys instead of playing the age-old game of editing authorized_keys in 20 places.
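The article doesn't spell out the "voodoo", but on newer OpenSSH (6.2 and later) the usual pattern is an AuthorizedKeysCommand helper that pulls a user's public keys out of LDAP at login time; older releases relied on distro patches or periodically synced key files instead. A minimal sketch of that pattern, with an invented base DN and the sshPublicKey attribute from the common openssh-lpk schema (not SN's actual configuration):

```
# /etc/ssh/sshd_config (excerpt) -- requires OpenSSH 6.2+
AuthorizedKeysCommand /usr/local/sbin/ldap-ssh-keys
AuthorizedKeysCommandUser nobody

# /usr/local/sbin/ldap-ssh-keys -- print a user's public keys from LDAP.
# Base DN and attribute name are illustrative.
#!/bin/sh
exec ldapsearch -x -LLL -b "ou=people,dc=li694-22" "uid=$1" sshPublicKey \
  | sed -n 's/^sshPublicKey: //p'
```

With this in place, removing a key from the LDAP entry revokes it everywhere at once, which is the centralized-revocation property described above.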
Furthermore, we use AppArmor quite extensively internally to try to keep ourselves relatively well protected. It's no secret that we're currently stuck on an outdated Perl and Apache which no longer receive security updates. While we have plans to work through this and migrate to mod_perl 2, the frontend is horribly tied to Apache (including hooks into various stages of the httpd lifecycle). I plan to run a dedicated article about this, but let's just say it's a bit in-depth.
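For readers unfamiliar with AppArmor, a profile whitelists exactly which files and capabilities a binary may use, which limits the blast radius of a compromised daemon. A heavily simplified, purely illustrative sketch for an Apache install (all paths and rules are invented, not SN's actual profile):

```
# /etc/apparmor.d/usr.sbin.apache2 -- illustrative sketch only
#include <tunables/global>

/usr/sbin/apache2 {
  #include <abstractions/base>
  #include <abstractions/nameservice>

  capability net_bind_service,
  capability setuid,
  capability setgid,

  /usr/sbin/apache2 mr,
  /etc/apache2/** r,
  /srv/slashcode/** r,
  /var/log/apache2/* w,
}
```

Anything not listed (say, reading /etc/shadow or spawning a shell) is denied and logged, even if the worker process itself is compromised.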
The li694-22 Domain
I've mentioned this in comments, and it's on the wiki as well, but we use an internal gTLD for referencing nodes throughout the backend: every node can reach every other at hostname.li694-22. The name itself is a reference to the original private URL we used for bringing up Slashcode, way back before SN was decided on as our temporary name. We have full forward and reverse resolution available, and only publish AAAA records for normal services. Oh yeah, about that ...
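A hedged sketch of what such an internal forward zone might look like in BIND syntax — the addresses, serial, and ULA prefix here are invented for illustration, and SN's real zone will differ:

```
; db.li694-22 (illustrative) -- forward zone for the internal gTLD;
; only AAAA records are published for normal services
$ORIGIN li694-22.
$TTL 3600
@        IN SOA  hydrogen.li694-22. hostmaster.li694-22. (
                 2014051901 3600 900 604800 300 )
         IN NS   hydrogen.li694-22.
hydrogen IN AAAA fd00:db8:c0de::1
helium   IN AAAA fd00:db8:c0de::2
```

A matching ip6.arpa zone with PTR records would provide the reverse resolution mentioned above.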
Use of IPv6 internally
Yeah, we were serious when we axed IPv4 internally. Since that article was written, we've had to re-introduce IPv4 addressing for the internal webservices (via ipv4.hostname.li694-22) due to compatibility issues with gluster. Using IPv6 internally allows Kerberos and other IP-dependent services to work properly from multiple places across the internet, such as our off-site backup box.
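The gluster workaround described above amounts to publishing a parallel A record under an ipv4.* label, so v4-only software has something to connect to while everything else keeps using the AAAA record. A sketch with illustrative addresses:

```
; normal services resolve over IPv6 only ...
boron      IN AAAA fd00:db8:c0de::5
; ... but gluster-facing clients get an explicit IPv4 name
ipv4.boron IN A    192.0.2.14
```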
Anyway, enough of that; let's take a look at the machines themselves:
- hydrogen/fluorine - web frontends
- helium/neon - database backends
- beryllium - wiki host + mail accounts; runs CentOS 6
- boron - gluster+slashd
- carbon - IRC server
- nitrogen - tor proxy (also runs staff slash)
- oxygen - off-site backup
- lithium - dev.soylentnews.org, running Ubuntu 14.04
As you can tell, it's quite a bit of virtual iron that keeps this site up and running. We've got considerable excess capacity at the moment, so I'm not too worried about having to bring up additional frontends any time soon, and we're trying to ensure that even if half our web/DB servers went offline, we'd still remain up and functional. Perhaps it's a bit of overkill, but you never know when you'll need to take a node offline.
The next article will go somewhat in-depth into the system administration aspects, with a hands-on look at our Kerberos, LDAP, and Icinga instances, including a brief overview of each of these technologies in turn. Many who have worked on staff had no previous experience with Kerberos in a UNIX-like environment, which I consider unfortunate, since it can drastically simplify administration burdens. Drop your questions below, and I'll either answer them inline or later in this series of articles. Until next time, NCommander, signing off.