We just learned that our VM provider, Linode, had to perform some emergency reboots. Three of our servers have already been taken care of, but more are still to come. This left our site unavailable for approximately an hour. Here is the reboot schedule:
Identified - Linode has received a Xen Security Advisory (XSA) that requires us to perform updates to our legacy Xen host servers. In order to apply the updates, hosts and the Linodes running on them must be rebooted. The XSAs will be publicly released by the Xen project team on July 26th. We must complete this maintenance before then.
Here's the schedule and status:
Server | Purpose | Status |
---|---|---|
lithium | Development | Completed |
magnesium | Frontend Proxy | Completed |
sodium | Frontend Proxy | Completed |
fluorine | Production Cluster | Completed |
helium | Production Cluster | Completed |
hydrogen | Production Cluster | Completed |
neon | Production Cluster | Completed |
beryllium | Services Cluster | Completed |
boron | Services Cluster | Completed |
We apologize for any inconvenience.
[Update: It appears the second round of reboots has completed successfully, and, thanks to the advance notice, the site stayed up throughout. We anticipate that the site will continue to operate normally through the last-scheduled reboot. Many thanks for your understanding and patience.]
[Update #2: We are taking advantage of a free offer from Linode, our hosting provider, to convert our VPSs (Virtual Private Servers) from Xen to KVM. The reboots were required to patch a Xen vulnerability. As a bonus, the Xen-to-KVM conversion gives us a free upgrade to twice as much memory, which will provide much-needed headroom on our servers and possibly a performance improvement. Thanks to our redundancy, the changes should not be noticeable when we reboot/upgrade, except for the IRC and e-mail servers, as they are single-hosted.]
[Update #3: Thanks to the tireless efforts of paulej72 well into the wee hours of this morning, with able assistance from audioguy in straightening out some IP issues, and with Deucalion and TheMightyBuzzard providing guidance and support, all but two of our Xen servers have been upgraded to KVM. This free upgrade doubled the amount of memory available to our VMs, giving us some much-needed headroom. That leaves beryllium (IRC and email) and boron (DNS, Hesiod name service) as the two servers that have not been upgraded yet. Date/time is TBD.]
[Update #4: Boron will be reconfigured shortly, and then beryllium after that. Plan on an hour or two, though obviously we'll try to keep the downtime to a minimum!]
[Update #5: Boron's second upgrade for the RAM sat in the queue for several hours, so beryllium had to wait until paulej72 got up and finished it this morning (0830 EDT).]
(Score: 0) by Anonymous Coward on Friday July 22 2016, @04:45AM
So you don't have to log in and launch stuff manually in a panic. If you forget a startup script we'll enjoy some extra downtime. You know, for kids!
(Score: 0) by Anonymous Coward on Friday July 22 2016, @08:56PM
This is my systemd configuration on Gentoo. I've found that it makes Linode automated reboots a breeze.
File: /etc/portage/package.mask
# Lennart Poettering
sys-apps/systemd
net-misc/networkmanager
media-sound/pulseaudio
(Score: 0) by Anonymous Coward on Friday July 22 2016, @10:00PM
I start so much stuff @reboot in root's crontab it's not even funny. OK fine the bare-bones webserver runs @reboot out of nobody's crontab instead. I wrote a lot of custom code and I'm too lazy to package it and I'm too lazy to write proper init scripts so I use cron @reboot instead.
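For anyone who hasn't used it, a minimal sketch of the cron-based startup described above; the paths and script names here are made up for illustration, not taken from the parent comment:

```
# root's crontab (edit with `crontab -e`): run once at boot, in place of init scripts
@reboot /usr/local/bin/start-my-daemon.sh

# nobody's crontab: an unprivileged bare-bones webserver, as the commenter describes
@reboot /home/nobody/bin/webserver --port 8080
```

The `@reboot` special string runs the command once each time the cron daemon starts, which on most systems means once per boot.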
(Score: 0) by Anonymous Coward on Friday July 22 2016, @04:50AM
All the larger Xen hosters got advance warning; mine allowed some input on when the host was going to get rebooted.
(Score: 2) by isostatic on Friday July 22 2016, @07:44AM
Linode told me too, on Wednesday 1230 GMT
(Score: 2) by Gravis on Friday July 22 2016, @05:01AM
Can thinking do? щ(゚Д゚щ)
(Score: 0) by Anonymous Coward on Friday July 22 2016, @12:57PM
noyes
(Score: 2) by mhajicek on Friday July 22 2016, @05:25AM
Is the funding goal bar on the main page broken, or is no one actually contributing?
The spacelike surfaces of time foliations can have a cusp at the surface of discontinuity. - P. Hajicek
(Score: -1, Troll) by Anonymous Coward on Friday July 22 2016, @06:17AM
I pledge never to contribute funding because I believe cyberbegging is immoral and also I secretly want to see this site die.
(Score: 0) by Anonymous Coward on Friday July 22 2016, @06:19AM
>secretly
Sure, scum.
(Score: -1, Troll) by Anonymous Coward on Friday July 22 2016, @06:55AM
Oooops did I tap that out loud. I hate soystain so much. Stop it fingers stop tapping my secrets.
(Score: 0) by Anonymous Coward on Friday July 22 2016, @09:00PM
Kill yourself.
(Score: 0) by Anonymous Coward on Friday July 22 2016, @12:59PM
I see we found Buzzards scorned, ex, gay lover.
(Score: 2) by The Mighty Buzzard on Friday July 22 2016, @01:26PM
Nah, nobody knows it's really you. Oops, damn.
My rights don't end where your fear begins.
(Score: 2) by Tork on Saturday July 23 2016, @12:21AM
🏳️🌈 Proud Ally 🏳️🌈
(Score: 2) by The Mighty Buzzard on Saturday July 23 2016, @12:40AM
Scorned means I rejected him. Let that be a lesson to anyone fool enough to drink the last beer.
My rights don't end where your fear begins.
(Score: 2) by Tork on Saturday July 23 2016, @12:41AM
My bad!
🏳️🌈 Proud Ally 🏳️🌈
(Score: 1, Troll) by Azuma Hazuki on Friday July 22 2016, @06:24PM
No one could love him.
I am "that girl" your mother warned you about...
(Score: 0) by Anonymous Coward on Friday July 22 2016, @08:25PM
Buzzards scorned ex-gay lover turned straight but doesn't know yet the new hottie is a trans.
(Score: 2) by The Mighty Buzzard on Friday July 22 2016, @08:41PM
Yeah, I kinda felt bad that after me no other man could compare. That's the risk you take though.
My rights don't end where your fear begins.
(Score: 1) by kurenai.tsubasa on Friday July 22 2016, @08:48PM
Oops, my bad. Thought everybody knew. Should I give him “the talk” or would it be better if you let him know?
(Score: -1, Troll) by Anonymous Coward on Friday July 22 2016, @04:01PM
cyberfag
(Score: 4, Informative) by JNCF on Friday July 22 2016, @07:30AM
It's not broken, it just updates less frequently than you're expecting.
Effective: 2016-June to 2016-December
Updated: 2016-07-03
It will move in one big chunk to show recent contributions... when it feels like it.
(Score: 3, Informative) by The Mighty Buzzard on Friday July 22 2016, @10:28AM
What JNCF said. mrcoolbp updates it something like once a month manually.
My rights don't end where your fear begins.
(Score: 3, Informative) by isostatic on Friday July 22 2016, @07:46AM
Why not migrate to KVM? From the linode email:
Upgrading to KVM will allow you to avoid this maintenance entirely. You can use the “Upgrade to KVM” link in your Linode’s dashboard to move to KVM. Please note that KVM upgrades are not available in Tokyo at this time. More KVM upgrading information can be found here:
(Score: 0) by Anonymous Coward on Friday July 22 2016, @09:50AM
Switching to KVM changes a few things (like device paths) which could break the Linode.
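A concrete example of the kind of breakage meant here, as a hypothetical fstab rather than SoylentNews's actual config: Xen paravirtualized disks typically appear as /dev/xvd*, while KVM virtio disks appear as /dev/vd*, so a hard-coded device path stops matching after the switch.

```
# /etc/fstab before (Xen) -- this entry matches no device under KVM:
#/dev/xvda    /    ext4    errors=remount-ro    0 1

# After (KVM virtio):
/dev/vda      /    ext4    errors=remount-ro    0 1

# Referring to the filesystem by UUID sidesteps the renaming entirely:
#UUID=<filesystem-uuid>    /    ext4    errors=remount-ro    0 1
```

The same renaming can bite the bootloader configuration and any scripts that reference disk devices by path.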
(Score: 2) by The Mighty Buzzard on Friday July 22 2016, @11:23AM
Free time and the desire to debug have both been short.
My rights don't end where your fear begins.
(Score: 2) by bziman on Friday July 22 2016, @02:52PM
Thank you! I'm already on the KVM system, but I didn't know that they were offering free upgrades to 2x original memory, so I just logged in and got my free upgrade. Fantastic!
(Score: 0) by Anonymous Coward on Friday July 22 2016, @02:40PM
The new hotness: Dedicated hosting.
For example:
https://www.1and1.com/server-dedicated-tariff?__lf=Order-Product#stage-end [1and1.com]
That L4i option would, for example, pummel your Linode mercilessly into submission (performance wise) while providing you full control over your uptime. You'd also be exempted from strange and often undiscovered classes of cross-VM vulns.
Join the revolution!
(Score: 0) by Anonymous Coward on Friday July 22 2016, @03:04PM
And before anyone says "b-b-b-b-b-but look at the server list! we got almost ten dev/production/test/proxy machines!" I say: You wouldn't need em with the ole L4i. Trust me.
(Score: 2, Funny) by mechanicjay on Friday July 22 2016, @06:49PM
Fuck that. We should be porting the whole thing to a series of docker containers.
My VMS box beat up your Windows box.
(Score: 0) by Anonymous Coward on Friday July 22 2016, @08:37PM
Docker is for brogrammers who want to impress the team. Real hackers work alone in the basement with naked bare metal.
(Score: 4, Insightful) by tibman on Friday July 22 2016, @07:55PM
Dedicated hosting is a throwback to ~2002. When a fan fails, your server goes down for two days waiting for an underpaid tech to pull it from the rack and fix it, lol.
SN won't survive on lurkers alone. Write comments.
(Score: 3, Interesting) by archfeld on Sunday July 24 2016, @04:22AM
Speaking as an underpaid tech, I relish when something this basic and fixable happens. It sure beats the shell out of endless project meetings dominated by moron PMs and retarded managers. One of the greatest feelings is when I get a text message during one of those forever meetings that a simple disk or power supply or fan has failed, and I can excuse myself and run gratefully into the cool quietness of the server farm or lab space to do some actual work.
For the NSA : Explosives, guns, assassination, conspiracy, primers, detonators, initiators, main charge, nuclear charge
(Score: 2) by opinionated_science on Friday July 22 2016, @04:01PM
Nice to see some proactive security patching, rather than the media panic that is usually the opening volley...
Of course, it makes you wonder how many exploits are still present but have not been found/released....
(Score: 3, Funny) by Runaway1956 on Friday July 22 2016, @04:03PM
~blame
“I have become friends with many school shooters” - Tampon Tim Walz
(Score: 2) by archfeld on Friday July 22 2016, @09:20PM
It was down for about an hour; I haven't verified via the router logs yet, but everything seems to be happy now.
For the NSA : Explosives, guns, assassination, conspiracy, primers, detonators, initiators, main charge, nuclear charge
(Score: 2) by maxwell demon on Friday July 22 2016, @09:29PM
Hydrogen has a star after "completed". Why?
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by martyb on Friday July 22 2016, @10:23PM
Short answer: Sleep deprivation; fixed.
Long answer: I intended to use the [*] to flag the last remaining server that had been slated for a reboot but no longer needed one. It got rebooted anyway when we took advantage of a double-your-memory-for-free offer for converting our hosting from Xen to KVM; the conversion required a reboot, and once it was done there was no longer any need to reboot that server for a Xen fix, because it was no longer on Xen. In the course of updating the story to keep the community informed, I accidentally left in the star and omitted the footnote. The star is now removed.
Thanks for the hawk eyes!
Wit is intellect, dancing.
(Score: 1) by cngn on Friday July 22 2016, @10:51PM
Keep up the good work guys, we need more selfless people on the net, and I'm grateful to you.
If ever one of you is in Sydney where I am drop me a line we can always meet up and I'll buy the beer.
(Score: 0) by Anonymous Coward on Friday July 22 2016, @11:14PM
If you don't mind me asking, why were the reboots spread over several hours rather than doing them around the same time?
(Score: 2) by martyb on Saturday July 23 2016, @03:18AM
Apparently our servers (VMs) are not all on the same physical server / rack / whatever unit gets rebooted or reloaded, so some go down at the same time and others don't.
Point is, they schedule and we try to work around it. We just didn't see the notice until after the reboots started. :(
Hope that helps!
Wit is intellect, dancing.
(Score: 0) by Anonymous Coward on Saturday July 23 2016, @05:53PM
Some VPS providers also spread out customers for a few reasons. The big two are load balancing, like a compute server sharing space with a storage node, and preventing random failures from taking out a big customer's multiple machines at once, similar to why they don't reboot all machines at once even if they theoretically could.
(Score: 2) by martyb on Monday July 25 2016, @03:37PM
Excellent points! Thanks to our having redundancy in our configuration, with sufficient advance notice, we can deal with a server or two going down without issue. Were all of our VPSs hosted on the same physical machine, we'd lose that ability. It is actually to our benefit to have things spread out across multiple physical servers. Sure, there are degenerate cases, but there is at least the possibility that in many cases we can remain up even with some number of our servers down.
Wit is intellect, dancing.
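The benefit martyb describes can be put in rough numbers. A toy back-of-the-envelope sketch (the per-host failure probability is an invented figure, not site data): with redundant replicas spread across independent hosts, a total outage requires every host to fail at once, so the odds shrink multiplicatively.

```python
# Assumed per-host failure probability -- illustrative only, not measured.
p_host_down = 0.01

# Both replicas on one physical host: a single host failure downs the service.
p_outage_same_host = p_host_down

# Replicas on two independent hosts: both must fail simultaneously.
p_outage_spread = p_host_down ** 2

print(p_outage_same_host)  # 0.01
print(p_outage_spread)     # ~0.0001, i.e. 100x less likely
```

This assumes host failures are independent, which scheduled maintenance windows (the degenerate case mentioned above) can violate.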
(Score: 2) by Snotnose on Sunday July 24 2016, @04:06AM
OK, I'm a lowly embedded device driver and linux kernel guy who doesn't know squat about making webpages like this one. But it takes 9 servers to handle this site? WTF? What the hell are they all doing?
Not trying to be sarcastic or anything, I honestly don't get it. I've written software using embedded linux that did more than I imagine this website does, but my stuff all ran on a single CPU with path to RAM and HDD storage.
Bad decisions, great stories
(Score: 2) by NCommander on Sunday July 24 2016, @06:04AM
We actually get a lot of traffic and a lot of requests per visit, and right now run at about 40-50% of capacity between the two of them. The main servers are in a 2x2 configuration, allowing us to take one offline without taking down the entire site. Magnesium is our current front-end load balancer between the two, doing SSL termination, with sodium as a hot standby (these are Linode 1024s, or were). Lithium is an independent development site, and IRC/wiki/etc. are isolated on separate servers, as rehash can't easily co-exist with other things on the same box.
Still always moving
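The front-end arrangement NCommander describes (SSL terminated at the load balancer, requests spread across the web frontends) could be sketched as an nginx config; the names, IPs, and paths below are purely illustrative, not SoylentNews's actual setup:

```
# Hypothetical nginx fragment on the magnesium-style proxy
http {
    upstream web_frontends {
        server 10.0.0.11:80;   # first production frontend
        server 10.0.0.12:80;   # second production frontend
    }

    server {
        listen 443 ssl;                       # SSL terminated here, at the proxy
        ssl_certificate     /etc/ssl/site.crt;
        ssl_certificate_key /etc/ssl/site.key;

        location / {
            proxy_pass http://web_frontends;  # balance across the pair
        }
    }
}
```

The sodium-style hot standby would be a second box carrying the same config, brought in by failing over the public IP; that part lives outside nginx itself.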