Meta
posted by martyb on Tuesday October 20 2015, @11:12AM   Printer-friendly
from the wish-us-luck! dept.

Hello fellow Soylentils!

[Update:] We survived all three days of reboots without major issues. Many thanks to all who prepped the systems, prodded things along, and were on standby to deal with any unforeseen issues!

We were informed by Linode (our hosting provider) that they needed to perform some maintenance on their servers. This forces a reboot of our virtual servers, which may cause the site (and other services) to be temporarily unavailable.

Here is the three-day reboot schedule along with what runs on each server:

Status  Day  Date        Time      Server     Affects
Done    Tue  2015-10-20  0200 UTC  boron      DNS, Hesiod, Kerberos, Staff Slash
Done    Tue  2015-10-20  0500 UTC  beryllium  IRC, MySQL, Postfix, Mailman, Yourls
Done    Wed  2015-10-21  0500 UTC  sodium     Primary Load Balancer
Done    Wed  2015-10-21  0500 UTC  magnesium  Backup Load Balancer
Done    Wed  2015-10-21  0700 UTC  neon       Production Back End, MySQL NDB cluster
Done    Thu  2015-10-22  0200 UTC  hydrogen   Production Front End, Varnish, MySQL, Apache, Sphinx
Done    Thu  2015-10-22  0500 UTC  helium     Production Back End, MySQL NDB, DNS, Hesiod, Kerberos
Done    Thu  2015-10-22  0900 UTC  fluorine   Production Front End, slashd, Varnish, MySQL, Apache, ipnd
Done    Thu  2015-10-22  1000 UTC  lithium    Development Server, slashd, Varnish, MySQL, Apache

We apologize in advance for any inconvenience and appreciate your understanding as we try to get things up and running following each reboot.


Original Submission

This discussion has been archived. No new comments can be posted.
  • (Score: 2) by NCommander on Monday October 19 2015, @06:27PM

    by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Monday October 19 2015, @06:27PM (#251929) Homepage Journal

This should have limited impact on the site itself; automatic IP failover with Linode is a bit dodgy, and it doesn't work with IPv6, so when our LBs go down, the site is going to go down. soylentnews.org otherwise has no single point of failure, so other than decreased performance, only our secondary services should be impacted by this.

    --
    Still always moving
    • (Score: 2) by isostatic on Monday October 19 2015, @07:41PM

      by isostatic (365) on Monday October 19 2015, @07:41PM (#251974) Journal

      Isn't the fact you've got both load balancers going down at the same time an issue?

      Wed 2015-10-21 5:00:00 AM UTC sodium Primary Load Balancer
      Wed 2015-10-21 5:00:00 AM UTC magnesium Backup Load Balancer

      • (Score: 2) by NCommander on Monday October 19 2015, @08:37PM

        by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Monday October 19 2015, @08:37PM (#252017) Homepage Journal

It's more complicated than that. The secondary load balancer is a manual failover, not automatic. Heartbeat and other hot-failover solutions don't appear to work on Linode's internal network. The secondary is mostly there for when we need to take the primary load balancer offline for an extended period.

        Furthermore, Linode still doesn't support IPv6 failover, and about 10% of site traffic is v6, which means the only way to redirect that traffic is to change the AAAA record.
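        A minimal sketch of scripting that AAAA swap, assuming a nameserver that accepts RFC 2136 dynamic updates (Linode's DNS manager has its own interface, so this is illustrative only); the TSIG key, server address, and target address below are all made up:

```python
# Repoint an AAAA record via an RFC 2136 dynamic update (dnspython).
# The TSIG key, nameserver address, and target address are placeholders.
import dns.query
import dns.tsigkeyring
import dns.update

keyring = dns.tsigkeyring.from_text({
    "update-key.": "bWVyZWx5IGFuIGV4YW1wbGUga2V5Cg==",  # placeholder secret
})

update = dns.update.Update("soylentnews.org", keyring=keyring,
                           keyname="update-key.")
# Replace the record so v6 traffic follows the backup load balancer.
update.replace("www", 300, "AAAA", "2001:db8::2")  # hypothetical address

response = dns.query.tcp(update, "203.0.113.53")   # hypothetical nameserver
print(response.rcode())  # 0 (NOERROR) on success
```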

        --
        Still always moving
        • (Score: 2) by sjames on Tuesday October 20 2015, @09:26PM

          by sjames (2882) on Tuesday October 20 2015, @09:26PM (#252477) Journal

          Reboot magnesium, verify it, do the failover (reversing their roles), let the DNS propagate, and then reboot sodium (now the secondary)?
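          That ordering can be written down as a sketch; reboot(), healthy(), and promote() below are hypothetical stand-ins for steps the admins would perform by hand:

```python
# The rolling reboot proposed above, as pseudocode made runnable.
import time

def reboot(host): print(f"rebooting {host}")          # stub
def healthy(host): return True                        # stub health probe
def promote(host): print(f"{host} is now primary")    # stub DNS/AAAA swap

def rolling_lb_reboot(primary="sodium", secondary="magnesium", dns_ttl=300):
    reboot(secondary)                # 1. reboot the idle backup first
    while not healthy(secondary):    # 2. verify it came back up cleanly
        time.sleep(10)
    promote(secondary)               # 3. fail over, reversing the roles
    time.sleep(dns_ttl)              # 4. let the DNS change propagate
    reboot(primary)                  # 5. the old primary is now the idle one

rolling_lb_reboot()
```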

  • (Score: 2, Interesting) by jon3k on Monday October 19 2015, @06:36PM

    by jon3k (3718) Subscriber Badge on Monday October 19 2015, @06:36PM (#251931)

    What's the point of "the cloud" and all this redundancy when you still have reboots that cause outages?

    • (Score: 0) by Anonymous Coward on Monday October 19 2015, @06:46PM

      by Anonymous Coward on Monday October 19 2015, @06:46PM (#251938)

      At least they don't take weekends off (no new stories for 2 days) like that other site does.

      • (Score: 1, Flamebait) by wonkey_monkey on Monday October 19 2015, @06:49PM

        by wonkey_monkey (279) on Monday October 19 2015, @06:49PM (#251940) Homepage

        like that other site does.

        What, Slashdot? You did mean Slashdot, didn't you?

        Huh. Look at that. I said Slashdot twice (three times now) and nothing bad happened.

        --
        systemd is Roko's Basilisk
        • (Score: 2, Funny) by Anonymous Coward on Monday October 19 2015, @06:55PM

          by Anonymous Coward on Monday October 19 2015, @06:55PM (#251943)

          Does Commander Taco pop up behind you and murder you?

        • (Score: 4, Funny) by DECbot on Monday October 19 2015, @07:03PM

          by DECbot (832) on Monday October 19 2015, @07:03PM (#251946) Journal

          You said the word that the knights of Soylent cannot stand to hear. You said the word again! Stop saying the Worrrd!

          --
          cats~$ sudo chown -R us /home/base
        • (Score: 5, Funny) by maxwell demon on Monday October 19 2015, @07:13PM

          by maxwell demon (1608) on Monday October 19 2015, @07:13PM (#251952) Journal

          and nothing bad happened.

          Say you. But the moment you typed that word, a freak wormhole opened up in the fabric of the space-time continuum and carried these words far far back in time across almost infinite reaches of space to a distant Galaxy where strange and warlike beings were poised on the brink of frightful interstellar battle.
          The two opposing leaders were meeting for the last time.

          A dreadful silence fell across the conference table as the commander of the Vl'Hurgs, resplendent in his black jewelled battle shorts, gazed levelly at the G'Gugvuntt leader squatting opposite him in a cloud of green sweet-smelling steam, and, with a million sleek and horribly beweaponed star cruisers poised to unleash electric death at his single word of command, challenged the vile creature to take back what it had said about his mother.

          The creature stirred in his sickly broiling vapour, and at that very moment the word naming that other site drifted across the conference table.

          Unfortunately, in the Vl'Hurg tongue this was the most dreadful insult imaginable, and there was nothing for it but to wage terrible war for centuries.

          Congratulations, you started an interstellar war.

          --
          The Tao of math: The numbers you can count are not the real numbers.
          • (Score: 0) by Anonymous Coward on Monday October 19 2015, @08:36PM

            by Anonymous Coward on Monday October 19 2015, @08:36PM (#252015)

            Ha! I had a 6-hour drive last Thursday and listened to the original radio play version of HHGTTG. Hadn't heard it in years; still very good! A friend digitized and de-noised my original set (12 half-hour episodes), which I originally taped off the local NPR FM station (in the USA).

            Note that this is different/better than the commonly available radio play which was remade at some point. We suspect (but don't know for sure) that the original version contains material/music that is (C) by others, which the BBC didn't want to license for public sale...

            • (Score: 2) by wonkey_monkey on Monday October 19 2015, @09:48PM

              by wonkey_monkey (279) on Monday October 19 2015, @09:48PM (#252068) Homepage

              Note that this is different/better than the commonly available radio play which was remade at some point.

              It wasn't remade, as far as I can ascertain, but bits of episode three were cut because Marvin hums/sings copyrighted tunes(!). Various releases have had the opening theme replaced with a re-recording, and/or been otherwise remastered, but I can't see that they've ever gone back and remade any of it.

              Also the original commercial releases had their pitch altered slightly by mistake.

              --
              systemd is Roko's Basilisk
        • (Score: 3, Funny) by DeathMonkey on Monday October 19 2015, @07:17PM

          by DeathMonkey (1380) on Monday October 19 2015, @07:17PM (#251956) Journal

          Bloody Mary only comes out at night. (Won't) see you tomorrow!

        • (Score: 2) by isostatic on Monday October 19 2015, @08:10PM

          by isostatic (365) on Monday October 19 2015, @08:10PM (#251997) Journal

          You know, that's what Hermione used to say, before they put a Taboo on the name and Death Eaters trapped her in an alley and raped her.

          Do you really want Dice to do that?

        • (Score: 2) by kurenai.tsubasa on Tuesday October 20 2015, @03:05PM

          by kurenai.tsubasa (5227) on Tuesday October 20 2015, @03:05PM (#252320) Journal

          Candlejac^hj%$#@+++NO CARRIER

      • (Score: 2) by takyon on Monday October 19 2015, @08:23PM

        by takyon (881) <takyonNO@SPAMsoylentnews.org> on Monday October 19 2015, @08:23PM (#252009) Journal

        I just looked at Slashdot and they had stories this weekend. Is it random weekends?

        --
        [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
        • (Score: 2) by bryan on Tuesday October 20 2015, @12:01AM

          by bryan (29) <bryan@pipedot.org> on Tuesday October 20 2015, @12:01AM (#252114) Homepage Journal

          The AC was likely referring to the other other site. The one run by that shady character which, heaven forbid, did have a 2-day gap in the stories.

    • (Score: 3, Informative) by isostatic on Monday October 19 2015, @07:57PM

      by isostatic (365) on Monday October 19 2015, @07:57PM (#251990) Journal

      The cloud means many things. If you're trying to get funding from a PHB, it means a VM that you run on your laptop (a private mobile cloud to leverage the synergies of multi-modal value-adding fusion!).

      Then there's traditional hosting, but virtualised, where it's cheaper to buy a VM from someone who can benefit from economies of scale than to run your own servers in a colo. There's no real difference between SN having 9 real HP DL360s using up 9RU and 18 network ports in someone's rack and SN having 9 VMs, other than the VMs having less unexpected downtime (hardware failures) and more known downtime (VM security upgrades, in this case).

      Then there's what I'd call real cloud, which is designing your entire software stack end-to-end to avoid any single point of failure, and having it automatically scale and heal itself (and kill bits of itself too, to avoid nasty surprises down the line). This means having multiple servers in multiple physical locations, ideally with multiple providers. You have 3 or 4 machines running on Amazon's east coast, 3 or 4 on Google's west coast, a few on Linode in Singapore, and a couple in Rackspace in London.

      DNS (distributed across multiple servers as normal) points to any of those locations that have "checked in" recently, with a short TTL, and is answered by load balancers. If a location fails to check in, the record is removed and you're back up in a few seconds.
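      A toy version of that check-in loop; the endpoints here are invented, and publish_records() stands in for whatever DNS provider API a real deployment would call:

```python
# Poll each location's health endpoint and publish records only for
# the ones that "checked in". All addresses are documentation ranges.
import time
import urllib.request

LOCATIONS = {
    "aws-east":  "198.51.100.10",   # hypothetical per-location LBs
    "gcp-west":  "198.51.100.20",
    "linode-sg": "198.51.100.30",
}

def checked_in(ip, timeout=2):
    """Treat an answering /health endpoint as a recent check-in."""
    try:
        urllib.request.urlopen(f"http://{ip}/health", timeout=timeout)
        return True
    except OSError:
        return False

def publish_records(name, ips, ttl):
    """Stub: a real version would push the record set to a DNS API."""
    print(f"{name} {ttl} IN A {' '.join(ips) or '(none live!)'}")

while True:
    live = [ip for ip in LOCATIONS.values() if checked_in(ip)]
    publish_records("www.example.org", live, ttl=30)
    time.sleep(15)   # short TTL + frequent polling => failover in seconds
```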

      Tasks are spread across the machines, and you have a distributed management layer that can spin up new servers as load increases and shut them down as load decreases. If you had 3 webservers, for example, you might need more as the number of concurrent hits on your installation heads up to, say, 600, so you spin up a couple more. As it drops back to 200, you drop two servers off.
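      As a toy control loop, with spin_up() and shut_down() standing in for provider API calls (nothing here is a real provider's interface), that rule looks like:

```python
# The 600-up / 200-down rule above; the gap between the two
# thresholds acts as hysteresis so the pool doesn't flap.
SCALE_UP_AT, SCALE_DOWN_AT, MIN_SERVERS = 600, 200, 3

def spin_up():
    print("booting a fresh webserver")      # stub provider call
    return f"web-{id(object()):x}"          # unique-ish node name

def shut_down(node):
    print(f"draining and stopping {node}")  # stub provider call

def rebalance(concurrent_hits, servers):
    if concurrent_hits >= SCALE_UP_AT:
        servers += [spin_up(), spin_up()]   # "spin up a couple more"
    elif concurrent_hits <= SCALE_DOWN_AT and len(servers) > MIN_SERVERS:
        for node in servers[MIN_SERVERS:]:  # drop the extras back off
            shut_down(node)
        servers = servers[:MIN_SERVERS]
    return servers

pool = rebalance(650, ["web-1", "web-2", "web-3"])   # grows to five
pool = rebalance(180, pool)                          # shrinks back to three
```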

      If a server dies (power outage, software update, etc.), that's fine, as new ones spin up automatically in 20 seconds in another area, and depending on your budget you'll overprovision. In the example above, if Amazon goes titsup, a new server is automatically created on Linode and the service continues. If an earthquake hits the west coast, traffic is diverted in seconds to the other 3 locations.

      At least that's what I understand as cloud; open to any dissenting views. After a few years in broadcast, mainly using old-school IT (real iron, with most of the kit having physical SDI, GPI, and audio interfaces) as a small part of my job, I'm now back in a department that is solely infrastructure and devops. I'm trying to learn enough about things like EC2, OpenStack, Docker, Puppet, Ansible, RabbitMQ, etc. to understand the best way to go about things.

    • (Score: 3, Informative) by NCommander on Monday October 19 2015, @08:39PM

      by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Monday October 19 2015, @08:39PM (#252018) Homepage Journal

      The cloud is cheaper for small businesses who can't afford a colo solution or similar, or who need additional processing power on demand. I've always been skeptical of moving from a private server room to the cloud, but as SN has no physical assets, we don't have much of a choice.

      --
      Still always moving
  • (Score: 2) by Subsentient on Monday October 19 2015, @09:02PM

    by Subsentient (1111) on Monday October 19 2015, @09:02PM (#252031) Homepage Journal

    I don't understand why SN needs so many servers. It's not a particularly *high traffic* site, nor is it running some giant computational framework, so I don't know why we'd have more than 3 at the very most.

    --
    "It is no measure of health to be well adjusted to a profoundly sick society." -Jiddu Krishnamurti
    • (Score: 5, Informative) by NCommander on Monday October 19 2015, @09:23PM

      by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Monday October 19 2015, @09:23PM (#252044) Homepage Journal

      Redundancy. With the exception of the load balancer, we can pull the plug on any machine and the site stays up and functioning. At a minimum, that requires two web front-ends and two databases. We've had machines go out of service for an extended period (hydrogen was down for a long stretch, ultimately requiring a full rebuild). Being able to repair a server without the stress of knowing the site is completely down is worth its weight in gold.

      The rest has been driven by the fact that you can only cram so much into 2-4 GiB of RAM. In total, five machines drive the site normally: one load balancer, two web frontends, and two DB backends. The rest are the mail+IRC server, an independent development server, and a misc services box for things like Tor.
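      As a toy illustration of that redundancy (the server names come from the schedule above; the "down" set is invented for the demo), losing any one box still leaves each tier serving:

```python
# With two nodes per tier, any single machine can drop without taking
# the site down; only losing a whole tier is fatal.
import random

TIERS = {
    "frontend": ["hydrogen", "fluorine"],   # Varnish/Apache boxes
    "backend":  ["neon", "helium"],         # MySQL NDB cluster nodes
}
DOWN = {"hydrogen"}   # pretend hydrogen is out for its rebuild

def pick(tier):
    live = [h for h in TIERS[tier] if h not in DOWN]
    if not live:
        raise RuntimeError(f"entire {tier} tier down: site offline")
    return random.choice(live)

print(pick("frontend"), pick("backend"))   # e.g. "fluorine neon"
```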

      --
      Still always moving
    • (Score: 2) by VLM on Tuesday October 20 2015, @12:56PM

      by VLM (445) on Tuesday October 20 2015, @12:56PM (#252269)

      At $workplace I enjoy being able to halt DB server #7 with transparent failover, clone it in the NAS and rename the clone to "test DB", upgrade or otherwise F around with the test DB, then swap it in for DB #2 in production once I trust the changes, or delete the test DB image and start over; then more or less do the same with Puppet, then unleash the puppet.
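      A rough sketch of that clone-test-swap cycle; every helper below is a hypothetical stand-in for NAS/hypervisor tooling, not VLM's actual stack:

```python
# Clone a halted production DB, experiment on the clone, and either
# promote it into production or throw it away and start over.
def halt(db): print(f"halting {db} (failover keeps production serving)")
def snapshot(db): print(f"NAS clone of {db}"); return f"{db}.img"
def clone_from(img, name): print(f"{name} created from {img}"); return name
def upgrade(db): print(f"upgrading / F-ing around with {db}")
def trusted(db): return True                  # stands in for manual sign-off
def swap_into_prod(db, slot): print(f"{db} now serving as {slot}")
def destroy(db): print(f"deleting {db}")

def test_then_swap(source="DB7", slot="DB2"):
    halt(source)
    image = snapshot(source)
    test_db = clone_from(image, name="test-DB")
    upgrade(test_db)
    if trusted(test_db):
        swap_into_prod(test_db, slot)   # swap it in for DB #2
    else:
        destroy(test_db)                # delete the image and start over

test_then_swap()
```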

      You could integrate it into one box vertically, but things get complicated when you mix multiple people doing multiple things on multiple projects and then do IP address level operations to swap test/dev/prod all around.

      At legacy companies and sites, the middlemen get in the way of virtual stuff just as much as they used to in the physical era, but cloud-i-ness doesn't have to be as screwed up as the old days. It's "no big deal" to spin up virtual servers as part of day-to-day operations, unless the legacy middlemen are still standing in the way and try to turn creating a simple little image into some kind of insane capex purchase project.

      A good analogy from the old days, when I worked in a dinosaur pen: purchasing more mainframe DASD was a major departmental project, but this is the era of a secretary picking up a box of blank floppy disks on the way to work, so there are very different mindsets in how you swap things around or otherwise operate.

  • (Score: 4, Funny) by Anonymous Coward on Tuesday October 20 2015, @01:57AM

    by Anonymous Coward on Tuesday October 20 2015, @01:57AM (#252141)

    Physicists hurry to name elements as Soylent's server load increases.

    After experiencing phenomenal growth for the past several years, and now reaching a few hundred million unique visitors daily, SoylentNews' server farm has expanded from 9 of the 114 named elements to 113 of the 114. Physicists looked on with concern as SoylentNews suggested powering on a 114th server, logically named livermorium. "After the 114 named elements, there are just stupid placeholder names like unununium," explained a physicist who chose to be called Anonymous Coward. "And something stupid like 'unununium' just wouldn't do for a server name."

    • (Score: 4, Funny) by NCommander on Tuesday October 20 2015, @05:30AM

      by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Tuesday October 20 2015, @05:30AM (#252181) Homepage Journal

      When we run out of elements, we'll switch to fruit.

      --
      Still always moving
    • (Score: 2) by Dr Spin on Tuesday October 20 2015, @09:51AM

      by Dr Spin (5239) on Tuesday October 20 2015, @09:51AM (#252226)

      Clearly, using Unobtanium as a temporary replacement isn't working well either!

      --
      Warning: Opening your mouth may invalidate your brain!
  • (Score: 2) by zugedneb on Tuesday October 20 2015, @12:23PM

    by zugedneb (4556) on Tuesday October 20 2015, @12:23PM (#252260)

    I thought you ran the site from a Pentium 3 with an SSD (why not?)...

    Pending Wed 2015-10-21 0500 UTC sodium Primary Load Balancer
    Pending Wed 2015-10-21 0500 UTC magnesium Backup Load Balancer
    What loads are you guys balancing?

    P.S. With 9 servers there is room for many more trolls. I will do what I can.

    --
    old saying: "a troll is a window into the soul of humanity" + also: https://en.wikipedia.org/wiki/Operation_Ajax