We always have a place for talented people. Visit the Get Involved section on the wiki to see how you can make SoylentNews better.
As you probably have noticed, our site has been a bit sluggish lately.
We are aware of the issue and are developing plans for dealing with it. The primary issue lies in the database structure and contents. On-the-fly joins across multiple tables cause a performance hit which is exacerbated by the number of stories we have posted over the years (yes, it HAS been that long... YAY!). Further, stories which have been "archived" — allowing no further comments or moderation — are still sitting in the in-RAM DB and could be offloaded to disk for long-term access. Once offloaded, there would be much less data in the in-RAM database (queries against empty DBs tend to be pretty quick!) so this should result in improved responsiveness.
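To make the offloading idea concrete, here is a minimal sketch using Python's standard-library sqlite3 module. It is not the site's schema or code: the `stories` and `stories_archive` tables, their columns, and the `archive_closed_stories` helper are all hypothetical, and SQLite merely stands in for the replicated in-RAM database described above.

```python
import sqlite3


def archive_closed_stories(hot: sqlite3.Connection, cold: sqlite3.Connection) -> int:
    """Copy archived stories to the cold store, then delete them from the hot one.

    Hypothetical schema: stories(id, title, archived) in the hot database and
    stories_archive(id, title) in the cold one. Returns the number of rows moved.
    """
    rows = hot.execute(
        "SELECT id, title FROM stories WHERE archived = 1"
    ).fetchall()
    if not rows:
        return 0
    # Write to the cold store first so a failure never loses a story.
    with cold:
        cold.executemany(
            "INSERT OR REPLACE INTO stories_archive (id, title) VALUES (?, ?)",
            rows,
        )
    # Delete exactly the ids that were copied, not whatever is archived by now.
    with hot:
        hot.executemany(
            "DELETE FROM stories WHERE id = ?", [(row[0],) for row in rows]
        )
    return len(rows)
```

Copying before deleting means an interrupted run can at worst leave a story in both places, which a later run cleans up, rather than in neither.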
A complicating factor is that changing the structure on a live, replicated database would cause most every page load to 500 out. So the database has to be taken offline and the code updated. That would likely take the better part of a day. Obviously, shorter is better. On the other hand, "The longest distance between two points is a short cut." We're aiming to do it right the first time and be done with it, rather than doing it quick-and-dirty, which usually ends up being not quick and quite dirty.
So, we ARE aware of the performance issues, are working towards a solution, and don't want to cause any more disruption than absolutely necessary.
We will give notice well in advance of taking any actions.
Gift subscriptions from ACs (Anonymous Cowards) are working again. If you're curious what was broken, have a look.
If you attempted to make a gift subscription as an AC since early to mid May, and received an error, please try again at: https://soylentnews.org/subscribe.pl (Or click the link in the "Navigation" Slashbox).
As is standing SN policy, martyb is to blame for anything warranting blame. =) You can go about your business. Move along.
We strive for openness about site operations here at SoylentNews. This story continues in that tradition.
tl;dr: We believe all services are now functioning properly and all issues have been attended to.
Problem Symptoms: I learned at 1212 UTC on Sunday 2018-08-19 that some pages on the site were returning 50x error codes. Sometimes, choosing 'back' in the browser and trying to resubmit the page would work. Oftentimes, it did not. We also started receiving reports of problems with our RSS and Atom feeds.
Read on past the break if you are interested in the steps taken to isolate and correct the problems.
Problem Isolation: As many of you may be aware, TheMightyBuzzard is away on vacation. I logged onto our IRC (Internet Relay Chat) Sunday morning (at 1212 UTC) when I saw chromas had posted (at 0224 UTC) there had been reports of problems with the RSS and Atom feeds we publish. I also noticed that one of our bots, Bender, was double-posting notifications of stories appearing on the site.
While I was investigating Bender's loquaciousness, chromas popped in to IRC (at 1252 UTC) and informed me that he was getting 502 and 503 error codes when he tried to load index.rss using a variety of browsers. I tried and found no issues when using Pale Moon. We tried a variety of wget requests from different servers. To our surprise we received incomplete replies which then caused multiple retries even when trying to access it from one of our SoylentNews servers. So, we surmised, it was probably not a communications issue.
At 1340 UTC, SemperOss (Our newest sysadmin staff member... Hi!) joined IRC and reported that he, too, was getting retry errors. Unfortunately, his account setup has not been completed leaving him with access to only one server (boron). Fortunately for us, he has a solid background in sysops. We combined his knowledge and experience with my access privileges and commenced to isolate the problem.
(Aside: If you have ever tried to isolate and debug a problem remotely, you know how frustrating it can be. SemperOss had to relay commands to me through IRC. I would pose questions until I was certain of the correct command syntax and intention. Next, I would issue the command and report back the results; again in IRC. On several occasions, chromas piped up with critical observations and suggestions — plus some much-needed humorous commentary! It could have been an exercise in frustration with worn patience and frazzled nerves. In reality, there was only professionalism as we pursued various possibilities and examined outcomes.)
From the fact we were receiving 50x errors, SemperOss surmised we were probably having a problem with nginx. We looked at the logs on sodium (which runs Ubuntu), one of our two load balancers, but nothing seemed out of the ordinary. Well, let's try the other load balancer, on magnesium (running Gentoo). Different directory structure, it seems, but we tracked down the log files and discovered that access.log had grown to over 8GB... and thus depleted all free space on /dev/root, the main file system of the machine.
That's not a good thing, but at least we finally knew what the problem was!
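For anyone who wants to run the same kind of check, here is a minimal Python sketch of the two questions we ended up asking: how full is the file system, and which files are eating it? The helper names `percent_used` and `largest_files` are made up for illustration, and the percentage can differ slightly from what df reports because of reserved blocks.

```python
import os
import shutil


def percent_used(path: str) -> float:
    """Return how full the file system holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root: str, count: int = 5) -> list:
    """Return up to `count` (size_in_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.lstat(full).st_size, full))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(sizes, reverse=True)[:count]
```

Pointing `largest_files` at whatever directory holds the web server logs on a given box (the location differs between distributions, as we found out) would have put an 8GB access.log at the top of the list.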
Problem Resolution: So, we renamed the original access.log file and created a new one for nginx to write to. Next up came a search for a box with sufficient space that we could copy the file to. SemperOss reported more than enough space free on boron. We had a few hiccups with ACLs and rsync, so moved the file to /tmp and tried rsync again, which resulted in the same ACL error messages. Grrrr. SemperOss suggested I try to pull the file over to /tmp on boron using scp. THAT worked! A few minutes later and the copy was completed. Yay!
But, we still had the original, over-sized log file to deal with. No problemo. I ssh'd back over to magnesium and did an rm of the renamed access.log and... we were still at 100% usage. Doh! Because nginx still had the file open, we needed to bounce it so it would release its hold on the file's inode and the space could actually be reclaimed. Easy peasy; /etc/init.d/nginx restart and... voila! We were now back down to 67% in use.
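If the "rm did not free any space" surprise is new to you, this small Python sketch reproduces it safely with a temporary file. It assumes POSIX semantics (Linux and friends), and the `unlink_while_open` helper is made up for this illustration.

```python
import os
import tempfile


def unlink_while_open(size: int = 4096) -> tuple:
    """Delete a file that is still open and report what the kernel still holds.

    Returns (name_still_exists, link_count, size_in_bytes), measured after the
    unlink but before the descriptor is closed. POSIX behaviour is assumed.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)        # the equivalent of `rm`
        info = os.fstat(fd)    # the inode is still reachable through fd
        return os.path.exists(path), info.st_nlink, info.st_size
    finally:
        os.close(fd)           # only now can the blocks be reclaimed
```

The name disappears and the link count drops to zero, yet the data is still there for as long as some process holds the file open. In our case that process was nginx, which is why the restart was what actually freed the disk.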
Finally! Success! We're done, right?
Did you see what we missed? The backup copy of access.log was now sitting in /tmp on boron, which means the next system restart would wipe it. So, a simple mv from /tmp to my ~/tmp and now the file was in a safe place.
By 1630 UTC, we had performed some checks with loads of various RSS and Atom feeds and all seemed well. We were unable to reproduce the 50x errors, either.
And we're still not done.
Why/how did the log file get so large in the first place? There was no log rotation in place for it on magnesium. That log file had entries going back to 2017-06-20. At the moment, we have more than sufficient space to allow us to wait until TMB returns from vacation. (We checked free disk space on all of our servers.) The plan is we will look over all log files and ensure rotation is in place so as to avoid a recurrence of this issue.
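As a rough illustration of what "rotation" means here, the following Python sketch renames a log once it passes a size limit and keeps a fixed number of old copies. It is not what we run or plan to run; in practice a tool such as logrotate would normally do this job. The function names, the `keep` count, and the `writer_pid` parameter are all assumptions made for the example.

```python
import os
import signal
from typing import Optional


def rotate(path: str, keep: int = 3) -> None:
    """Shift path -> path.1 -> path.2 ..., dropping anything beyond `keep`."""
    oldest = f"{path}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)
    for index in range(keep - 1, 0, -1):
        src = f"{path}.{index}"
        if os.path.exists(src):
            os.rename(src, f"{path}.{index + 1}")
    if os.path.exists(path):
        os.rename(path, f"{path}.1")
    # Leave an empty file in place for the writer to reopen.
    open(path, "a").close()


def rotate_if_large(path: str, max_bytes: int, keep: int = 3,
                    writer_pid: Optional[int] = None) -> bool:
    """Rotate `path` once it reaches `max_bytes`; return True if it was rotated."""
    if not os.path.exists(path) or os.path.getsize(path) < max_bytes:
        return False
    rotate(path, keep)
    if writer_pid is not None:
        # nginx reopens its log files when its master process receives USR1.
        os.kill(writer_pid, signal.SIGUSR1)
    return True
```

Without the signal (or a restart), the writer keeps appending to the renamed file through its open descriptor, which is the same inode behaviour that bit us during the cleanup.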
Problem Summary: We had a problem with an oversized logfile taking up all free space on one of our servers but believe we have fixed it and that all services are now functioning properly and all issues have been attended to.
Conclusion: Please join me in thanking chromas and SemperOss for all the time they gave up on a Sunday to isolate the problem and come up with a solution. Special mention to Fnord666 who, we later learned, silently lurked but was willing to jump in had he sensed we needed any help. Thank you for having our backs! Further, please join me in publicly welcoming SemperOss to the team and wishing him well in his efforts here!
Lastly, this is an all-volunteer, non-commercial site — nobody is paid anything for their efforts in support of the site. We are, therefore, entirely dependent on the community for financial support. Please take a moment and consider subscribing to SoylentNews, either with a new subscription, by extending an existing subscription, or making a gift subscription to someone else on the site. Any amount entered in the payment amount field, above and beyond the minimum amount is especially appreciated!
First the good news. I just received word that janrinok, our Editor-in-Chief, is finally out of the hospital and back in his own home! He is very tired and has severe restrictions on his activities but is otherwise in excellent spirits. He very much appreciated the kind thoughts and wishes expressed by the community in our prior stories. It will still be many weeks or months before he can resume his prior level of activities on SoylentNews, but he hopes to pop in once in a while to "second" stories that are in the story queue. Please join me in welcoming him back home!
Next, the good news. In janrinok's absence, the other editors have stepped up to the challenge. I'd like to call out chromas, fnord666, mrpg, and takyon who have all freely given from their spare time to make sure we have a steady stream of stories appearing here. I even saw CoolHand pop in on occasion to second some stories! teamwork++
Then, I have to bring up the good news that our development and systems staff have kept this whole thing running so smoothly. Besides the site, there is e-mail, the wiki, our IRC server, and a goodly number of other processes and procedures that make this all happen. That they are largely invisible attests to how well they have things set up and running!
Lastly, the good news. This is what's known in the press as the "silly season". Summer in the Northern Hemisphere means most educational institutions are on break, so less research is done and reported. Other ventures are closed or running on reduced staffing levels. In short, the amount of news to draw from is greatly diminished. Yet, even in that environment, the vast majority of the time finds us with a selection of stories in the submissions queue to draw from.
We recently hit a low spot where I combed the web for a couple quick stories I could submit, but that has been the exception rather than the rule. Generally, we look for stories that have some kind of tech-related angle to them. The community has spoken loud and clear that there are plenty of other sites to read about celebrities, politics, and religion. We make a slight nod to politics insofar as it affects technical areas or has large-scale ramifications (e.g. a story about President Trump having a meeting with Russian president Vladimir Putin would fit that description). Even then we generally try to keep it down to one story per day.
That said, if you see a story on the 'net that catches your fancy, please send it in! Feel free to draw upon titles listed on our Storybot page, then pop onto IRC (Internet Relay Chat) and simply issue the command ~arthur $code where $code is taken from the second column on the Storybot page.
Whether you contribute by submitting a story, buying a subscription, writing in your journal, moderating, or making a comment, we continue to provide a place where people can discuss, share knowledge and perspectives, and maybe learn a thing or two, too!