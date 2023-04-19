from the and-then-there-were-some dept.
We had a minor site hiccup today. All seems to be working now.
We have always been open and upfront about the site, so, in the interests of full disclosure, here is a summary of the problem and the steps taken to fix it.
tl;dr: Comment counts shown for each story on the main page stopped updating at about midnight this morning; they appear to be working now. Our apologies to anyone who was inconvenienced.
Read on past the fold for details.
Problem: Comment counts on the main page showed "0" comments on recent stories, but opening a story showed the correct number of comments for it.
Actions Taken:
1.) Try bouncing the front-end servers to restart Apache. (This is a low-risk step that seems to fix a surprising number of issues.)
No joy.
2.) Ask for help on the #dev channel on IRC.
NCommander replied, asking whether slashd (an overseeing daemon for the site) was running.
Looked through my log files and on the site wiki; determined that slashd should be running on server: fluorine
ps -AF | grep slashd | wc showed 32 processes
NCommander suggested: killall -9 slashd
Try: killall -9 slashd
"No process found."
Inspection of the output of ps -AF suggested this one-liner should do it:
$(ps -AF | grep slashd | awk '{print "kill -9 " $2}' )
Got most of the processes, but there still seemed to be some stragglers.
/etc/init.d/./slash stop
/etc/init.d/./slash restart
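For anyone retracing these steps, here is a minimal sketch of a tidier way to collect the PIDs. Two things are worth keeping in mind: killall matches on the process name, whereas ps -AF shows the full command line, which is one plausible reason for the "No process found." reply; and a ps | grep pipeline also matches the grep process itself, so the count it gives includes that extra line. The function name pids_from_ps is hypothetical and not part of the site's tooling. It prints only the PID column, so the list can be handed to a single kill.

```shell
#!/bin/sh
# Sketch only: pids_from_ps is a hypothetical helper, not site tooling.
# Reads `ps -AF`-style text on stdin (PID is the second column) and prints
# the PIDs of the lines that mention the given name.
pids_from_ps() {
    name="$1"
    # NR > 1 skips the header row. Lines that contain "grep " or "awk " are
    # skipped as well, because those helper processes carry the name on
    # their own command lines. (Crude: a real target whose command line
    # contains those words would be skipped too.)
    awk -v name="$name" \
        'NR > 1 && index($0, name) && $0 !~ /(grep|awk) / { print $2 }'
}

# Usage, left commented out so that running this file signals nothing.
# Assumes GNU xargs (-r: do nothing on empty input) and enough privileges:
#   ps -AF | pids_from_ps slashd | xargs -r kill -TERM
#
# Where procps is installed, pgrep and pkill match the full command line
# with -f and never report themselves:
#   pgrep -f slashd
#   pkill -TERM -f slashd
```

Sending TERM first gives the daemon a chance to exit cleanly; kill -9 can then be reserved for any stragglers.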
Conclusion:
Looked like it might have worked... reloaded main page... see updated comment counts!
Looks like all is working again.
It's a credit to the staff here that the site has been running so smoothly and without crashing or hiccups for... I can't remember when we last had an outage. Given that in the early days of the site we had maybe a few hours of uptime between crashes, we have come a long way!
I'm going to assume this is one of those "have you tried turning it off and back on again" kinds of problems and, unless the problem recurs, assume it is solved.
Need to hurry to get to work, so I apologize for the brevity of this posting.
--martyb
(Score: 2, Funny) by Anonymous Coward on Tuesday April 23, @03:28PM (1 child)
Can I get a refund for my 1 post that wasn't counted?
(Score: 3, Funny) by c0lo on Tuesday April 23, @03:37PM
Nope; but here, have a voucher for another post that will be counted.
(Score: 0) by Anonymous Coward on Tuesday April 23, @03:45PM
In general, these kinds of problems always bug me. Unless a random problem is the result of someone just dorking around on the server, it usually means there is something wrong somewhere even if it is an edge case. There will always be small unfixed bugs in whatever system, but with me it always seems to be the most obscure, random, impossible to intentionally recreate, bugs that cause the most headache.
Have you being rebooted your compooter? :P
The nice thing about an occasional full reboot is that it usually rules out any environment problem in RAM caused by random messing about.
