
SoylentNews is people

Meta
posted by martyb on Sunday February 14 2021, @04:47AM
from the Constants-aren't-and-variables-won't dept.

[2021-02-14 15:53:00 UTC: UPDATE added need to check apache log before doing a slash -restart]

We seem to have experienced some difficulties with the SoylentNews site.

I've noticed that both the number of hits and comments for each story do not seem to be updating.

Corrective measures taken:

  1. "Bounce" the Servers: I doubted it would help, but it does no harm to try, so why not? And, as expected, it did not help.
    This is my personal "bounce" script:
    cat ~/bin/bounce

    #!/bin/bash
    servers='hydrogen fluorine'
    for server in ${servers}; do
        echo "Accessing: ${server}" && rsh "${server}" /home/bob/bin/bounce
    done

    Which, in turn, runs the following script on each of the above servers:

    cat /home/bob/bin/bounce

    #!/bin/bash
    sudo /etc/init.d/varnish restart
    sudo -u slash /srv/soylentnews.org/apache/bin/apachectl -k restart

  2. Restart slash: For those who are unaware, slash has its own internal implementation of what is, effectively, cron. It periodically fires off tasks that support the site's operations. Restarting it can have side effects, though, so first check the apache error_log.

    # Go to the appropriate server:
    ssh fluorine
    # Ensure the apache log is not showing issues:
    tail -f /srv/soylentnews.org/apache/logs/error_log
    # Restart slash:
    sudo /etc/init.d/slash restart
    >> slashd slash has no PID file
    >> Sleeping 10 seconds in a probably futile attempt to be clean: ok.
    >> Starting slashd slash: ok PID = 3274

    NB: this failed to run to a successful conclusion when I originally tried it a few hours ago. I gave it one more try while writing this story... it seemed to run okay this time?!
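
A minimal sketch of the "check the log first" step, in case it is useful to script it. This is an illustration rather than anything actually run on the SN servers: the function name `log_looks_clean`, the 200-line window, and the pattern list are all assumptions.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the tail of a log shows no obvious errors.
# The default line count and the grep pattern are illustrative assumptions.
log_looks_clean() {
    local logfile="$1"
    local lines="${2:-200}"
    # An unreadable log counts as "not clean" so that we never restart blind.
    [ -r "${logfile}" ] || return 2
    # grep -q exits 0 on a match; a match here means the log is NOT clean.
    if tail -n "${lines}" "${logfile}" | grep -Eqi '(error|crit|alert|emerg)\]|segfault'; then
        return 1
    fi
    return 0
}

# Example use (commented out so that sourcing this file has no side effects):
# log_looks_clean /srv/soylentnews.org/apache/logs/error_log && sudo /etc/init.d/slash restart
```

A scripted check like this only catches what the pattern anticipates, so it would complement, not replace, a look at the log by eye.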

Things appear to be running okay now. Please reply in the comments if anything else is amiss. Alternatively, mention it in the #dev channel on IRC (Internet Relay Chat), or send an email to admin (at) soylentnews (dot) org.

We now return you to the ongoing discussion of: teco or ed?


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Funny) by DannyB on Sunday February 14 2021, @04:58AM (8 children)

    by DannyB (5839) Subscriber Badge on Sunday February 14 2021, @04:58AM (#1112665) Journal

    Can moisture get in the servers?

    Aircraft, for example, are designed to keep moisture out. Except aircraft in Spain. Because the rain in Spain stays mainly in the plane. This has been going on since the days when conductors had to walk the length of the aircraft and ask everyone for their tickets. They would ask a woman: "Hey, where's my fare, lady?"

    Moisture in the servers might not only be water, but can be snot from a nose, or ethanol. (aka, alky haul) Someone could have hidden some in there back during prohibition.

    Microsoft tried putting a small data center under water a few years back. But it didn't make their software work any better.

    --
    Employers should not mandate wearing clothing. It should be a personal choice. It only affects me. Junk can't breathe!
    • (Score: 2) by sjames on Sunday February 14 2021, @05:12AM (1 child)

      by sjames (2882) on Sunday February 14 2021, @05:12AM (#1112671) Journal

      The Elks on the other hand live up in the hills and in the spring they come down for their annual convention. It is very interesting to watch them come to the water hole. And you should see them run when they find that it's only a water hole. What they're looking for is an Elk-a-hole.

      • (Score: 3, Funny) by DannyB on Sunday February 14 2021, @01:53PM

        by DannyB (5839) Subscriber Badge on Sunday February 14 2021, @01:53PM (#1112758) Journal

        People are complaining that fuel prices are back up to what they previously were.

        They are confused about the difference between a backup and a restore.

        Fuel prices have been restored to what they previously were.

        --
        Employers should not mandate wearing clothing. It should be a personal choice. It only affects me. Junk can't breathe!
    • (Score: 3, Insightful) by c0lo on Sunday February 14 2021, @06:57AM (5 children)

      by c0lo (156) Subscriber Badge on Sunday February 14 2021, @06:57AM (#1112698) Journal

      This has been going on since the days when conductors had to walk the length of the aircraft and ask everyone for their tickets.

      Way better than a naked conductor running above the train.

      --
      https://www.youtube.com/watch?v=aoFiw2jMy-0
      • (Score: 2) by DannyB on Sunday February 14 2021, @01:44PM (4 children)

        by DannyB (5839) Subscriber Badge on Sunday February 14 2021, @01:44PM (#1112755) Journal

        I would have modded that shocking.

        Discover the shocking secret behind why electrical cables have insulation.

        --
        Employers should not mandate wearing clothing. It should be a personal choice. It only affects me. Junk can't breathe!
        • (Score: 2) by c0lo on Sunday February 14 2021, @02:25PM (3 children)

          by c0lo (156) Subscriber Badge on Sunday February 14 2021, @02:25PM (#1112767) Journal

          Even if just medium voltage, I can't quite call 25kV moddely shocking, no.

          --
          https://www.youtube.com/watch?v=aoFiw2jMy-0
          • (Score: 2, Interesting) by fustakrakich on Sunday February 14 2021, @02:50PM (2 children)

            by fustakrakich (6150) on Sunday February 14 2021, @02:50PM (#1112776) Journal

            I can generate 25kV by walking on a shaggy carpet.

            --
            Ok, we paid the ransom. Do I get my dog back? REDЯUM
            • (Score: 2) by c0lo on Sunday February 14 2021, @10:25PM (1 child)

              by c0lo (156) Subscriber Badge on Sunday February 14 2021, @10:25PM (#1112929) Journal

              It still doesn't make the voltage moderately shocking, even if the charge amount is too small to be deadly.

              --
              https://www.youtube.com/watch?v=aoFiw2jMy-0
              • (Score: 2) by DannyB on Monday February 15 2021, @04:26PM

                by DannyB (5839) Subscriber Badge on Monday February 15 2021, @04:26PM (#1113189) Journal

                Cables running above trains should be required to have insulation. Some federal department will require this. For safety. Think of the children!

                --
                Employers should not mandate wearing clothing. It should be a personal choice. It only affects me. Junk can't breathe!
  • (Score: 2) by RS3 on Sunday February 14 2021, @05:03AM (9 children)

    by RS3 (6367) on Sunday February 14 2021, @05:03AM (#1112667)

    Sorry, I don't know the SN slash code or SN admin, but does one of those things restart the database? I'd try restarting mysql (or whatever it is). And maybe do a db integrity check... IIRC there are 2 in a cluster or rsync or something? Not something I keep track of. Heck, filesystem check never hurts, but I don't know how you'd accomplish that.

    • (Score: 5, Informative) by The Mighty Buzzard on Sunday February 14 2021, @05:11AM (8 children)

      The db is fine. Slashd has always been slightly persnickety though, so it needs restarted a couple times a year. When the comment counts aren't updating, it's always slashd. This was just martyb's first time trying to restart it and he wasn't aware that the init script had to be run as root and that the user slash (the user that all the SN-specific stuff on the web frontends runs as) doesn't have sudo perms of any sort.

      --
      My rights don't end where your fear begins.
      • (Score: 0) by Anonymous Coward on Sunday February 14 2021, @09:29AM (4 children)

        by Anonymous Coward on Sunday February 14 2021, @09:29AM (#1112719)

        If slash seems to require restarting a few times a year, does there seem to be any pattern of how many days before that seems to happen? Or is it completely random? Because if it is the former, maybe some preventative scheduled reboot would be in order. But you've probably already thought of that.

        • (Score: 0) by Anonymous Coward on Sunday February 14 2021, @11:04AM (3 children)

          by Anonymous Coward on Sunday February 14 2021, @11:04AM (#1112728)

          I've suggested various watchdogs in the past but the hiccups seem to be too random and far between to be worth the trouble.

          • (Score: 3, Interesting) by The Mighty Buzzard on Sunday February 14 2021, @11:32AM (2 children)

            Bingo. The "a couple times a year" is on average. I don't remember having to restart it in 2019 at all aside from it being restarted on server reboots.

            --
            My rights don't end where your fear begins.
            • (Score: 2) by RS3 on Sunday February 14 2021, @12:06PM (1 child)

              by RS3 (6367) on Sunday February 14 2021, @12:06PM (#1112741)

              cron.monthly job? Maybe with a file that contains a countdown variable so the restart only happens every so many months?

              Hate those kinds of workarounds though. If I had time I'd look into the code...
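
The countdown-file idea could look roughly like this. It is only a sketch: the function name `countdown_tick`, the state file path, and the three-month interval are assumptions, not anything taken from the SN setup.

```shell
#!/bin/bash
# Hypothetical countdown helper for a cron.monthly job.
# Prints "restart" when the countdown reaches zero, otherwise "wait".
countdown_tick() {
    local state_file="$1"
    local interval="${2:-3}"
    local remaining

    # Start a fresh countdown if the file is missing or does not hold a number.
    if [[ -r "${state_file}" ]] && [[ "$(<"${state_file}")" =~ ^[0-9]+$ ]]; then
        remaining="$(<"${state_file}")"
    else
        remaining="${interval}"
    fi

    # 10# forces base 10 so that a value such as "08" is not read as octal.
    remaining=$(( 10#${remaining} - 1 ))
    if (( remaining <= 0 )); then
        echo "${interval}" > "${state_file}"   # reset for the next cycle
        echo restart
    else
        echo "${remaining}" > "${state_file}"
        echo wait
    fi
}

# In a cron.monthly script this might be used as (commented out here):
# [ "$(countdown_tick /var/lib/slashd-countdown 3)" = restart ] && /etc/init.d/slash restart
```

With an interval of 3, the first two monthly runs print "wait" and the third prints "restart", after which the cycle starts over.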

      • (Score: 3, Insightful) by martyb on Sunday February 14 2021, @02:26PM (2 children)

        by martyb (76) Subscriber Badge on Sunday February 14 2021, @02:26PM (#1112769) Journal

        When the comment counts aren't updating, it's always slashd.

        /me makes a mental note of this for future reference.

        This was just martyb's first time trying to restart it [...]

        I think I may have done this once before, but it's certainly not something I am entirely comfortable with.

        [...] and he wasn't aware that the init script had to be run as root and that the user slash (the user that all the SN-specific stuff on the web frontends runs as) doesn't have sudo perms of any sort.

        Was not aware that user slash "doesn't have sudo perms of any sort". And... now I know; thanks!

        --
        Wit is intellect, dancing.
        • (Score: 3, Insightful) by The Mighty Buzzard on Sunday February 14 2021, @03:02PM (1 child)

          Nod nod, we don't give slash sudo perms so we don't have to worry as much about it being an attack vector that could compromise the entire server. Not that it'd be terribly easy anyway being as the web frontends are behind an nginx reverse proxy that we're using as a load balancer. But any bit of extra security that doesn't slow stuff down or take too much effort is worth doing.

          --
          My rights don't end where your fear begins.
          • (Score: 0) by Anonymous Coward on Sunday February 14 2021, @09:49PM

            by Anonymous Coward on Sunday February 14 2021, @09:49PM (#1112920)

            Just one reverse proxy? Why not a double reverse?

  • (Score: 2) by DavePolaschek on Sunday February 14 2021, @12:23PM (5 children)

    by DavePolaschek (6129) Subscriber Badge on Sunday February 14 2021, @12:23PM (#1112744) Homepage Journal

    Evening MST yesterday, about half of the time I would try to load a page it would hand me an unstyled page, as if the CSS was failing to load (though I was on my iPad and didn’t really have a good way to debug). Don’t know if that helps at all or not, but it was a symptom...

    • (Score: 3, Informative) by drussell on Sunday February 14 2021, @01:05PM (1 child)

      by drussell (2678) on Sunday February 14 2021, @01:05PM (#1112751) Journal

      Ah, so it was still doing that into the evening? I guess nobody had fixed it still by then. I first saw it acting up at 4:something PST.

      You posted while I was posting that post below this post. :)

      • (Score: 2) by DavePolaschek on Monday February 15 2021, @12:50PM

        by DavePolaschek (6129) Subscriber Badge on Monday February 15 2021, @12:50PM (#1113121) Homepage Journal

        Well, that was 5:something MST, which I’d call evening. But then back in the days when dining out was something people did, we frequently had dinner before the blue-hair crowd, so my clock may be skewed.

    • (Score: 3, Informative) by martyb on Sunday February 14 2021, @02:37PM (2 children)

      by martyb (76) Subscriber Badge on Sunday February 14 2021, @02:37PM (#1112771) Journal

      Yes, I'd seen a couple reports of "CSS failing to load" on IRC. Whenever I tried to reproduce it, all my attempts loaded successfully with no issues. I inquired of others there, and someone else confirmed things were loading okay for them, too. I'd seen that happen a few times before, so figured whatever went sideways had somehow righted itself and gotten back in line.

      That said, thanks for mentioning it here. I'm starting to see a pattern. Every CSS "burp" does not necessarily lead to non-updating counts, *but* it does seem that every incident of non-updating counts was preceded by CSS issues. Can't prove a negative, of course, but I'll add this idea to my bag-o-tricks. Thanks again for the report!

      --
      Wit is intellect, dancing.
      • (Score: 3, Informative) by The Mighty Buzzard on Sunday February 14 2021, @03:06PM (1 child)

        I wouldn't necessarily connect those dots too quickly. For some reason dev has the slashd issues without ever having the CSS issues. It's not a mystery we couldn't look into and fix, it's just not a dire emergency if it's a thirty second fix less often than once a month.

        --
        My rights don't end where your fear begins.
        • (Score: 2) by martyb on Sunday February 14 2021, @04:13PM

          by martyb (76) Subscriber Badge on Sunday February 14 2021, @04:13PM (#1112810) Journal
          Good to know, tx!
          --
          Wit is intellect, dancing.
  • (Score: 2) by drussell on Sunday February 14 2021, @12:27PM (4 children)

    by drussell (2678) on Sunday February 14 2021, @12:27PM (#1112746) Journal

    Was your need to bounce the system later still related to the issues/symptoms I was reporting like lack of CSS rendering off and on that I first mentioned at something like 4:30am yesterday?

    [12:56:17] {drussell} No CSS from here.... Everything is just rendering in plain text
    [13:18:07] {c0lo} css evil, plain text good.
    [13:20:32] {FatPhil} if you can't read it using curl | less, it's not worth reading

    • (Score: 2) by The Mighty Buzzard on Sunday February 14 2021, @02:17PM (1 child)

      Yeah, the missing CSS issue is cleared up if you restart varnish and apache on the web frontends.

      --
      My rights don't end where your fear begins.
      • (Score: 2) by martyb on Sunday February 14 2021, @02:49PM

        by martyb (76) Subscriber Badge on Sunday February 14 2021, @02:49PM (#1112775) Journal

        For those following along at home, I believe that's the "bounce hydrogen and fluorine" which was mentioned above.

        --
        Wit is intellect, dancing.
    • (Score: 2) by martyb on Sunday February 14 2021, @02:45PM

      by martyb (76) Subscriber Badge on Sunday February 14 2021, @02:45PM (#1112773) Journal

      Yes, it is starting to look that way. See my earlier reply [soylentnews.org].

      And, thanks so much for the earlier report on IRC! I encourage anyone who sees something strange about the site's behavior to check on IRC. There's often someone idling there who can help try to corroborate an issue. And, if necessary, try to rouse someone to help out.

      teamwork++

      --
      Wit is intellect, dancing.
    • (Score: 2) by krishnoid on Sunday February 14 2021, @08:03PM

      by krishnoid (1156) on Sunday February 14 2021, @08:03PM (#1112886)

      Noting somewhere that the servers were bounced *and* that it didn't help is useful, with all the complaints about science not valuing negative results [plos.org]. It's also good practice to actually log it because per Adam Savage (and his ballistics expert, Alex Jason): "The difference between screwing around and science is writing it down."

  • (Score: 1, Funny) by Anonymous Coward on Sunday February 14 2021, @04:59PM (1 child)

    by Anonymous Coward on Sunday February 14 2021, @04:59PM (#1112826)

    Rajneesh in tech support says that works for his customers most of the time.

    • (Score: 2) by Fnord666 on Sunday February 14 2021, @05:38PM

      by Fnord666 (652) Subscriber Badge on Sunday February 14 2021, @05:38PM (#1112839) Homepage

      Rajneesh in tech support says that works for his customers most of the time.

      We need a "laughing to keep from crying" moderation.

  • (Score: 0) by Anonymous Coward on Sunday February 14 2021, @05:39PM (2 children)

    by Anonymous Coward on Sunday February 14 2021, @05:39PM (#1112841)

    Just spotted a little typo. Instead of

    sudo /etc/init.d/slash restart

    you have to do

    sudo systemctl restart shlash

    You'r' welcome

    • (Score: 2) by Fnord666 on Sunday February 14 2021, @07:27PM (1 child)

      by Fnord666 (652) Subscriber Badge on Sunday February 14 2021, @07:27PM (#1112877) Homepage

      Just spotted a little typo. Instead of

      sudo /etc/init.d/slash restart

      you have to do

      sudo systemctl restart shlash

      You'r' welcome

      systemd is evil and is the daemon that shall not be named.

      • (Score: 3, Funny) by inertnet on Sunday February 14 2021, @10:41PM

        by inertnet (4071) on Sunday February 14 2021, @10:41PM (#1112934)

        Plus AC tried to start 'shlash', whatever that is. Maybe a Sean Connery version of slash?

  • (Score: 0) by Anonymous Coward on Monday February 15 2021, @12:16AM (1 child)

    by Anonymous Coward on Monday February 15 2021, @12:16AM (#1112957)

    Soylentnews is just trying to create its own problems so that it can create its own stories and headlines to run with and break its own news stories. Now Soylentnews gets to do its own investigative journalism on itself and claim that it is the one breaking the news ;)

    (j/k obviously).

    • (Score: 1) by Eratosthenes on Monday February 15 2021, @01:49AM

      by Eratosthenes (13959) on Monday February 15 2021, @01:49AM (#1112989) Journal

      And mankind marveled at our own magnificence as we gave birth to AI. Some said, let's name it Colossus! Some suggested "Skynet". But mankind settled on "SoylentNews".

      Later, this was generally seen as a catastrophic mistake.

      --
      Ἀριθμητικὴ εἰσαγωγή
  • (Score: 0) by Anonymous Coward on Monday February 15 2021, @05:18PM

    by Anonymous Coward on Monday February 15 2021, @05:18PM (#1113205)

    I don't mean to sound grumpy, but how fucking difficult is it to document your interdependencies and write a runbook?

    Figure it out! What depends upon what? Draw some diagrams. You built this thing and you can't draw a simple hierarchical diagram of its operation?

    Basically, you are looking for dependencies.

    I prefer to illustrate the dependencies as a 'stack', in case I am being managed by someone who understands wedding cakes better than they do software and hardware interdependencies.

    At the bottom of the stack is the grounding. No reliable ground, no reliable power. Several times I've been involved in diagnosing computer problems in buildings built along the bayshore. Those salty marshes and tides play hell with electrical grounding.

    Next is the power. You can't boot without power. It can't be spiky power and the power needs to be delivered as sine waves, not triangles or squares.

    Next is the hardware. When is the last time you ran memtest86 on your servers? When is the last time you did a dd(1) of each hard drive in its entirety to assure yourself there were no bad blocks in use? Do you have diagnostic CDROMs for your servers? Do you ever use them? Memtest86 is the FIRST thing I do on ALL of my computers.

    Make sure the computer isn't clogged with dust! Make sure the fans are running! Make sure the hard drives aren't making horrible noises, too.

    Next, system resources. Make sure you aren't running out of disk space! Cleanup can and should be automated.
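
A minimal sketch of an automated disk-space check, assuming a POSIX `df`. The function names, the `/srv` path, and the 90% threshold are made up for illustration.

```shell
#!/bin/bash
# Extract the "Capacity" column (5th field) from `df -P` output on stdin.
# df -P prints a header plus one line per filesystem.
parse_df_capacity() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print the usage of the filesystem holding the given path, as a bare integer.
disk_usage_percent() {
    df -P "$1" | parse_df_capacity
}

# Succeed when usage (an integer percentage) is at or above the threshold.
usage_exceeds() {
    [ "$1" -ge "$2" ]
}

# Example (commented out): warn when /srv is 90% full or more.
# usage_exceeds "$(disk_usage_percent /srv)" 90 && echo "WARNING: /srv is nearly full"
```

Splitting the parsing from the `df` call keeps the parsing testable against canned output.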

    Next, the database. No point in starting a web server when your database is tits up. The database comes before the web server.

    Next, the business logic (AKA 'middleware'). Are you using Java? Is the JVM running? Make sure the business logic is in communication with the database. Refer to your diagram. Identify test points and create tests for those test points. Automate it. You should be able to run a shell script and see that your business logic is working, that all of the required processes are in the process table and acting normally. Ideally it should be written as an /etc/init.d or /etc/rc.d script. Refer to other such scripts for tips on how to achieve quality start/stop scripts.
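
One way to sketch the "are the required processes in the process table" test is via PID files. This assumes each service writes one, which the original post suggests is not always true for slashd. The function names and the name=pidfile argument convention are hypothetical.

```shell
#!/bin/bash
# Hypothetical check: is the process named in a PID file still alive?
# Returns 0 if alive, 1 if the file is missing or garbled or the process is gone.
pidfile_alive() {
    local pidfile="$1" pid
    [ -r "${pidfile}" ] || return 1
    # read fails on a file without a trailing newline but still sets pid.
    read -r pid < "${pidfile}" || true
    [[ "${pid}" =~ ^[0-9]+$ ]] || return 1
    # kill -0 sends no signal; it only asks whether we could signal the process.
    # Caveat: it also fails for a live process owned by another user unless run as root.
    kill -0 "${pid}" 2>/dev/null
}

# Report every service whose PID file does not point at a live process.
# Arguments are name=pidfile pairs; returns 1 if anything is down.
report_dead() {
    local pair status=0
    for pair in "$@"; do
        if ! pidfile_alive "${pair#*=}"; then
            echo "DOWN: ${pair%%=*}"
            status=1
        fi
    done
    return "${status}"
}
```

Run from cron, the non-zero exit status of `report_dead` could trigger a mail or a restart, depending on how much automation is wanted.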

    (Odds are good that the business logic is where the startups get complicated; that would indicate that a better understanding of your business logic's interdependencies is called for. It may also be appropriate to invest in a Nagios server, to monitor interdependencies in a graphical fashion. And some cron jobs, to make sure certain pieces are running and to restart them if they are not.)

    Finally, the web server. If you're convinced you have correctly started your database and your business logic is working correctly and you have content to serve, then you can start your web server.
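
The ordering described above (database, then business logic, then web server) can be sketched as a loop that stops at the first failure. The function `start_in_order` and the step names in the usage comment are hypothetical. Each argument is assumed to be the name of a command or shell function that takes no arguments.

```shell
#!/bin/bash
# Hypothetical: run start steps in dependency order, stopping at the first failure
# so that nothing is started on top of a dependency that did not come up.
start_in_order() {
    local step
    for step in "$@"; do
        echo "Starting: ${step}"
        if ! "${step}"; then
            echo "FAILED: ${step}; not starting anything that depends on it" >&2
            return 1
        fi
    done
}

# Example (commented out; the three step functions would be defined by the admin):
# start_in_order start_database start_business_logic start_web_server
```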

    There are other processes I have not addressed such as DNS and user authentication. If you are using an LDAP database to manage users, for instance, and that creates another dependency (i.e., you can't start processes until you can log in and you can't log in until the LDAP database is restarted), then you need to include those in your diagrams and startup sequences and runbooks.

    Programmers adore complexity and messing with new versions but sysadmins adore consistency and reliability. When programmers are in charge of things, they tend to get horribly complicated, and when things go tits up, programmers tend to stand around and say "it SHOULD do this", relying upon some written document somewhere, whereas the sysadmin will observe, "it is NOT doing what it says it will do", and will happily rip it out and replace it with a small shell script, which is more reliable.

    I haven't followed Soylent News' architectural design that closely but I hope there is a staging environment, and maybe a bug-tracking infrastructure.

    My $0.02

    The goal of your documentation should be to make it simple for you to restart the system after a night of heavy drinking OR to walk a clever ten-year-old child through doing the same thing.
