Stories
Slash Boxes
Comments

SoylentNews is people

Meta
posted by martyb on Wednesday October 10 2018, @03:00PM   Printer-friendly

As you probably have noticed, our site has been a bit sluggish lately.

We are aware of the issue and are developing plans for dealing with it. The primary issue lies in the database structure and contents. On-the-fly joins across multiple tables cause a performance hit which is exacerbated by the number of stories we have posted over the years (yes, it HAS been that long... YAY!). Further, stories which have been "archived" — allowing no further comments or moderation — are still sitting in the in-RAM DB and could be offloaded to disk for long-term access. Once offloaded, there would be much less data in the in-RAM database (queries against empty DBs tend to be pretty quick!) so this should result in improved responsiveness.

A complicating factor is that changing the structure on a live, replicated database would cause most every page load to 500 out. So the database has to be offlined and the code updated. That would likely entail on the order of the better part of a day. Obviously, shorter is better. On the other hand "The longest distance between two points is a short cut." We're aiming to do it right, the first time, and be done with it, rather than doing it quick-and-dirty, which usually ends up being not quick and quite dirty.

So, we ARE aware of the performance issues, are working towards a solution, and don't want to cause any more disruption than absolutely necessary.

We will give notice well in advance of taking any actions.


Original Submission

This discussion has been archived. No new comments can be posted.
Display Options Threshold/Breakthrough Mark All as Read Mark All as Unread
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
(1)
  • (Score: 1, Insightful) by Anonymous Coward on Wednesday October 10 2018, @03:55PM (14 children)

    by Anonymous Coward on Wednesday October 10 2018, @03:55PM (#746992)

    On-the-fly joins across multiple tables cause a performance hit which is exacerbated by the number of stories we have posted over the years (yes, it HAS been that long... YAY!). Further, stories which have been "archived" — allowing no further comments or moderation — are still sitting in the in-RAM DB and could be offloaded to disk for long-term access.

    If you are having performance problems from joins between tables in an in RAM database (assuming you do mean that all of the joined tables are in RAM) then that says your DB table structure design is very very wrong.

    So, yes, this would say you should do it right, take some downtime, and make a wholesale change to a proper DB table layout appropriate for the types of queries that get run for the site.

    • (Score: 3, Informative) by The Mighty Buzzard on Wednesday October 10 2018, @04:14PM (13 children)

      by The Mighty Buzzard (18) Subscriber Badge <themightybuzzard@proton.me> on Wednesday October 10 2018, @04:14PM (#747000) Homepage Journal

      You are not entirely incorrect but I'm just not feeling the desire to completely revamp our DB structure and all the code that accesses it at the moment. The old site got around this by dumping archived stories out to mostly pre-rendered .shtml files on a daily basis. We're disinclined to take that route at the moment.

      --
      My rights don't end where your fear begins.
      • (Score: 3, Insightful) by RS3 on Wednesday October 10 2018, @07:27PM (6 children)

        by RS3 (6367) on Wednesday October 10 2018, @07:27PM (#747093)

        Indexes?

        • (Score: 2) by The Mighty Buzzard on Wednesday October 10 2018, @10:27PM (5 children)

          by The Mighty Buzzard (18) Subscriber Badge <themightybuzzard@proton.me> on Wednesday October 10 2018, @10:27PM (#747174) Homepage Journal

          Got plenty but do need a couple new ones added, yes.

          --
          My rights don't end where your fear begins.
          • (Score: 2) by RS3 on Thursday October 11 2018, @12:24AM (4 children)

            by RS3 (6367) on Thursday October 11 2018, @12:24AM (#747210)

            I don't remember knowing what database you're using, but if it's MySQL, have you tried phpMyAdmin?

            • (Score: 2) by The Mighty Buzzard on Thursday October 11 2018, @12:33AM (3 children)

              by The Mighty Buzzard (18) Subscriber Badge <themightybuzzard@proton.me> on Thursday October 11 2018, @12:33AM (#747211) Homepage Journal

              Install PHP on critical servers? Are you mad?

              --
              My rights don't end where your fear begins.
              • (Score: 2) by RS3 on Thursday October 11 2018, @01:01AM

                by RS3 (6367) on Thursday October 11 2018, @01:01AM (#747226)

                Hopping! I inherited WordPress servers, so... It hasn't hurt me in 10 years. Maybe I gots magic. :)

              • (Score: 2) by RS3 on Thursday October 11 2018, @01:03AM (1 child)

                by RS3 (6367) on Thursday October 11 2018, @01:03AM (#747228)

                BTW, you could rsync the db to a cloned but otherwise offline server (2nd Ethernet if you're really paranoid) and run phpMyAdmin on that box.

      • (Score: 2) by FatPhil on Wednesday October 10 2018, @09:38PM (1 child)

        by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Wednesday October 10 2018, @09:38PM (#747154) Homepage
        Do you have slow queries loggin turned on?

        That will point you in the right direction, at least. You can then describe those queries to find out where additional indexes may be useful. Pretty much everything in a well-designed database that you are only interested in small bits of should be O(log(N)^(1+small)) in the size of the tables, and if we're noticing it slow down, that probably means an O(N) has crept in somewhere, which probably means a lack of an index.
        --
        Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
      • (Score: 0) by Anonymous Coward on Thursday October 11 2018, @02:30AM (1 child)

        by Anonymous Coward on Thursday October 11 2018, @02:30AM (#747262)

        Data mining much?

      • (Score: 0) by Anonymous Coward on Thursday October 11 2018, @02:40AM (1 child)

        by Anonymous Coward on Thursday October 11 2018, @02:40AM (#747266)

        Different Anon here - That is pretty cool, and one implementation I would use as well.

  • (Score: 5, Funny) by nitehawk214 on Wednesday October 10 2018, @04:15PM (4 children)

    by nitehawk214 (1304) on Wednesday October 10 2018, @04:15PM (#747001)

    If you wrote your own database engine, you could call it martydb.

    --
    "Don't you ever miss the days when you used to be nostalgic?" -Loiosh
    • (Score: 2) by DannyB on Wednesday October 10 2018, @05:28PM (2 children)

      by DannyB (5839) Subscriber Badge on Wednesday October 10 2018, @05:28PM (#747045) Journal

      Or . . . . . you could use Microsoft Access database with VB.

      --
      To transfer files: right-click on file, pick Copy. Unplug mouse, plug mouse into other computer. Right-click, paste.
    • (Score: 0) by Anonymous Coward on Wednesday October 10 2018, @06:36PM

      by Anonymous Coward on Wednesday October 10 2018, @06:36PM (#747077)

      MartyDB is Web Scale.

  • (Score: 2) by goodie on Wednesday October 10 2018, @04:59PM (7 children)

    by goodie (1877) on Wednesday October 10 2018, @04:59PM (#747036) Journal

    Further, stories which have been "archived" — allowing no further comments or moderation — are still sitting in the in-RAM DB and could be offloaded to disk for long-term access

    If the archived stuff is sitting in memory, it's an issue indeed. There are a few ways that you could make it faster: daily/monthly job to move archiving data into separate tables, then remove archived data from live tables. You could always have a view that unions both live and archived tables to show the entire db, e.g., when performing searches if you don't have an index available.

    But overall, if everything fits in memory, what's the issue? Have you profiled your db and seen what the culprit is? Are you swapping due to low amounts of memory available? Are you flushing too many execution plans because you use plain old SQL vs prepared statements? There are a number of things that could come into play here. anyway I don't mean to pry but it'd be interesting to know what the root cause actually is and how you guys are going to address it.

    As far as updating a DB schema live, it should not be an issue, maybe you will have to go read-only for a few minutes, redirect to the slave, then update the slave and put the servers back in RW. I am of course making it sound a lot simpler than it is. Worst case, SN could be unavailable for a bit altogether, it's not like this is a daily thing to run :).

    • (Score: 2) by The Mighty Buzzard on Wednesday October 10 2018, @05:35PM (6 children)

      by The Mighty Buzzard (18) Subscriber Badge <themightybuzzard@proton.me> on Wednesday October 10 2018, @05:35PM (#747050) Homepage Journal

      ...daily/monthly job to move archiving data into separate tables, then remove archived data from live tables.

      That's the general idea, yep. With the addition of them being on-disk tables instead of being stuck in memory.

      But overall, if everything fits in memory, what's the issue?

      Multiple complex queries on bloody big tables for most every page load. Some very poorly optimized queries. Probably a few configuration embuggerances. A whole bunch of legacy cruft that leaves the site running fine with a small population but starting to cause problems at our current traffic and db-size levels.

      As far as updating a DB schema live, it should not be an issue...

      It should if you try to insert a row into a table that suddenly has fewer columns or more columns that can't be null but default to null to keep screwed up rows from being inserted. Or if you try to pull data from a missing column. Adding an index or a view we should be able to get away with but those aren't the only changes that need to be made.

      Also, we aren't using a master/slave setup. We're using mysql-cluster which is entirely in-memory except for tables you explicitly tell it to keep on disk (and BLOB/TEXT columns). Then the mysql bits are separate processes from the ndb backend bits, which basically means it's not quite so simple.

      It's not going to be a terribly difficult thing to fix but it's going to take some coding time, some testing time, and some down time.

      --
      My rights don't end where your fear begins.
      • (Score: 0) by Anonymous Coward on Wednesday October 10 2018, @09:34PM (3 children)

        by Anonymous Coward on Wednesday October 10 2018, @09:34PM (#747150)

        As far as updating a DB schema live, it should not be an issue...

        It should if you try to insert a row into a table that suddenly has fewer columns or more columns that can't be null but default to null to keep screwed up rows from being inserted. Or if you try to pull data from a missing column. Adding an index or a view we should be able to get away with but those aren't the only changes that need to be made.

        So you're saying no-problemo?

        • (Score: 2) by The Mighty Buzzard on Wednesday October 10 2018, @10:29PM

          by The Mighty Buzzard (18) Subscriber Badge <themightybuzzard@proton.me> on Wednesday October 10 2018, @10:29PM (#747175) Homepage Journal

          More or less. It should be time consuming not difficult.

          --
          My rights don't end where your fear begins.
        • (Score: 0) by Anonymous Coward on Wednesday October 10 2018, @11:31PM (1 child)

          by Anonymous Coward on Wednesday October 10 2018, @11:31PM (#747201)

          "So you're saying no-problemo?"

          I said that once. And then I learned. Murphy is always lurking in the background and ready to spring into action.

          • (Score: 0) by Anonymous Coward on Thursday October 11 2018, @02:36PM

            by Anonymous Coward on Thursday October 11 2018, @02:36PM (#747437)

            That's why you should say “won't work”. Then the only way it can go wrong is by actually working.

      • (Score: 2) by goodie on Thursday October 11 2018, @12:56AM (1 child)

        by goodie (1877) on Thursday October 11 2018, @12:56AM (#747223) Journal

        It should if you try to insert a row into a table that suddenly has fewer columns or more columns that can't be null but default to null to keep screwed up rows from being inserted. Or if you try to pull data from a missing column

        That's why I was thinking you could switch to RO during the code rollout and then go back to RW once everything is up. Of course, there is a small period of time where reads might screw up. But then again, if you rename the table and make things go through a view before dropping the view and renaming the table back, users may not notice much.

        Adding an index

        My 2 cents: I have no knowledge of your setup but it might take a while to build it and it will penalize writes over time.

        In any case, keep us posted, it's interesting to some of us :)

  • (Score: 3, Funny) by takyon on Wednesday October 10 2018, @06:02PM

    by takyon (881) <takyonNO@SPAMsoylentnews.org> on Wednesday October 10 2018, @06:02PM (#747063) Journal

    I've brought the site to its knees!

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
  • (Score: -1) by fakefuck39 on Wednesday October 10 2018, @06:09PM

    by fakefuck39 (6620) on Wednesday October 10 2018, @06:09PM (#747067)

    the solution unpopular with non-professional hippies living on some shitty boat is simple. why don't you just add more ram to your servers? this isn't a real scale-out site, and it doesn't have what any professional would call a real database. your servers are tiny, your footprint is tiny. double the memory in your db vm or whatever you're using and be done with it. boohoo, so it'll cost you an extra $200/year. who gives a shit - that's the price of dinner for 2 people, and you're probably all fat anywise, so skip the dinner and do another night of possum stew. don't pretend this is some real production tier-1 environment. oh, right - you don't have anything better to do, and your time (and life) is worth pretty much nothing.

    when you hit a pole at the wendy's drive-through in your beater piece of shit datsun, you go to a junk yard and get a new bumper. you don't fly in a specialist from tokyo. just install softram95 and you're good to go.

  • (Score: 2) by iWantToKeepAnon on Thursday October 11 2018, @01:58PM

    by iWantToKeepAnon (686) on Thursday October 11 2018, @01:58PM (#747423) Homepage Journal
    The transparency and meta updates keep this site reputation tops in my book, thanks!
    --
    "Happy families are all alike; every unhappy family is unhappy in its own way." -- Anna Karenina by Leo Tolstoy
  • (Score: 0) by Anonymous Coward on Thursday October 11 2018, @04:59PM

    by Anonymous Coward on Thursday October 11 2018, @04:59PM (#747514)

    I have not yet noticed sluggishness. And yes I disable js/cross-site loads, so it's not that the rest of the world has gotten slower, faster. It's that SN hasn't yet crossed the 100-300ms detection threshold, for my queries/use cases.

    Much love to the soylentils who keep infrastructure going. Y'all do work that allows this community to live. (So.. you've birthed a monster? Cue Dr. Frankenfurter laughter.)

  • (Score: 0) by Anonymous Coward on Thursday October 11 2018, @10:21PM

    by Anonymous Coward on Thursday October 11 2018, @10:21PM (#747686)

    Just wanted to plug in a quick sign of appreciation too.

    The old place was nice until it wasn't. SN is very nice.

(1)