Meta
posted by NCommander on Monday December 05, @08:37AM   Printer-friendly
Hey folks,

Well, it's been a while since I last posted, and I had to think a fair bit about the comments I received. It's become very clear that while I'm still willing to at least help in technical matters, the effort to reforge SN is much higher than I expected. In addition, given the, shall we say, lukewarm response I got to my posts and journal entries, I'm clearly not the right person for the job.

I think at this point, it's time to figure out who is going to lead SN going forward. After my de facto stepping down in 2020, the site has, for want of a better word, been a bit listless. At the moment, no one on staff really has the cycles to take that position on. A few people have expressed interest in the position, and I've talked about this with Matt, the site's co-owner. By and large, whoever fills the seat will have to figure out what, if anything, needs to change in regards to moderation policy, content, and more.

If you're interested in potentially fulfilling the role, drop me an email at michael -at- casadevall.pro, with the subject of "SN Project Leader", and include the following:

  • Who you are
  • What you want to do with the site
  • How you intend to do it
  • Why you want to get involved

I'll leave this call for candidates open until December 14th, at which point Matt and I will go through the submissions and draw up a short list. I'll talk to the editors and solicit more comments from the community. I'm hoping to announce a successor in early January, and formalize the transition sometime in February, which will be the site's 9th anniversary.

 
This discussion was created by NCommander (2) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Insightful) by shrewdsheep on Monday December 05, @12:24PM (22 children)

    by shrewdsheep (5215) on Monday December 05, @12:24PM (#1281241)

    A bit off-topic, maybe, but I think the technical side needs serious attention, too, notably backup and roll-back strategies, which need to be automated. I also think that there should be a Meta post on every DB rollback, and that subscriptions should be manually recovered to reflect the current funding status.

    How is DB recovery handled at the moment?

  • (Score: 2, Informative) by shrewdsheep on Monday December 05, @01:24PM

    by shrewdsheep (5215) on Monday December 05, @01:24PM (#1281249)

    PS: Re-reading this, I might sound a bit negative. My comment is not meant that way: all the staff's work is highly appreciated. Thank you so much! Unfortunately, I am very time-strapped and cannot be part of staff in the foreseeable future, but if there were a repo of a configuration management system dealing with backup/roll-back, I would certainly help out there.

  • (Score: 1, Informative) by Anonymous Coward on Monday December 05, @02:02PM (11 children)

    by Anonymous Coward on Monday December 05, @02:02PM (#1281252)

    How is DB recovery handled at the moment?

    From the looks of it, it's whatever accidentally/coincidentally got backed up for other reasons (e.g. a VM snapshot before some change). Or like those organizations that call up all their staff and ex-staff to see if anyone has some backups somehow. 🤣

    After all this is not the first or even second time it's happened.

    • (Score: 4, Interesting) by janrinok on Monday December 05, @03:08PM (10 children)

      by janrinok (52) Subscriber Badge on Monday December 05, @03:08PM (#1281260) Journal

      Until fairly recently, we were paying for (and getting) a regular automatic Linode backup, which mechanicjay had used several times. I think that one of the problems was that the backup was everything in one image - so extracting, say, the sql wasn't straightforward. But this is not something that I have any real knowledge of, so I will leave it at that.

      The sql was also backed up regularly, but I have no details of where or how. Those backups seem to have either vanished or simply not been found in the last few weeks. I don't know how big the sql is, but I would have thought that a daily backup during the quieter hours should be achievable without too much of a problem.

      Since the last problem a few days ago I have written software to back up the submission queue and the processed story queue every 6 hours, so as to ease the task of recovery in the event of another failure. It is possible to back up comments, but there is no easy way to reinsert them with meaningful names. Each person can save comments as their username or as Anonymous Coward - but not as somebody else. However, they should be saved as part of the database itself, so we shouldn't be losing them - but we are. It is frustrating, because it is the comments that hold all the interest for me and, I suspect, many others.

      • (Score: 3, Informative) by RS3 on Monday December 05, @05:28PM (9 children)

        by RS3 (6367) on Monday December 05, @05:28PM (#1281289)

        Please see my other post in this discussion regarding mysql backup. It's pretty easy, and it's done by cron.daily scripts. It could be done cron.hourly, or at any other desired time increment.

        As I mention in the other post, the sql dump is then gzipped. Since sql is text, it compresses extremely well.

        I have many more thoughts on this, but no time. Basically I'd want to make very frequent incremental backups... Again, the backup file is literally sql, so super compressible, so it's not going to fill up drives / media (even full backups are tiny).
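A minimal sketch of the kind of cron.daily script described above. The database name, credentials file, and backup directory are all hypothetical placeholders and would need to match the real system; the guard around mysqldump just keeps the sketch harmless on machines without MySQL installed:

```shell
#!/bin/sh
# Hypothetical daily dump-and-compress script (all names are placeholders).
DB_NAME="soylentdb"
BACKUP_DIR="${TMPDIR:-/tmp}/mysql-backups"   # a real system would use dedicated storage
STAMP=$(date +%Y%m%d-%H%M%S)
OUTFILE="${BACKUP_DIR}/${DB_NAME}-${STAMP}.sql.gz"

mkdir -p "$BACKUP_DIR"
# SQL dumps are plain text, so gzip shrinks them dramatically.
# Credentials come from a protected option file rather than the command line,
# so they don't show up in the process list.
if command -v mysqldump >/dev/null 2>&1; then
    mysqldump --defaults-extra-file=/etc/mysql/backup.cnf -q -e "$DB_NAME" \
        | gzip > "$OUTFILE"
fi
echo "$OUTFILE"
```

Dropped into /etc/cron.daily/ (or cron.hourly), this produces the dated, compressed dumps described above; an encryption step could be added to the pipeline before the final redirect.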

        • (Score: 1) by shrewdsheep on Monday December 05, @06:51PM (4 children)

          by shrewdsheep (5215) on Monday December 05, @06:51PM (#1281298)

          My suggestion is to put the configuration online (ideally as configuration-management scripts, e.g. Ansible, or a Dockerfile) so that anyone can spin up an instance of SoylentNews from GitHub/GitLab etc. Then you don't have to do the work yourself.

          • (Score: 3, Informative) by janrinok on Monday December 05, @07:48PM

            by janrinok (52) Subscriber Badge on Monday December 05, @07:48PM (#1281313) Journal

            Containerisation is exactly what NCommander is intending to do.

          • (Score: 4, Insightful) by janrinok on Monday December 05, @07:58PM (2 children)

            by janrinok (52) Subscriber Badge on Monday December 05, @07:58PM (#1281314) Journal

            Of course, we cannot provide docker containers with all the personal database data already inside them. That would hardly be a way of keeping everybody's personal data private, now would it? There is also a series of privileges that needs to be allocated to specific users depending on their role or function.

            However, generating fake data is possible - we do it on the development system so that we can create reproducible data for testing. It is quite time-consuming the first time, though. But it gives anyone a chance to spin up a system and debug/play with the code.

            • (Score: 2) by fliptop on Monday December 05, @09:54PM (1 child)

              by fliptop (1666) on Monday December 05, @09:54PM (#1281328) Journal

              we cannot provide docker containers with all the personal database data already inside them

              Maybe I'm misunderstanding, but can you not include the data in the container? Just the code and DB schema that runs the site, spin that up, then load the data from backup?

              --
              To be oneself, and unafraid whether right or wrong, is more admirable than the easy cowardice of surrender to conformity
              • (Score: 2) by janrinok on Tuesday December 06, @01:08AM

                by janrinok (52) Subscriber Badge on Tuesday December 06, @01:08AM (#1281347) Journal

                Yes - WE can. That is the whole point of doing this. But we are not going to provide the docker package complete with personal data to other people.

                One of the benefits of the container is that it can be transferred elsewhere and run. If someone thinks that they are going to be able to simply ask for a docker installation and it will arrive complete with data, then that is incorrect. They could have the container without any data, so that they do not have to build a new system complete with all the same versions of software in order to test or debug - which is what we have to do today. Several people have tried this path, including myself, and simply found that the effort was not justified.

                However, I will happily take the docker container and create my own fake data in order to do bug squashing or testing changes to the Perl code.

                At least that is my understanding of it.

        • (Score: 2) by PiMuNu on Tuesday December 06, @10:51AM (3 children)

          by PiMuNu (3823) on Tuesday December 06, @10:51AM (#1281379)

          > It's pretty easy, and it'd done by cron.daily scripts

          Implementing backups is usually pretty easy. But unless you regularly test disaster recovery, the backup procedure will go mouldy. So someone needs to regularly check that cron is running the job properly, that the backups are still going somewhere sensible, and that the data is recoverable. That's a waste of an afternoon every few months, so you need someone to volunteer for it (not me).
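Part of the periodic check described above can itself be scripted. The following is a sketch only, with hypothetical paths; it verifies that a newest dump exists and at least decompresses cleanly (a full drill would also load it into a scratch database). The stand-in dump file is created purely so the sketch is self-contained:

```shell
#!/bin/sh
# Hypothetical backup sanity check: newest dump exists and unzips cleanly.
BACKUP_DIR="${TMPDIR:-/tmp}/mysql-backups"
mkdir -p "$BACKUP_DIR"
# Stand-in dump so this sketch runs anywhere; a real check would skip this line.
printf 'CREATE TABLE demo (id INT);\n' | gzip > "$BACKUP_DIR/demo-20221205.sql.gz"

# Pick the most recently modified dump.
LATEST=$(ls -t "$BACKUP_DIR"/*.sql.gz 2>/dev/null | head -n 1)
if [ -z "$LATEST" ]; then
    echo "NO BACKUPS FOUND"
    exit 1
fi
# gzip -t verifies archive integrity without extracting anything.
if gzip -t "$LATEST" 2>/dev/null; then
    RESULT="ok: $LATEST"
else
    RESULT="corrupt: $LATEST"
fi
echo "$RESULT"
```

A real drill would go further, as discussed elsewhere in this thread: restore the dump into a throwaway instance and compare row counts against the live system.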

          • (Score: 2) by RS3 on Tuesday December 06, @06:16PM (2 children)

            by RS3 (6367) on Tuesday December 06, @06:16PM (#1281411)

            Yes, and I do that now for the servers I admin (occasional work- not full-time).

            It takes very little time. Maybe an hour if I'm super thorough, make extra manual copies, etc. But just looking at the directory lists of backup files takes maybe a couple of minutes.

            Oh - almost forgot - the "logwatch" and "anacron" logging programs email summaries of the processes and files that were backed up, so that's a pretty good indication that all went well. Of course, that is not logically definitive, but so far, it's never failed (14+ years).

            I've rarely needed to do a restore, except one major time when a webserver failed - something in the hardware - and in failing, the mirrored drive pair (not my build - the previous person's) somehow got severely corrupted. Being an efficiency-bound person, I had another machine on standby, ready to go.

            • (Score: 2) by PiMuNu on Wednesday December 07, @08:38AM (1 child)

              by PiMuNu (3823) on Wednesday December 07, @08:38AM (#1281500)

              Okay - once upon a time we had a situation where backups were all made as intended, but due to a misconfiguration we could not restore from backup when needed. So I would regularly go through the restore procedure as well.

              (Yes, the sysadmin did screw up. People are fallible, not everyone who is asked to sysadmin in a small outfit has enough experience/knowledge to get it right.)

              • (Score: 2) by RS3 on Wednesday December 07, @06:15PM

                by RS3 (6367) on Wednesday December 07, @06:15PM (#1281573)

                No question, people and things and procedures are imperfect. Restore testing is critically important. Of course, that should be done onto a copied image of the system.

                I'm curious- what are some of the specifics of the situation you encountered? OS? FS? Backup software, or scripts running things like mysqldump, gzip, tar, ftp, etc.? Media onto which the backups were stored?

  • (Score: 3, Interesting) by inertnet on Monday December 05, @02:16PM (8 children)

    by inertnet (4071) Subscriber Badge on Monday December 05, @02:16PM (#1281253) Journal

    I was wondering about the same thing, even thought about donating space on my NAS for database backups. It would be nicer to have the code updated to include an event store, which can be backed up incrementally and can be used to rebuild the entire DB. But we don't have a Perl programmer who could do that.

    • (Score: 2) by janrinok on Monday December 05, @03:35PM (6 children)

      by janrinok (52) Subscriber Badge on Monday December 05, @03:35PM (#1281266) Journal

      There are problems with off-loading backups to remote locations. We are handling personal data from all over the world and some nations expect (not unreasonably in my view) that it is given adequate protection. A single person's data might not seem important to you but the compromise of a whole database could certainly result in a few legal challenges.

      The major players, such as Linode, AWS etc., can easily handle such legal requirements, but they become an onerous task for a private individual.

      • (Score: 3, Informative) by RS3 on Monday December 05, @04:50PM (4 children)

        by RS3 (6367) on Monday December 05, @04:50PM (#1281283)

        I have a lot more to say but little time at the moment:

        To OP- you don't need a single line of perl code to backup database:

        /usr/bin/mysqldump -q -e -hlocalhost -u(db username) -p(db username's password) (db name) > (some filename, or process like gzip, and/or an encryption utility then > filename, etc.)

        As you can see, you then encrypt the entire backup. Easy. I have cron.daily scripts doing this. They generate a backup filename based on the db_name and date+time at runtime.

        In all fairness, I did not invent this- someone previous to me, who I don't know, created the backup system in the early 2000s. Then other types of backup systems involving external backup server (just a fileserver), tape / optical / yet another hard disk / paper tape / punched cards (jk!) did their thing.
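The dump-then-compress-then-encrypt pipeline mentioned above might look like the following sketch. All names are hypothetical; a symmetric passphrase is used here only so the sketch is self-contained, whereas a real setup would use public-key encryption (`gpg --encrypt --recipient <admin key>`) so the backup host never holds the decryption secret:

```shell
#!/bin/sh
# Hypothetical dump -> gzip -> encrypt pipeline (names and key are placeholders).
WORK="${TMPDIR:-/tmp}/enc-demo"
mkdir -p "$WORK"
# Stand-in for real mysqldump output, so the sketch runs anywhere:
printf 'INSERT INTO comments VALUES (1);\n' | gzip > "$WORK/dump.sql.gz"
# Encrypt the compressed dump. NOTE: --symmetric with a fixed passphrase is a
# demo shortcut; real backups would encrypt to the admins' public key instead.
if command -v gpg >/dev/null 2>&1; then
    gpg --batch --yes --symmetric --passphrase demo-only \
        -o "$WORK/dump.sql.gz.gpg" "$WORK/dump.sql.gz" 2>/dev/null || true
fi
gzip -t "$WORK/dump.sql.gz" && echo "archive intact"
```

Compressing before encrypting matters: encrypted output is effectively random and will not compress, while the plain SQL text compresses extremely well.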

        • (Score: 2) by janrinok on Monday December 05, @06:14PM (1 child)

          by janrinok (52) Subscriber Badge on Monday December 05, @06:14PM (#1281296) Journal

          I use something similar on my own databases.

          • (Score: 2) by RS3 on Monday December 05, @07:31PM

            by RS3 (6367) on Monday December 05, @07:31PM (#1281308)

            Funny you mention that. I used to use "FoxPro" (dBase similar) for a few personal databases, but I haven't run that software in probably 15 years, and I forget which older system's hard disk has it (maybe one that died!).

            One database I built long ago was simple- my house's circuit breaker panel and various branch circuits. I made several printouts sorted on breaker number, another sorted by floor then room.

            I finally decided to add my own circuit database to one of the MySQL ones I admin. But I neglected (just didn't think of it) to add a backup script for it. No need to keep backing it up- I'll do one manual mysqldump backup and be happy that I don't have to key it in again. Thanks!!

        • (Score: 1) by shrewdsheep on Monday December 05, @07:04PM (1 child)

          by shrewdsheep (5215) on Monday December 05, @07:04PM (#1281301)

          This, plus using either the binary log or the general query log (rotated at the time of the db dump), should allow reconstruction to points in time between dumps. I have not worked with mysql and do not know whether the dump is atomic, which might be required to make replaying the log files work.
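A sketch of that scheme using MySQL's binary log, which records writes and can be replayed. `--single-transaction` gives a consistent InnoDB snapshot without locking, and `--flush-logs` rotates the binlog at dump time so replay can start cleanly from the dump point. The database name, binlog filename, and datetime are all placeholders, and the sketch only runs the dump when MySQL tooling is actually present:

```shell
#!/bin/sh
# Hypothetical point-in-time recovery flow; names and datetimes are placeholders.
DB="soylentdb"
DUMP="${TMPDIR:-/tmp}/full.sql.gz"
if command -v mysqldump >/dev/null 2>&1 && command -v mysql >/dev/null 2>&1; then
    # 1. Consistent snapshot; --flush-logs rotates the binlog at dump time,
    #    so all writes after the dump land in a fresh binlog file.
    mysqldump --single-transaction --flush-logs "$DB" 2>/dev/null | gzip > "$DUMP"
    # 2. To recover: restore the dump, then replay binlog events written
    #    after it, stopping just before the failure:
    # gunzip -c "$DUMP" | mysql "$DB"
    # mysqlbinlog --stop-datetime="2022-12-05 12:00:00" binlog.000042 | mysql "$DB"
fi
MSG="sketch only - adjust names and datetimes for the real system"
echo "$MSG"
```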

          • (Score: 3, Interesting) by RS3 on Monday December 05, @07:21PM

            by RS3 (6367) on Monday December 05, @07:21PM (#1281305)

            Yes, really good point. AFAIK, and web search results say MySQL is atomic, so you'd want to back up the 2 (3?) log files.

            It really depends on the time-granularity (resolution) you need. Here, I dunno. It's not a Wall Street high-frequency trading database, so sub-millisecond is probably not necessary.

            It'd be good to do some analysis of the various buffer / cache flushing settings. Big buffers can mean bigger data loss, but too-small buffers would bottleneck a busy system (or require a faster storage system), so some tuning is in order to find the right compromise - but that's pretty easy.

            Incremental backup is available within MySQL, so I'd look strongly at fairly frequent incremental backups, which again, compressed text is going to be tiny, and you could auto-delete older ones after a full backup is done.

      • (Score: 2) by inertnet on Monday December 05, @09:38PM

        by inertnet (4071) Subscriber Badge on Monday December 05, @09:38PM (#1281325) Journal

        I agree and I don't even want plain data on my NAS. Someone already mentioned encryption, which would be a logical thing to use with remote backups. I don't need encryption keys, the site administrators are the only ones who need to be able to rebuild the database from the encrypted backups.

        It's nice that you have automated Linode backups but apparently nobody knows how to restore them, or maybe they're unusable and you should stop paying for them.

    • (Score: 2) by RS3 on Monday December 05, @05:17PM

      by RS3 (6367) on Monday December 05, @05:17PM (#1281286)

      I responded by responding to janrinok's response.