So, to say the last week has been a dumpster fire is drastically underselling what I've been through. This, combined with having to put things in place to migrate off Twitter, and otherwise deal with all the fallout of that hot mess has, to put it frankly, put free time at something of a premium, hence why this post took so long. For those who missed it, I did fairly long overhaul of our backend, upgrading boxes from Ubuntu 14.04, and rebuilding and replacing others.
At the moment, the site is mostly working, with two exceptions, site search is still down, and IRC is still down. Deucalion has taken up the task of rebuilding the IRCd on modern server software, so it's time to lay down the road going forward past this point.
Read past the fold for more information ...
Right now, the backend is mostly built on an outdated version of mod_perl 2.2, and MySQL cluster, which is very much not a good place to be. Originally as envisioned, I planned this site to be able to be easily scalable, with a larger user base. That's why the infrastructure was designed to be as scalable as it was, with the downside of having a much higher overhead than a more traditional setup has. Furthermore, rehash (the code that powers this site) is, uh, to put it frankly, a beast to work on. It's a 90s era Perl code base and pretty much everything else that implies; if it wasn't for the fact that rehash is one of the main reasons to use SoylentNews, I'd argue it might be time to replace it.
Right now, I'm working on doing another round of server hardening. As it is at the moment, I've got rehash and Apache running in an AppArmor jail, and everything is pretty well sandboxed from everything else, but I still need to go through and adjust a lot of firewalls, and finish decommissioning out a bunch of the boxes. That said, the site is running faster than it has in a long while since a lot of small things got corrected as we went. Sometime this weekend, I'm going to finish adjusting the firewalls to lock it down further, and that should mostly get back to the point where I might have restful sleep again. That being said, there's still a fair bit more to do.
Moving ahead, we need to get off MySQL cluster, and either onto the current mod_perl, or, ideally, FastCGI, to end the Apache dependency entirely. Unfortunately, working on Rehash is quite difficult, and it requires a very specific setup to be viable. My current plan here is to basically get it working in Docker, so its easy to spin up and spin down instances, and return to a less cursed variant of MySQL. This is probably a few hours of work, but I'm hoping that overall it is going to be easy and straightforward to do since most of the backend is fairly well documented at this point. This also leaves me in a decent position to implement a couple of long overdue features, but modernization efforts come first. I'm hoping to livestream my efforts on this on the weeks to come, and I will make stream announcements as I go along.
My intent, based off the policy changes that were made to disallow ACs to post on stories is to sunlight the feature entirely, including in journals and more. The decision to have ACs on SoylentNews was made in 2014, when the Snowden leaks were only a few months old. Furthermore, we've seen from experience that the karma system doesn't go far enough at keeping bad actors from still getting a +2 status. By and large, the numbers underpinning the system need a rework. My general thought is to cap karma at either 10 or 15, and drastically decrease how far into the basement you can go, as well as uncapping posts in moderation to be able to go to -5.
As a rule, incredibly bad takes do get moderated out of existence, but because there's no real penalty for doing so, we get constant shitposts. Time to make this a bit harder to abuse. I've documented the antispam measures on the site before, but the site keeps track of IP addresses and subnets in the form of hashed /24, and /16s (/64 and /48 for IPv6), which has a karma number attached to them. If an IP range goes too far into the basement, it ends up posting at 0 or -1. By adjusting the caps, it should allow this threshold to be reached much more easily, and help bring the signal to noise ratio back to something more "positive".
Furthermore, I believe its generally in the site's interests to allow editors to delete comments. This functionality is actually built into rehash, but has been long disabled. At the time, I felt the community was best self-moderating, but I think on the whole, its better to treat this like a moderated subreddit, and have messages get a notice that they've in-fact been deleted ala reddit. This is a fairly large departure for the site as a whole, but I think one justified given the state of the Internet on 2022. I am open to discussions on all of this, but let me see what all your thoughts are like.
I do intend to keep livestreaming my progress with the site as we go along; and we raised another ~500 dollars towards Trevor Project during the last livestream. I've left that stream unlisted until I've had a chance to finish implementing all the hardening measures I've discussed, but I'm hoping at the end of it, I'll have a pretty good documentary on what it takes to modernize an aging website. As usual, if you want to support me directly: Ko-fi is available for one time donations, or Patreon for a recurring donation.
[ If you are an AC and wish to make a constructive comment, please see my journal. janrinok ]
That could work, although I guess you still need a way to weed out hit and run posters.
That was why I proposed a probationary period. They could have site access but not be given comment posting rights until a certain time limit had passed. The software to control comment posts already exists in the Perl code and is used for certain short term bans.
What I haven't identified is what piece of information does each computer hold which is unique, does not include any personal information, and can be easily transferred during a login session? There are several possibilities but I do not know how feasible they are.
I would not use unique computer information because I've read in the past that some people post from work as well as from home or mobile. When I read your proposal I was imagining a sort of password or phrase for such people to get a unique, but hidden user id.