
SoylentNews is people

Meta
posted by NCommander on Wednesday February 01, @03:47PM
from the here's-how-its-going dept.

So it's been a while since I last wrote, and I'm a bit overdue for a status update. So, let me give you all the short version of what's been going on.

First, I've been doing a lot of backend work to drastically reduce the size of the SoylentNews bill month to month. We had a lot of infrastructure that was either unnecessary or had gotten so many free tier upgrades that it was being vastly underutilized. Along the way, I've fine-tuned a lot of bits, although I won't say it's been problem free, since we went a few weeks without working sidebars. I'm truly sorry for the delays in getting up and running. My personal life chose to become very exciting in December, and I'm still dealing with the fallout of that entire mess. As such, what I had planned went a bit pear-shaped, and I went unexpectedly radio silent.


The biggest problem is that most of the backend is undocumented. I wrote some documents in the early days of the site, but by and large, the site was mostly maintained by individuals who are no longer active on staff. The internal TechOps wiki was woefully out of date, and even I find myself struggling to know how the entire site is put together. Considering it's been online for over 9 years, and was a bit of a rush job out the gate, well, you know, it happens. I think at some point at the decade mark, I will want to chronicle more about SN's history, but let's first make sure we've got a site when we get there.

By and large, I'm not involved in the day to day operations. janrinok has been, and is, at this point the de facto project leader. My role with SoylentNews these days is kinda vague and undefined, since I stepped down privately in 2020, and then stepped back in last November. I also find myself very uncertain if I want to even be involved at all, but, ultimately, I was here at the start, and while SoylentNews was always a collaborative project, I left a mark on both what this site is and will be that has persisted over the better part of a decade.

As such, I feel personally obligated to get SoylentNews to the best shape I can possibly get it, and give it the best chance of success I can give it. However, we're in the uncomfortable situation that we have a dated Perl codebase running on undocumented infrastructure that has been creaking along with no major reworks in almost all that time. You can imagine I've been having a fun time of this. Most of the relevant information exists in my head, since I was the one who got Slashcode running all those years ago.

Right now, my biggest victory is that I managed to get us off MySQL Cluster and onto a more normal version of MySQL, which drastically reduces memory and disk load at the cost of slower load performance.

Moving forward, the solution is to have a reproducible deployment system, likely based around Docker, or possibly even Kubernetes, with all aspects of rehash (the site software) documented. We use GitHub to handle site development, and I think it would be in our best interests to integrate a full CI pipeline for both development and production environments. While implementing this, I also intend to entirely redo every aspect of the backend, complete with proper documentation, so someone besides me can actually maintain it. After that, it will actually be practical for SoylentNews to survive past a single person, and we can have a more serious discussion on what the road forward looks like.

I do realize that the last few months have had a lot of ups and downs, mixed with excitement and disappointment. I can't really say for sure where we're going, but you know? I want us to reach that decade mark together, and then we'll figure out where we're going beyond that.

Until next time,

~ N

 
  • (Score: 1, Informative) by Anonymous Coward on Friday February 03, @01:51AM (3 children)

    by Anonymous Coward on Friday February 03, @01:51AM (#1289964)

    The statement "SELECT count(j.id), u.nickname, u.uid, MAX(j.date) AS date, MAX(id) FROM journals AS j, users AS u, users_info AS ui WHERE j.uid = u.uid AND j.uid = ui.uid AND ui.karma >= $min_karma GROUP BY u.nickname ORDER BY date DESC LIMIT $limit" is not compliant SQL. To simplify the statement to show the buggy behavior:

    SELECT
      count(a.id),
      b.nickname,
      b.uid
    FROM
      a,
      b
    WHERE
      a.uid = b.uid
    GROUP BY
      b.nickname

    For those more familiar with SQL the bug should be immediately obvious. This query is creating a pivot table from a larger dataset. However, when creating a pivot table, the SELECT statement can only contain columns that give unique values. As a result, the SELECT clause can contain only 2 (or 3) types of columns: columns appearing in the GROUP BY clause, columns whose final call is an aggregate function, and (if you are using an SQL server that supports this part of the standard) "functionally unique" values. Functionally unique values are those that have a "functional dependence" such that the schema, constraints, or the query's relational algebra force them to be unique. Examples of those are columns that are UNIQUE NOT NULL, columns whose table's primary key appears in the GROUP BY, columns with a single NOT NULL value for all rows, DEFAULT-only columns, WHERE clause limitations, etc.

    So back to the query. The query does not have a SELECT clause that is limited to unique values. Column b.uid does not appear in the GROUP BY. That column is not subject to an aggregate function. That column is also not functionally unique. The column isn't UNIQUE NOT NULL, nor does its table's primary key appear in the GROUP BY, nor does it have a single NOT NULL value for all rows, nor is it a DEFAULT-only column, nor are there WHERE clause limitations, nor does it fall into any of the other categories of functionally unique values.

    The fix is relatively simple. All you need to do is to decide on some version of that query that is limited to unique values. There are a number of ways to do that. Best part is that because application constraints force a bidirectional dependency between u.uid and u.nickname (hence why the query works at all), there should be no changes to the rows returned and no side effects other than getting your code one step closer to standard compliance.
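
As a rough illustration of "a number of ways", here is a minimal sketch of two rewrites of the simplified query: one adds b.uid to the GROUP BY, the other wraps it in an aggregate. The toy tables a and b, their column types, and the sample rows are assumptions made up for the example, and it runs against an in-memory SQLite database rather than the site's MySQL.

```python
import sqlite3

# Hypothetical toy schema standing in for the simplified tables a and b;
# column types and sample rows are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (uid INTEGER PRIMARY KEY, nickname TEXT NOT NULL UNIQUE);
    CREATE TABLE a (id INTEGER PRIMARY KEY, uid INTEGER NOT NULL);
    INSERT INTO b (uid, nickname) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO a (id, uid) VALUES (10, 1), (11, 1), (12, 2);
""")

# Option 1: name every non-aggregated column in the GROUP BY.
group_by_both = """
    SELECT count(a.id), b.nickname, b.uid
    FROM a, b
    WHERE a.uid = b.uid
    GROUP BY b.nickname, b.uid
    ORDER BY b.nickname
"""

# Option 2: wrap the extra column in an aggregate function.
aggregate_uid = """
    SELECT count(a.id), b.nickname, MAX(b.uid)
    FROM a, b
    WHERE a.uid = b.uid
    GROUP BY b.nickname
    ORDER BY b.nickname
"""

rows_group_by_both = conn.execute(group_by_both).fetchall()
rows_aggregate_uid = conn.execute(aggregate_uid).fetchall()
print(rows_group_by_both)
print(rows_aggregate_uid)
```

With one nickname per uid, both forms return the same rows here. Which one reads better for the real journals query is a judgment call, and either way the changed query would still need retesting against the real data.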

  • (Score: 3, Insightful) by janrinok on Saturday February 04, @08:07AM (1 child)

    by janrinok (52) Subscriber Badge on Saturday February 04, @08:07AM (#1290202) Journal

    no side effects other than getting your code one step closer to standard compliance

    Thanks for your interesting and informative post. But I had to smile at your closing remark - I have seen numerous bugs that were the results of side effects that nobody anticipated. Simple rule: if you change the code then you have to retest it.

    • (Score: 0) by Anonymous Coward on Sunday February 05, @02:03AM

      by Anonymous Coward on Sunday February 05, @02:03AM (#1290319)

      No matter what, when you change code you end up deploying it to your test environment. The real key is to make sure you have a separate production environment you use too.

  • (Score: 0) by Anonymous Coward on Sunday February 05, @02:22AM

    by Anonymous Coward on Sunday February 05, @02:22AM (#1290322)

    The nice thing about declarative languages is that you are specifying what the results are instead of how to get there. The nice thing about DQL is that in itself it is free of side effects, meaning that the only place for side effects is in the return value. Since the columns are not changed, you only have to look at whether the rows are changed. Aggregate functions should return the same data per row as long as the groups are the same, so you can ignore them. Therefore, the only thing that can affect your groups, short of a bug in MySQL making it non-compliant, is the relationship between uid and nickname. However, the schema provides a UNIQUE constraint on uid, which limits the uid:nickname relationship to either 1:1 or many:1. As mentioned, the application logic prevents multiple users from having the same nickname, which limits the relationship further to 1:1. The result is that the bugged and fixed queries should return the same result. Of course, that "should" doesn't mean "must", because the underlying assumptions could be wrong: MySQL could have an error in its implementation of the standard, your DBI could mutate either the query or the model, or you could have users with duplicate nicknames. If the query does return something different, there are fewer areas where the problem could be, not to mention that you have much deeper problems than just having to change a query. That is the nice thing about declarative languages: the distance between "should" and "must" is reduced. Either it works and returns exactly what you asked for, or it doesn't and your problem is somewhere in the model.
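
The one assumption above that can be checked directly is the 1:1 relationship between uid and nickname. The sketch below looks for duplicate nicknames and shows how the two groupings diverge when one exists. The trimmed-down users and journals tables and their sample rows are hypothetical (the real rehash schema has more columns), and SQLite stands in for MySQL.

```python
import sqlite3

# Hypothetical, trimmed-down tables; the duplicate 'bob' is planted on purpose
# to show what happens when the 1:1 assumption does not hold.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, nickname TEXT NOT NULL);
    CREATE TABLE journals (id INTEGER PRIMARY KEY, uid INTEGER NOT NULL);
    INSERT INTO users (uid, nickname) VALUES (1, 'alice'), (2, 'bob'), (3, 'bob');
    INSERT INTO journals (id, uid) VALUES (10, 1), (11, 2), (12, 3);
""")

# Nicknames shared by more than one uid; an empty result supports the 1:1 assumption.
duplicates = [
    nickname
    for (nickname,) in conn.execute("""
        SELECT nickname FROM users
        GROUP BY nickname
        HAVING count(*) > 1
        ORDER BY nickname
    """)
]

# Grouping by nickname alone merges the two 'bob' accounts into one row.
rows_by_nickname = conn.execute("""
    SELECT count(j.id), u.nickname
    FROM journals AS j, users AS u
    WHERE j.uid = u.uid
    GROUP BY u.nickname
    ORDER BY u.nickname
""").fetchall()

# Grouping by nickname and uid keeps them as separate rows.
rows_by_both = conn.execute("""
    SELECT count(j.id), u.nickname
    FROM journals AS j, users AS u
    WHERE j.uid = u.uid
    GROUP BY u.nickname, u.uid
    ORDER BY u.nickname
""").fetchall()

print(duplicates)
print(rows_by_nickname)
print(rows_by_both)
```

If the duplicate check comes back empty on the real data, the two groupings should agree; if it does not, the difference in row counts points straight at the model rather than the query.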