SoylentNews is people

Meta
posted by martyb on Friday December 31 2021, @12:02AM
from the Woo-Hoo! dept.

Happy New Year!
As the final hours of 2021 tick away, here's wishing everyone a Happy New Year!

In light of the holiday, I am inviting the editorial staff to post stories on a weekend/holiday schedule. Thank you for all your hard work in 2021. Here's wishing for a better year to come! Enjoy!

We did it! [*]
([*] I think).

Current Status:
Thanks to a VERY generous subscription of nearly $1,000, we reached our fundraising goal for the second half of the year. THANK YOU! We took in $4,132.81 on a goal of $3,500.00 (all amounts are estimates):

mysql>  SELECT  SUM(payment_net) AS Net,  100.0 * SUM(payment_net) / 3500.00  AS GoalPercent, MAX(ts), MAX(spid), NOW() FROM subscribe_payments WHERE ts > '2021-06-30' ;
+---------+-------------+---------------------+-----------+---------------------+
| Net     | GoalPercent | MAX(ts)             | MAX(spid) | NOW()               |
+---------+-------------+---------------------+-----------+---------------------+
| 4132.81 | 118.0802857 | 2021-12-30 17:36:36 |      1744 | 2021-12-30 23:45:49 |
+---------+-------------+---------------------+-----------+---------------------+
1 row in set (0.00 sec)

mysql>

And for those of you interested in the details:

mysql> SELECT spid, ts, payment_gross, payment_net, payment_type FROM subscribe_payments WHERE ts > '2021-12-29 22:06:03' AND payment_gross > 0 ORDER BY ts ;
+------+---------------------+---------------+-------------+--------------+
| spid | ts                  | payment_gross | payment_net | payment_type |
+------+---------------------+---------------+-------------+--------------+
| 1728 | 2021-12-29 23:16:21 |         20.00 |       18.81 | user         |
| 1729 | 2021-12-30 00:15:05 |        100.00 |       96.80 | user         |
| 1730 | 2021-12-30 01:08:02 |         20.00 |       19.12 | user         |
| 1731 | 2021-12-30 01:13:58 |         30.00 |       28.01 | user         |
| 1732 | 2021-12-30 01:45:50 |         50.00 |       48.25 | user         |
| 1733 | 2021-12-30 02:35:54 |         40.00 |       38.54 | user         |
| 1734 | 2021-12-30 03:12:48 |         20.00 |       18.81 | user         |
| 1735 | 2021-12-30 04:24:07 |        924.43 |      897.32 | user         |
| 1736 | 2021-12-30 07:05:37 |         20.00 |       18.51 | user         |
| 1737 | 2021-12-30 07:50:05 |         20.00 |       18.51 | gift         |
| 1738 | 2021-12-30 09:23:14 |         20.00 |       19.12 | gift         |
| 1739 | 2021-12-30 12:22:42 |         20.00 |       18.51 | user         |
| 1740 | 2021-12-30 12:24:24 |         20.00 |       18.81 | user         |
| 1741 | 2021-12-30 13:59:52 |         40.00 |       38.11 | user         |
| 1742 | 2021-12-30 17:33:36 |         20.00 |       19.12 | gift         |
| 1743 | 2021-12-30 17:35:13 |         20.00 |       19.12 | gift         |
| 1744 | 2021-12-30 17:36:36 |         20.00 |       19.12 | gift         |
+------+---------------------+---------------+-------------+--------------+
17 rows in set (0.00 sec)

mysql>

That's great news! So why the equivocation?

Looking Closer:
Actually, it's more of a stepping back to look at things over the course of the entire year:

mysql> SELECT SUM(payment_gross) AS Gross, SUM(payment_net) AS Net, ts, max(spid) AS SPID FROM subscribe_payments WHERE ts > '2020-12-31' ;
+---------+---------+---------------------+------+
| Gross   | Net     | ts                  | SPID |
+---------+---------+---------------------+------+
| 6916.61 | 6611.75 | 2020-12-31 21:47:25 | 1744 |
+---------+---------+---------------------+------+
1 row in set (0.00 sec)

mysql>

The fundraising goal for the first half of the year was also $3,500.00. So... (2 x $3,500.00) is $7,000.00 but we have a total of... $6,916.61?
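
To make the gap concrete, here is a minimal sketch of the arithmetic using only the figures quoted above. Comparing the goal against the net figure as well as the gross is an assumption added for illustration; the post itself only compares against the gross.

```python
from decimal import Decimal

# Figures copied from the query output and the stated goals above.
half_year_goal = Decimal("3500.00")
year_gross = Decimal("6916.61")
year_net = Decimal("6611.75")

# Two half-year goals make up the goal for the whole year.
year_goal = 2 * half_year_goal

# Shortfall against the gross (the comparison made in the post) and,
# as an illustrative assumption, against the net as well.
gross_shortfall = year_goal - year_gross
net_shortfall = year_goal - year_net

print(f"Year goal:       ${year_goal:,.2f}")
print(f"Gross shortfall: ${gross_shortfall:,.2f}")
print(f"Net shortfall:   ${net_shortfall:,.2f}")
```

Decimal is used rather than float so that the currency amounts subtract exactly.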

The Crash:
And then I remembered. Early this year we had a server (fluorine) crash. We had backups (yay!), but they were borken (Boo! Hiss!). We lost over a day's worth of activity, including a number of subscriptions. I *was* able to manually reconstruct people's subscriptions (time) based on information displayed in a window I just happened to have open at the time. But that went into a table separate from the one used to generate these numbers. After 3 days' effort, I'd patched things up as well as I could. Thankfully the official numbers (on which income and taxes are calculated) are kept on a completely separate server, one to which I DO NOT have access. Whew! I'd concluded that we'd just have to sort things out at the end of the year. And that time has drawn nigh.

tl;dr:
We're probably all set for the year, but there is also the matter that (unknown to me) we had previously been running at a deficit for a couple of years. So anything additional you can contribute will go to replenish our funding base. (NCommander and Matt_ each put up $5,000.00 of their own money to get us started.)

 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by hubie on Friday December 31 2021, @11:50PM (2 children)

    by hubie (1068) on Friday December 31 2021, @11:50PM (#1209061) Journal

    I'm rather impressed with the submissions they come up with. Has there ever been a description of how they work?

  • (Score: 5, Informative) by janrinok on Saturday January 01 2022, @05:23AM (1 child)

    by janrinok (52) Subscriber Badge on Saturday January 01 2022, @05:23AM (#1209099) Journal

    I wrote Arthur to scratch my own itch. It is written in Python 3, as a project to help me learn the language.

    The quick view from 1000 miles high is that Arthur starts by searching through SN's collated RSS feed (IRC #rss-bot), which is available to anyone. It does this several times during the day. It attempts to parse every story it finds that hasn't been parsed previously. Stories that are successfully parsed (i.e. not behind a paywall or just a mass of javascript) are then analysed, extracting the story itself along with its title, publishing date, actual URL, citations, DOIs, etc. It also carries out a line and word count, which can be useful to an editor. Together these make up the metadata. The analysis can use either the libxml library (which is available in many languages) or BeautifulSoup [crummy.com], a package designed to facilitate the parsing of XML/HTML.
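
The metadata step described above might look roughly like the following. This is a sketch only, not Arthur's actual code: it swaps in Python's standard-library html.parser for libxml/BeautifulSoup so that it runs without extra packages, and the names StoryExtractor, story_metadata, parse_if_new and seen_urls are all hypothetical. It also assumes that a "line" means one paragraph.

```python
from html.parser import HTMLParser


class StoryExtractor(HTMLParser):
    """Collect the <title> text and the text of each <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            # Collapse runs of whitespace inside the paragraph.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self._buf.append(data)


def story_metadata(url, html):
    """Return a small metadata dict for one story page."""
    parser = StoryExtractor()
    parser.feed(html)
    parser.close()
    body = "\n".join(parser.paragraphs)
    return {
        "url": url,
        "title": parser.title.strip(),
        "line_count": len(parser.paragraphs),  # assumption: one line per paragraph
        "word_count": len(body.split()),
    }


# Hypothetical record of stories handled on earlier runs.
seen_urls = set()


def parse_if_new(url, html):
    """Skip stories that were parsed previously; otherwise extract metadata."""
    if url in seen_urls:
        return None
    seen_urls.add(url)
    return story_metadata(url, html)


sample = (
    "<html><head><title>Example Story</title></head><body>"
    "<p>First line of the story.</p><p>Second line.</p></body></html>"
)
meta = parse_if_new("https://example.com/story", sample)
again = parse_if_new("https://example.com/story", sample)
print(meta)
print(again)
```

A real run would also need to pull the publishing date, citations and DOIs, which depend on each publisher's markup and are left out here.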

    The parsing process results in an HTML file that has had all of the crud removed (no javascript, no CSS, no advertising, etc) - just the plain story. It also extracts internal links, ensures that they are corrected if they are abbreviated links, and does its best to exclude many of the links that publishers bury in their web pages that have nothing to do with the actual story. It is not perfect but it gives the editor a reasonably clean piece of HTML which is ready for editing. There is still a lot for the editor to do - but it is a better starting point than just a URL.
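
As an illustration of the clean-up described above, here is a minimal sketch that drops script and style content, strips attributes, and expands abbreviated (relative) links against the page URL. It again uses the standard-library html.parser as a stand-in; the Cleaner class, the clean_html function and the KEPT tag whitelist are assumptions, and spotting advertising or unrelated publisher links is not attempted.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

SKIPPED = {"script", "style"}  # tags whose content is dropped entirely
# Hypothetical whitelist of tags worth keeping in the cleaned story.
KEPT = {"p", "a", "h1", "h2", "blockquote", "em", "strong"}


class Cleaner(HTMLParser):
    """Rebuild a page keeping only whitelisted tags and absolute links."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self._skip_depth += 1
        elif self._skip_depth == 0 and tag in KEPT:
            if tag == "a":
                href = dict(attrs).get("href")
                if href:
                    # Expand relative or abbreviated links to full URLs.
                    full = urljoin(self.base_url, href)
                    self.out.append(f'<a href="{escape(full, quote=True)}">')
                    return
            # All other attributes (class, style, onclick, ...) are dropped.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif self._skip_depth == 0 and tag in KEPT:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.out.append(escape(data, quote=False))


def clean_html(base_url, raw_html):
    cleaner = Cleaner(base_url)
    cleaner.feed(raw_html)
    cleaner.close()
    return "".join(cleaner.out)


raw = (
    "<html><head><style>p {color: red}</style>"
    "<script>track();</script></head>"
    '<body><p>See the <a href="/paper.html" class="x">paper</a>.</p></body></html>'
)
cleaned = clean_html("https://example.com/news/story.html", raw)
print(cleaned)
```

Text inside tags that are not on the whitelist is kept and only the tags themselves are removed, so an editor would still need to trim leftover navigation or advertising text by hand.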

    The stories it collects are then stored on disk, sorted by publishing date and, optionally, by source too. On a typical weekday it parses around 200-400 stories. The 'problem' is that bots cannot tell a good story from a bad one and it still requires a person to select the stories that (s)he wishes to submit. If they are not selected by a real person then they go no further.
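
One way such an on-disk layout could be arranged is sketched below: one directory per publishing date, with an optional sub-directory per source. The directory scheme and the names story_path and save_story are assumptions for illustration, not a description of how Arthur actually stores its files.

```python
import re
import tempfile
from datetime import date
from pathlib import Path


def story_path(base_dir, published, title, source=None):
    """Build base/YYYY-MM-DD[/source]/slug.html for one story."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"
    folder = Path(base_dir) / published.isoformat()
    if source:
        folder = folder / source
    return folder / f"{slug}.html"


def save_story(base_dir, published, title, html, source=None):
    """Write the cleaned HTML to its dated folder and return the path."""
    path = story_path(base_dir, published, title, source)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    saved = save_story(
        tmp,
        date(2022, 1, 1),
        "Happy New Year!",
        "<p>Story text</p>",
        source="example.com",
    )
    print(saved.relative_to(tmp))
```

ISO-format date directories sort in date order when listed by name, which keeps the newest stories easy to find.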

    The submission process can use either the built-in SN API [soylentnews.org] software or a cut-and-paste to a normal submission page. An average of 50% or more of the stories that it finds each day are potentially acceptable and of interest to our site.

    There are over 30 copies of Arthur now out there. I'm sure that they have been modified and new ways of using the program's output have been devised. I have just started updating it to version 6.

    Upstart was written by chromas [soylentnews.org] and is an IRC bot which does essentially the same task but in a very different way. I know very little about the internals of the project other than that it is written in PHP.

    --
    I am not interested in knowing who people are or where they live. My interest starts and stops at our servers.
    • (Score: 2) by hubie on Saturday January 01 2022, @06:02PM

      by hubie (1068) on Saturday January 01 2022, @06:02PM (#1209165) Journal

      Thank you, that is very interesting. I've been fascinated with automated text analysis since I saw the first email spam classifiers from the late 90s, since they seemed to work very well. The software libraries to do that sort of thing are very common now and I was curious whether these bots did any kind of analysis like that. Playing around with that kind of thing has always been on my hobby to-do list, but it never has climbed up high enough in priority for me to take a stab at it.