As the SoylentNews site has gone live, I've seen several URLs posted for access to different "areas" of the site as well as to other supporting resources. I'm using this space to collect the SoylentNews links I've found, in no particular order. Some are for historical reference, others for current access/reference.
The following links may be somewhat dated or obsolete:
Alternative URLs listed here were found at the top of http://irc.sylnt.us/
If you are new to IRC, a good place to start is the www.irchelp.org web site!
More #Soylent IRC-related links. NOTE: issue "/msg NickServ help" to get started.
Lately I've been working on a little tool to allow remote access to some intranet applications I've built. It would be interesting to see what others here think about the concept.
The applications are normally only accessible on a LAN, with the usual NAT router to the internet.
The aim is to be able to access the applications from the internet without port forwarding in the router.
I've heard of things like BOSH (http://en.wikipedia.org/wiki/BOSH) but haven't found much in the way of specifics and I'm not sure if it does what I want.
The general idea I've been working on is to use a publicly accessible host as a relay between the client (connected to the internet) and the application server (connected to a LAN).
This is kinda how it works at the moment:
To allow remote access, a workstation on the LAN must have a browser open to a URL that uses iframe RPC to periodically poll the relay server. I've set this interval to 3 seconds, which seems OK for testing purposes (it would need to be reduced for production). Every 3 seconds the LAN server sends an HTTP request (using PHP's fsockopen/fwrite/fgets/fclose) and the relay server responds with a list of remote client requests. Most of these responses are empty unless a remote client has requested something.
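To make the polling step concrete, here's a minimal sketch of one poll from the LAN server's side, assuming a hypothetical relay host (relay.example.com) and poll script (poll.php); the real names obviously differ:

<?php
// One poll from the LAN server to the relay, using the same
// fsockopen/fwrite/fgets/fclose sequence mentioned above.
$fp = fsockopen('relay.example.com', 80, $errno, $errstr, 5);
if ($fp === false) {
    die("Could not reach relay: $errstr ($errno)\n");
}
// Plain HTTP/1.0 keeps the response un-chunked and easy to read.
fwrite($fp, "GET /poll.php HTTP/1.0\r\nHost: relay.example.com\r\n\r\n");
$response = '';
while (!feof($fp)) {
    $response .= fgets($fp, 1024);
}
fclose($fp);
// $response now holds the headers plus the (usually empty) list of
// pending remote client requests.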
From the remote client perspective, if a user opens their browser to a URL on the relay server, they would normally be presented with some kind of authentication process (I've neglected that for testing purposes) and then they would be able to click a link to access an application that would normally be restricted to the LAN. When they click that link, the relay server creates an empty request file. To respond to the LAN server with a list of requests, the relay server reads the filenames from a directory and constructs the request list based on files with a certain filename convention (for testing I'm just using "request__0.0.0.0_blah", where 0.0.0.0 is the IP address of the remote client and blah is the raw-url-encoded request, i.e. special chars replaced with % codes).
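As a rough illustration of that filename convention, the relay could rebuild the pending request list like this (the directory path and array shape here are mine, not from the actual code):

<?php
// Scan the request directory for files matching the
// "request__<ip>_<rawurlencoded request>" convention.
$pending = array();
foreach (glob('/var/www/relay/requests/request__*') ?: array() as $file) {
    $name = substr(basename($file), strlen('request__'));
    // Split into the client IP and the encoded request (limit 2 so
    // underscores inside the encoded part survive).
    list($ip, $encoded) = explode('_', $name, 2);
    $pending[] = array(
        'ip'      => $ip,
        'request' => rawurldecode($encoded),
        'file'    => $file,
    );
}
// $pending is what gets handed back to the polling LAN server.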
So one job of the relay server is to maintain the list of remote client request files (including deleting them when the requests have been fulfilled). It would probably be best to use a simple MySQL table for this, but for testing I've just used a plain text file in a location that can be written to by Apache.
After saving the request, the relay server script instance initiated by the remote client doesn't die, but loops until the request file is no longer empty. So while the following is going on, this instance is just looping (although it has a timeout of 5 seconds).
After a remote client requests an application from the relay server, and the LAN client fetches the list of remote client requests from the relay server (asynchronously, hence the need for a file or database), the LAN server (through the LAN client iframe and a bit of JS) constructs an HTTP request and sends it to the application server (for testing purposes the RPC stub sends the request to its own server, where it is processed by the application through a dispatch handler). The application response is returned by the fgets call and is processed to modify hyperlinks, img sources etc. to suit the relay server instead of the LAN server (I'm still working on this bit), and the LAN server then posts another request to the relay server with the application page content.
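For the link/img rewriting step (the bit still being worked on), the simplest approach I can think of is a regex pass over the page before it's posted back; the hostnames and the fetch.php name here are placeholders, and this only catches double-quoted absolute URLs:

<?php
// Point href/src attributes at the relay instead of the LAN host, so
// the remote client's browser fetches follow-up resources through the
// relay rather than trying to reach the LAN directly.
function rewrite_targets($html) {
    return preg_replace(
        '#(href|src)="http://lan-server\.local/([^"]*)"#i',
        '$1="http://relay.example.com/fetch.php?path=$2"',
        $html
    );
}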
The relay server then takes the page content and saves it to a text file.
The relay server script instance mentioned earlier, which is busy looping away, checks for the existence of this page content in the request file. I tried doing this check with a call to PHP's filesize function, but it didn't seem to work (possibly because PHP caches stat results, or because the writing and size-checking processes are asynchronous; I don't know). I found that reading the file with file_get_contents and checking whether the content length is greater than zero seemed to work (though not very efficiently, I'll admit).
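Here's roughly what that content-checking loop looks like, with the 5-second timeout; the filename is illustrative:

<?php
// Busy-wait until the LAN server writes the page content into the
// request file, or give up after ~5 seconds.
$requestFile = '/var/www/relay/requests/request__192.0.2.1_somepage';
$deadline = time() + 5;
$content = '';
while (time() < $deadline) {
    // file_get_contents sidesteps the stat cache that can make
    // filesize() return stale results (see clearstatcache()).
    $content = (string) @file_get_contents($requestFile);
    if (strlen($content) > 0) {
        break;
    }
    usleep(200000); // 0.2s between checks, rather than spinning flat out
}
if (strlen($content) > 0) {
    echo $content; // hand the application page back to the remote client
}
unlink($requestFile); // fulfilled or timed out, either way it's done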
So if the LAN server HTTP request to the relay server containing the application page content gets written to the remote client request file on the relay server, the remote client process on the relay server will read it and output it to the remote client.
If the application page content is output, or the content checking loop times out, the request file is deleted.
Except for link/img targets everything works in testing; I can request a page and it renders on the remote client browser as it would on the LAN (minus images).
Does anyone have any thoughts on this?
The code is fairly simple and short; there's a single routine on the relay server with about 150-odd lines of very sparse code, and a single routine on the LAN server with about 100 lines (it will grow a bit when I get the link/img replacement and GET/POST param forwarding working, but not much). The application that generates the page content being relayed is thousands of lines of code, but I've kept the remote stuff separate.
I'm pretty sure there are dedicated appliances that do this kind of stuff, but does anyone have any experience with them?
There are no doubt other ways to skin this cat, but I'm interested in security, simplicity and of course cost. The aspects I liked about this approach were that I didn't have to punch a hole in the router, and that the process is controllable and monitorable from the client within the LAN (every poll outputs a request status summary).
Would be interesting to find out if you think the idea is good or shit, or if there are aspects that could be improved (no doubt there are plenty). Feel free to comment or not.
Thanks to all those who made SoylentNews a reality!
Edit: the setup in this case is a little different from the usual DMZ/port-forwarding case in that there aren't any ports exposed on the LAN router; I get through because the relay server only ever responds to outbound requests originating from the LAN server. There are never any outbound requests originating from the relay server directly.
So lovely to be back!!! Yes, that's how it feels, isn't it?
Had two UIDs on that other site, one relatively ancient forgotten one and one mostly unused as I joined the AC horde :)
But now... now home has been rebuilt! Awesome. Way back then I don't think I truly appreciated what was available (and I'm probably not the only one this applies to), but now that we have lost it, we have gained more, so this time I'll try to make better use of it.
Not that I'll be prolific or anything like that but I'll scamper about once in a while *crams stuff into 255 char bio*.
FAO any non-UK users who happen to notice this journal entry: please could you comment with your location and whether you can access this radio interview?
http://www.bbc.co.uk/programmes/b03vdx7m
I'm just considering submitting a story about it.
Even better, if anyone has more time than me and is willing to submit the story themselves, just comment below and go for it!
The news article based on the interview is http://www.bbc.co.uk/news/science-environment-26014584
So, given my last journal, here's a writeup on how they work today. For the most part, my original story on this topic still holds, but a fair bit has changed between then and now, and I didn't go much into the thought process behind how it was devised.
In contrast to the original system, the current one wants to keep a specific number of moderation points in circulation at all times, with the concept that mod points are a constantly moving and fluid item. Moderation simply doesn't work if there aren't enough of the damn things, and having too many was never a problem at all (Overrated exists for a reason).
The original idea was that we should dynamically generate our pool of modpoints based on our activity levels, so the original implementation of this script took the comment count for the last 24 hours, with the basic notion that every comment should have the potential to be moderated at least once. This number was multiplied by two, and provided our baseline moderation count. Since we were basing our mod point count on a 24-hour window, mod points were set to expire every 24 hours instead of every 72. At this point, I hadn't realized the fundamental problem with the slashcode moderation system; my thoughts were "need lots of mod points" and "this is incredibly complex, I can do better". That realization came as I was stripping the old one out of slash.
As part of this, I also changed the eligibility requirements for moderation. Instead of gating on a specific number of tokens, I wanted only users who were active to get mod points. Drive-by moderation by lurkers wasn't something worth retaining, and I suspect it makes up the bulk of Slashdot moderations.
I also wanted to avoid the problem of "moderator burnout": users getting mod points too frequently and just being turned off from moderation. I know that happened to me on Slashdot, and to others as well who ignored modpoints (or chose to become ineligible). As such, I wanted a cooldown on how frequently someone can get modpoints.
That being said, I didn't want everyone and their mother being moderators all at once, so I decided that 30% of all active users (defined, at the time, as anyone active within the last 24 hours) with neutral or better karma would be eligible for modpoints.
Version 1 was fairly simple. It took the comment count for the last 24 hours and multiplied it by 2; this is the minimum number of modpoints that exist at all times. Take all users who were active in the activity_period, take mod_points_to_issue/(eligible_moderators*.3), and hand out those points equally. As a failsafe, the system hands out ten mod points minimum (the underlying thought being that I don't want people to get just one or two modpoints; more is better, so let's take Slashdot's 5 and multiply it by 2).
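In sketch form it looks like this (a PHP sketch with illustrative names; pick_random_users and grant_mod_points are hypothetical helpers, not the real code):

<?php
// Version 1: pool = 24h comment count * 2, handed out evenly to the
// 30% slice of eligible users, with a floor of 10 points each.
// $comment_count_24h and $eligible_moderators would come from the DB.
$mod_points_to_issue = $comment_count_24h * 2;
$moderator_slots = (int) floor($eligible_moderators * 0.3);
$per_user = max(10, (int) ceil($mod_points_to_issue / max(1, $moderator_slots)));
foreach (pick_random_users($moderator_slots) as $uid) {
    grant_mod_points($uid, $per_user); // points expire after 24 hours
}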
And for the most part it worked. When we were in closed alpha on Thursday, we opened the test site to 100 users to try and test it under something resembling real-world conditions, and it held up, because everyone was very highly active. You might already see the mistake in that logic when applied to a production site.
Come go-live, user counts surged through the roof, active users were flowing in (I can't believe we hit 1k users in a single day), and the moderation script started handing out modpoints in the thousands. At one point, there were close to 2000 modpoints in circulation at any given time.
For that moment, moderation was working well. Then users started going offline at the end of their day, or worse, were getting modpoints after they signed off and not seeing them until they signed back in. The script was happy, 30% of users were moderators, but there were a lot of +1s. When I looked at the database, most people who had modpoints hadn't been signed in for hours.
Suddenly, in a flash of inspiration, I saw the mistake. Slashdot could get away with handing out modpoints with no activity requirement because, even with 80% of their userbase eligible to moderate, most people would be inactive at any given time. With our 30%, there simply weren't enough modpoints in the hands of active users.
So, in an attempt to salvage the situation, I made a critical adjustment to how the damn thing works. The activity period for users was separated into a new variable and dropped to 1 hour (then five minutes, so any logged-in user has a chance), and process_moderation had its crontab shortened to five minutes (it used to run hourly).
To keep modpoints constantly in circulation, expiration time was dropped to four hours, so only people who are active RIGHT NOW are moderators, especially since our editor team had posted 20 articles that day already. Whenever a user loses their points (via expiration or using them all), their slot is freed up, and a new user immediately gets modpoints.
That change in logic underpins version 2 of this script. Now the minimum count is what we hand out, except in the very rare case that we need more modpoints in circulation, in which case the active users start getting more and more (up to a cap of 50 points each, after which it spills past the 30% of users). For the most part it seems to be working: comment moderation scores are generally going up, though it may still require further tweaking. I'm not seeing as many +3s to +5s as I'd like, but right now it's a whole hell of a lot better than it used to be.
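One way to read that allocation logic as code (again purely a sketch of the description above, not the real script; $active_users and $points_needed stand in for the DB-derived inputs):

<?php
// Version 2: start from the 10-point minimum grant; if that doesn't
// keep enough points in circulation, give active moderators more
// (capped at 50 each) before spilling past the 30% slice of users.
$slots = (int) floor($active_users * 0.3);
$per_user = 10;
while ($per_user * $slots < $points_needed && $per_user < 50) {
    $per_user++;
}
if ($per_user * $slots < $points_needed) {
    $slots = (int) ceil($points_needed / $per_user); // spill past 30%
}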
I'm open to any thoughts, criticisms, or whacky ideas relating to how mod points are being dished out. Let me hear them below.
So for the curious (or the morbid), I thought I'd do a bit of a writeup on how modpoints worked in stock slashcode. To my knowledge, this is how they work on slashdot.org today, and on all other Slash sites. That being said, caveat emptor: I'm not QUITE sure I understood the code correctly, and I'm writing this from memory, but if enough people want it, I'll fish the old code out of git and paste it here.
In stock slashcode, every user has something called a "token" count, which represents their chances of getting modpoints. The best way to think of tokens is as chances at winning a raffle. Keep this in mind, as it will become relevant in short order. Tokens are (theoretically) generated from various clicks in the site UI, and are granted off some serious voodoo involving magic numbers and other such insanity.
My best understanding is that tokens are only issued after a specific random number of clicks is hit, and are later pulled out of the access log by the process_moderation slashd script. But more on that later. The logic that does this is fairly uncommented Perl spread across several modules, so it's rather hard to keep track of.
Tokens convert to modpoints at a strict ratio (if I remember correctly, 8 tokens become one mod point, so you need at least 40 tokens to be eligible to receive modpoints; stock slash only hands out modpoints in increments of five).
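In other words (a PHP sketch for illustration; the real slashcode is Perl):

<?php
// 8 tokens per point, points only issued in blocks of five, so the
// effective floor is 40 tokens.
function tokens_to_points($tokens) {
    $blocks = intdiv($tokens, 8 * 5); // whole 5-point blocks, 40 tokens each
    return $blocks * 5;
}
// tokens_to_points(39) == 0, tokens_to_points(40) == 5, tokens_to_points(80) == 10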
Having tokens is not, however, enough to make you eligible for mod points; it only represents your chances of getting them. When process_moderation kicked off, it would essentially go through and dump the entire user table for users that had tokens, were willing to moderate, were not banned from moderation, and were within the 80th percentile of oldest accounts. This is where metamoderation comes into play. (Note: this was true when metamod existed; the firehose replaced it, and I have no idea how the logic, if at all, has been changed to handle that.)
For users that had been metamodded, those metamods acted as a weight, either increasing or decreasing their chances of getting modpoints. Moderations judged good earned you additional chances in the raffle, and bad ones decreased them. It also appears that your individual metamods were (somehow) taken into account, but I haven't quite pierced the logic of that. As the metamod module is broken, I never looked at how it works in practice.
Now, none of this promises you'll actually GET modpoints. As I said, it's a raffle, and it's a raffle that includes accounts that have been inactive but still have tokens. At this point, random users are chosen to get modpoints, which converts their tokens to modpoints. If you get picked more than once, you get another increment of 5.
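Put together, the raffle looks something like this (purely illustrative PHP; the actual Perl is far hairier, and the metamod weighting is the part I'm least sure of):

<?php
// Each token is a raffle ticket; metamod results scale the ticket
// count up or down. $users is a stand-in for the dumped user table.
$tickets = array();
foreach ($users as $uid => $u) {
    $tickets[$uid] = max(0, (int) round($u['tokens'] * $u['metamod_weight']));
}
// Build the drum and draw a winner; drawing the same user twice
// means another increment of 5 points.
$drum = array();
foreach ($tickets as $uid => $count) {
    for ($i = 0; $i < $count; $i++) {
        $drum[] = $uid;
    }
}
if (!empty($drum)) {
    $winner = $drum[array_rand($drum)];
}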
So far, so good? Right now, you might be asking what the problem with that is. Aside from being perhaps a bit long-winded, there seems to be nothing wrong. But the problem isn't that the algorithm is implemented incorrectly; it's that the design is fundamentally broken.
If you want a hint, I recommend checking out http://slashdot.jp or http://barrapunto.com/ (the only other Slash sites still on the net that I know of) and looking for +5 comments. Take your time, I'll wait.
The problem comes from what ISN'T in the algorithm: it takes no account of how many modpoints MUST be in circulation. I had the advantage of being a frequent poster on macslash.org while it was still around. In the years I was active on that site, I could count the number of +5 comments I saw on one hand. +4s were almost as rare.
For a comment from a normal user to get to +5, it needs four separate people with mod points to vote it up (comments from logged-in users start at score 1, so +5 means four upmods), and it needs that many people who 1. have modpoints, 2. want to use them, and 3. want to use them on THAT comment.
That's a lot of freaking ifs. While this site was still in closed testing, the stock modpoint algorithm ran from Monday to Friday, until I ripped it out and replaced it with my version. In that entire time, it issued a grand whopping total of 10 modpoints (5 to the dummy account I use for account testing; I don't remember where the other 5 went). At that point, we were getting about 20 comments per article.
In short, the stock modpoint method is not just broken, it is fundamentally broken, and it only works on Slashdot because their userbase is large enough that it works out of dumb luck. Even then, I question that, as a lot of good comments never seem to get to +2 or +3, let alone the higher tiers. This is what prompted the rewrite, which I'll document in my next journal.