DISTRIBUTED WEB HOSTING CONCEPT

Posted by crutchy on Wednesday March 12 2014, @01:09PM (#180)
9 Comments
Digital Liberty

An idea that came to mind during a discussion about Soylent hosting on IRC today.
Thanks to Titanium, prospectacle, stderr, useless, swiss, FoobarBazbot and MrBluze for a lively discussion :-)

It started with:
[19:43] * crutchy wonders if a distributed service could be developed... divide the load, build in redundancy and if anyone's host goes down others will pick up the slack
Time is AEDT

The idea has been developed a little bit further since the IRC discussion.

General System
==============
- independent of DNS
- with a distributed model anyone can volunteer to host a node (no single person relied on to front hosting costs)
- the system consists of a network of apache host nodes set up by volunteers willing to cover their own hosting costs; users connect to the web service with their web browser, with the remote host selected by a launcher program

Host Node
=========
- not required in order to access the web service (only for those who choose to offer hosting)
- apache web server configured for web service (mysql, mod_perl, etc as required)
- must periodically execute a script (using crontab?) that requests nodelists from the nodes in its local nodelist and updates the local list as required (adds/removes entries based on some kind of agreement algorithm; see the sketch after this list)
- must respond to nodelist requests from launchers, but this can be an isolated php/pl/etc script and need not be built into the hosted service
- must contain scripts to synchronize data and site source code updates securely with other nodes as required (this will be the tricky bit)
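
As a rough illustration of the refresh script, here's a minimal bash sketch (the paths, URLs and the majority-vote merge rule are all assumptions, not part of any agreed design):

#!/bin/bash
# Sketch of a host node's periodic nodelist refresh.
# All paths and URLs below are assumptions; nodes are assumed to serve
# their list as plain text (one IP address per line) at /nodelist.txt.

nodelist="/var/lib/webservice/nodelist"
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT

count=0
while read -r node; do
    if curl -fsS --max-time 5 "http://$node/nodelist.txt" >> "$tmp"; then
        count=$((count + 1))
    fi
done < "$nodelist"

if [[ $count -gt 0 ]]; then
    # crude stand-in for the agreement algorithm: keep any address
    # reported by more than half of the nodes that responded
    sort "$tmp" | uniq -c | awk -v n="$count" '$1 * 2 > n { print $2 }' > "$nodelist"
fi

This could be run from crontab, e.g. every ten minutes: */10 * * * * /usr/local/bin/refresh-nodelist.sh (path assumed).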

Launcher
========
- user executes launcher to access web service
- no gui
- executable or source downloaded from trusted location (such as debian repository or github) along with nodelist containing one or more known host IP addresses
- its purpose is to select a remote host node and open a web browser pointed at the selected remote host's IP address
- before opening the browser, a nodelist request is sent to every host in the local nodelist, and the local nodelist is updated in the same way as a host node's nodelist (see above)
- possibly a simple settings file if required
- could be a bash script for Linux and a small Delphi or C program for Windows (see the sketch below)
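
A minimal Linux launcher along those lines might look like this (a sketch only; the nodelist location, the /nodelist.txt URL and the pick-a-random-node policy are assumptions):

#!/bin/bash
# Launcher sketch: refresh the local nodelist, pick a node, open the browser.

nodelist="$HOME/.webservice/nodelist"

# refresh the local list the same way a host node does (see above)
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
while read -r node; do
    curl -fsS --max-time 5 "http://$node/nodelist.txt" >> "$tmp"
done < "$nodelist"
[[ -s "$tmp" ]] && sort -u "$tmp" > "$nodelist"

# pick a random node and point the default browser at it
host=$(shuf -n 1 "$nodelist")
xdg-open "http://$host/"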

On IRC there was concern expressed about security and verification of host nodes.
Since you're using a browser as the client (with all the security features that come with it) and you're only receiving normal http responses (otherwise your browser would throw an error), there's only so much bad stuff a host node can do.
Worst case scenario might be that it redirects to goat.cx or some site with drive-by downloads (which most browsers will block anyway).
If required a trusted network of host nodes could be formed using signed certificates (perhaps using OpenSSL).
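
For example (key names and file layout assumed), an operator could sign the nodelist with a private key, and launchers could verify it against a published public key before trusting it:

# sign the nodelist (private key name assumed)
openssl dgst -sha256 -sign node-key.pem -out nodelist.sig nodelist.txt

# a launcher verifies it against the published public key
openssl dgst -sha256 -verify node-pub.pem -signature nodelist.sig nodelist.txt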

Nodelists may not be that big, since there isn't likely to be a huge number of hosts for the same service (such as SoylentNews), but if need be the list could be gzipped. As mentioned earlier, the tricky bit will be synchronizing website data and service application source code, but I don't think it's an insurmountable challenge.

edit: data could be distributed, but would need to be synchronized on all host nodes (the tricky bit mentioned above)

edit: thinking about data synchronizing... it would either require modifying the service application to execute a script whenever data changes (with the script doing the work of sending the data to other hosts), or a shell script with a loop that checks data file timestamps and sends the changed files to other hosts when a change is detected. for max efficiency it would be ideal to just forward a single mysql insert/update query whenever data changes, but that would require integration into the main application (slashcode in Soylent's case); you don't want to be sending entire database files around the place whenever there is a change. a good place to start might be to host the data on one or two high-performance 'supernodes' until an improved synch system can be developed.
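
edit: a sketch of the timestamp-checking variant, for the curious (paths, the sync user and the choice of rsync are all assumptions):

#!/bin/bash
# Sketch of the timestamp-polling synchronizer described above.

datadir="/srv/webservice/data"
stamp="/var/run/webservice-sync.stamp"
nodelist="/var/lib/webservice/nodelist"

touch "$stamp"
while true; do
    # list data files modified since the last pass
    changed=$(find "$datadir" -type f -newer "$stamp")
    touch "$stamp"
    if [[ -n "$changed" ]]; then
        while read -r host; do
            # push only the changed files to each other node
            echo "$changed" | rsync -az --files-from=- / "sync@$host:/"
        done < "$nodelist"
    fi
    sleep 10
done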

What makes a good story?

Posted by GungnirSniper on Tuesday March 11 2014, @11:24PM (#178)
6 Comments
Soylent
Our community of Slashdot 'audience' exiles is thriving, but we still need more quality submissions.

There are some general things I try to do when submitting that may be helpful to others:
  • Be neutral and factual in both Subject and Summary. Save your opinion for the comments once the article is posted, or if you must, include it at the end of the Summary.
  • Provide OC - original content. Don't just copy/paste other people's work.
  • Avoid paywalled articles if possible. This is also true for sites that show an advertisement before loading the article.
  • Use primary sources if possible. If a statement is made to NBC News, link to NBC News, not another site that is quoting NBC News.
  • Use at least two source links if possible. This gives readers options and helps insulate against other sites' outages or page removals.
  • For controversial issues, use source links from sites with opposing biases. If a wolf says he wants sheep for dinner, don't ask another wolf if that's a good idea, ask a sheep.
  • If there is a study or deeper link listed in one of the articles you are linking to, you should also include that direct link. Some news sites link to study abstracts, and those are primary source material.
  • If the new articles you are linking to reference old articles, you may link those as well to provide background or quotes.
  • Explain acronyms for most things. The first time you name something, spell it out; on subsequent uses, use the acronym. With our goal of being a global site, things like the US FAA or the British Ofcom may not be obvious to those outside those countries.
  • Wikipedia links are a good source of background info and statistics.
  • Check that your links are timely. Nothing is worse than warning about something Snopes disproved 5 years ago.
  • When quoting a sentence or less, use quotes. When using more than a sentence, use blockquotes, as this makes the text stand out more.
  • Double-check your story in preview prior to submission, including opening all links. The less an editor has to edit, the more likely your submission is to be approved.

There are also some things to avoid:

  • Don't grumble about rejection of your submission. As the site grows, more people will submit the same story. I've also heard there is a 'reason for rejection' system in the works.
  • Avoid unauthorized copying. Don't copy/pasta from Slashdot; that's setting up your editor for failure. Or a mocking on IRC.
  • Avoid links to non-English sources unless you provide a Google Translate link along with the direct, native link.

Someone bought http://www.li694-22.org/ :D

Posted by Yog-Yogguth on Tuesday March 11 2014, @12:36PM (#174)
10 Comments
Soylent

As many might know, SoylentNews resides on http://li694-22.members.linode.com/¹ and because of this some people were talking and joking about using li694-22 as a new name. It's a cool name; I was tempted myself! Perhaps an even "weirder" inside joke than http//:/..org :)

No need to be tempted any more; a Mr. Watt (not me!) of Washington bought it and pointed it at SoylentNews¹ :)

¹ naturally your cookies are in different jars

Edit: just to practice safe surfing, don't log in through the redirection or move your cookies manually or anything like that. Not that I think anything bad would happen in this case, but one never knows until it's too late (maybe Mr. Watt suddenly develops an appetite for collecting low UID accounts).

Thoughts on flight MH370 (Boeing 777)

Posted by Yog-Yogguth on Sunday March 09 2014, @04:51AM (#162)
0 Comments
/dev/random

Since I've moderated and can't be bothered to log out, I'll write some thoughts here for my own interest. By no means is this meant to be any kind of complete answer or anything of the sort; just some idle thoughts/speculation.

0.a. It is an entirely unknown failure mode that is sudden and immediately cripples everything. Very unlikely.
0.b. It is an entirely unknown phenomenon that is sudden and immediately cripples everything. Extremely unlikely but not zero.
0.c. A confluence of simultaneous and lasting shoddy operation and systems malfunction in two culturally different countries (Malaysia and Viet Nam). This one is hard to judge; I wouldn't think so on behalf of Viet Nam but they hadn't yet taken airspace control/responsibility for the plane and might not have paid much if any attention to it. Malaysia is fully able to fuck anything up beyond rational belief (*cough* bigoted apartheid-style legislation on the use of a word *cough*) but even so Viet Nam should still have the radar records and be fully able to find anything if there in fact was a more normal disaster.

I guess the simplest ad hoc explanation would be 0.b. with some kind of unusual simultaneous failure of radar range for whatever reason: the signal would then simply disappear, giving no clues about anything. If this was caused by some freak meteorological event local to the aircraft, it might explain the total lack of everything except debris, which might be found later. It might not have to last all that long if the electronics in the plane are knocked out before any remaining related blips on now-functioning radars disappear among the noise. Still extremely unlikely. Inverted clear-sky sprite plasma bolts (no such thing is known to exist) or time-space warp bubbles (sorry, no link to the paper handy, and no such thing is known to exist) or aliens!!1 (etc.) or whatever, but who knows.

1. Whether or not some terrorist organization claims responsibility doesn't mean much. Some YKW (You Know Who) organizations claim just about everything, or are created solely to claim credit for anything new (as happened with the attacks in Oslo before those claims were discredited), and all it takes for the opposite to happen are a few things:
1.a.1. Whoever did it has discovered and understood the meaning of tactics, and the incident while public in nature is also long term in nature (there are several possibilities here, I'm not comfortable with spelling it out). Somewhat likely.
1.a.2. Whoever is responsible simply (and without any deeper thought) doesn't want to draw attention to something that is still ongoing. Fairly likely.

and

1.b.1. For whatever reason(s) the incident fails to trigger knee-jerk claims. Doing something to a flight from a YKW nation to China should naturally avoid most if not all such attention because China is kind of outside the horizon of most YKW despite the recent YKW attacks both in Beijing and western China. Not too unlikely.
1.b.2. Someone figured it was stupid and counterproductive to make bullshit claims and has the clout to stop those who still don't get it. Very unlikely but not impossible.

For a 1.a.2. that passes 1.b.1 it seems very likely that some YKW "Chinese" did this to simply kill as many Chinese as possible. Such YKW "Chinese" aren't known to be big on making public statements of responsibility, in fact they seldom say anything at all (probably it makes them very easy to catch and kill) so that fits.

Oil slicks don't mean much on their own but are often the first thing spotted. If nothing else is spotted (lots of debris floats for a fairly long time), then the likelihood of 0.b. increases.

Sometimes there isn't an answer.

irc logging bot

Posted by crutchy on Wednesday March 05 2014, @11:09AM (#132)
1 Comment
Code

had a go at scripting a little quick & dirty irc bot for soylent

requires sic (http://tools.suckless.org/sic)
if you're using debian: sudo apt-get install sic

#!/bin/bash
# quick & dirty irc logging bot built around sic

chan="#test"       # channel to join and log
log="test.log"     # log file
pipe="log-pipe"    # named pipe used to feed commands to sic

# clean up the pipe on exit
trap "rm -f $pipe" EXIT

# start with a fresh log
if [[ -f $log ]]; then
    rm "$log"
fi

if [[ ! -p $pipe ]]; then
    mkfifo $pipe
fi

substr="End of /MOTD command"   # marker that the server is ready
joined=""

sic -h "irc.sylnt.us" -n "log-bot" <> $pipe | while read -r line; do
    if [[ -n "$line" ]]; then
        # quote $line so whitespace and glob characters survive intact
        echo "$line" >> "$log"
    fi
    # once the end of the MOTD shows up, join the channel (only once)
    if [[ -z "$joined" ]] && [[ -z "${line##*$substr*}" ]]; then
        joined="1"
        echo ":j $chan" > $pipe
    fi
done

exit 0

also posted on the wiki @ http://wiki.soylentnews.org/wiki/index.php/User:Crutchy#IRC_logging_bot

slashdev

Posted by crutchy on Sunday March 02 2014, @12:00PM (#114)
3 Comments
Code

After a minor problem with virtualbox (f*ck you nvidia) I got the slashdev virtual machine going. If you're running a 32-bit host OS (as I do), you can probably still run the 64-bit slashdev VM. You just need to make sure your CPU supports it (Intel VT-x or AMD-V) and that it's enabled in your BIOS (usually disabled by default). GIYF.
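
If you want to check, a quick way on a Linux host is to count the CPU virtualization flags (a non-zero result means the CPU advertises VT-x or AMD-V):

grep -E -c '(vmx|svm)' /proc/cpuinfo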

When you're importing the vm, make sure you don't hit the checkbox that reassigns mac addresses on network interfaces, cos otherwise eth0 won't show up in ifconfig and you won't have internet access.
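
If you get bitten by this anyway, one fix that I believe works on Debian/Ubuntu guests of this vintage (assuming the guest uses the old persistent-net udev rules) is to delete the cached MAC-to-interface mapping and reboot:

# the guest caches MAC-to-eth0 mappings here; remove the cache and reboot
sudo rm /etc/udev/rules.d/70-persistent-net.rules
sudo reboot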

After a quick flick through the bash history I realised that sudo works with the "slash" user.

sudo apt-get update
sudo apt-get upgrade

sudo apt-get install gnome

*hides* (cli is awesome, but on its own is claustrophobic for me)

log in under the gnome classic session (the default ubuntu session fails to log in, not that i mind)

Epiphany works as a web browser, but I prefer firefox/iceweasel:

sudo apt-get install iceweasel

You can also use synaptic with the same password as the slash user.

To start apache (compiled per slashcode install instructions, not from repositories), open a terminal:

./apache/bin/apachectl start

The full command (just for the curious) is:

/srv/slashdev/apache/bin/apachectl start

Start the slashd (slash daemon) - gleaned from bash history:

sudo /etc/init.d/slash start

Close the slashd terminal window (it will continue to run in the background).

Open Firefox:
http://localhost:1337/

Apache public directory:
/srv/slashdev/slash/themes/slashcode/htdocs/
It contains mostly links to files in the /srv/slashdev/slash/ directory.

It was nice of NCommander to make the slash user's home directory /srv/slashdev... thanks for that

Tried to register a new user but it doesn't seem to work. Looks like the MTA isn't configured. I normally use exim4 on my debian boxen (installing it removes postfix):

sudo apt-get install exim4
sudo dpkg-reconfigure exim4-config

The configuration is mostly self-explanatory (select the defaults for everything except the mail type; make sure to pick the option "internet site; mail is sent and received directly using SMTP"). Tested password retrieval with exim4 ok. As per usual, check your junk folder in hotmail etc.

Sagasu is an awesome search tool:

sudo apt-get install sagasu

After install, you'll find it under Applications -> Accessories.
Change your file pattern to *.pl or whatever (you can just use * if you want), select "/srv/slashdev/slash" as your search directory, uncheck match case, enter a search string such as "sub displayComments" and click Search.
Couldn't find sub createEnvironment though (it's called at the bottom of a lot of perl files). Anyone got any ideas?
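
For what it's worth, plain grep from a terminal is another way to hunt for it (searching both the install tree and the git checkout used by the deploy script below):

grep -rn 'sub createEnvironment' /srv/slashdev/slash /srv/slashdev/slashcode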

Also recommend installing mysql-workbench.

If anyone finds anything wrong with any of this stuff please let me know.

edit: the other reason why i prefer to install gnome is cos gedit is a great little development tool.

edit: thanks heaps to paulej72 for the git advice. here's the script provided by paulej (i just added the git pull, as also mentioned by paulej):

#!/bin/sh

# pull the latest slashcode and install it over the dev site
cd /srv/slashdev/slashcode
git pull
make USER=slash GROUP=slash SLASH_PREFIX=/srv/slashdev/slash install

# remove the generated .css files from the site's htdocs
rm -rf /srv/slashdev/slash/site/slashdev/htdocs/*.css

# refresh symlinks and reload the templates into the database
/srv/slashdev/slash/bin/symlink-tool -U
/srv/slashdev/slash/bin/template-tool -U

/srv/slashdev/apache/bin/apachectl restart

Note: This produced a couple of errors for me. Don't run this under sudo cos the script has a hissy fit (I had to do a "sudo chown slash:slash -R ./slashcode" to recover).
Also, I use this command to execute the script:

bash ./Desktop/deployslash.sh > ./Desktop/deployslash.log

more so that I can have a squiz at what happened if it goes pear shaped.

9-mar-14
paulej72: If you hand install to /srv/slashdev/slash/themes/slashcode/templates/dispComment;misc;default you need to run /srv/slashdev/slash/bin/template-tool -U to update the templates in the database. You should also restart apache when touching the templates.

perl code doc project

Posted by crutchy on Sunday February 23 2014, @12:44PM (#82)
0 Comments
Code

work in progress

a minor difficulty i'm having with wrapping my head around slashcode is figuring out where functions are declared. i can use a search tool like sagasu, but i've done something similar to this for php so i thought it would be a fun perl project.

objective: parse code files in a directory tree and output page with linked index of files and functions
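
as a starting point, the raw file/function data could probably be collected with a shell one-liner like this (the search directory is an assumption):

find /srv/slashdev/slash -type f \( -name '*.pl' -o -name '*.pm' \) \
    -exec grep -Hn '^sub ' {} + | sort

which prints every sub declaration as file:line:declaration, ready to be parsed into the index.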

doc.pl

#!/usr/bin/perl
print "Content-Type: text/html\n\n";
use strict;
use warnings;

##########################
sub doc__main {
    print "<!DOCTYPE HTML>\n";
    print "<html>\n";
    print "<head>\n";
    print "<title>Slashcode Doc</title>\n";
    print "<meta name=\"description\" content=\"\">\n";
    print "<meta name=\"keywords\" content=\"\">\n";
    print "<meta http-equiv=\"Content-Type\" content=\"text/html;charset=utf-8\">\n";
    print "</head>\n";
    print "<body>\n";
    print "<p>blah</p>\n";
    print "</body>\n";
    print "</html>\n";
}

##########################
sub doc__functionTree {
    my($structure, $allDeclaredFunctions, $allFunctions, $allFiles) = @_;
}

##########################
sub doc__recurse {
    my($structure, $allDeclaredFunctions, $allFunctions, $allFiles, $allTreeItems, $caption, $type, $level, $id) = @_;
}

##########################
sub doc__aboutFile {
    my($structure, $allFunctions, $allFiles, $fileName) = @_;
}

##########################
sub doc__aboutFunction {
    my($structure, $allFunctions, $allFiles, $functionName) = @_;
}

##########################
sub doc__linkFile {
    my($allFiles, $fileName) = @_;
}

##########################
sub doc__linkFunction {
    my($allFunctions, $functionName) = @_;
}

##########################
sub doc__allFiles {
    my($structure) = @_;
}

##########################
sub doc__allFunctions {
    my($structure) = @_;
}

##########################
sub doc__declaredFunctions {
    my($structure) = @_;
}

##########################
sub doc__loadStructure {
}

##########################
sub doc__parseFile {
    my($structure, $fileName) = @_;
}

##########################
doc__main();
1;

perl

Posted by crutchy on Saturday February 22 2014, @07:24AM (#72)
1 Comment
Code

I'm a perl noob. Hopefully if I do some journal writing on my experience it will help keep me motivated.

Got some sort of perl server configuration going. Google was not very helpful, since most guides are for mod_perl pre-2.0 and the apache foundation docs are gibberish to me (maybe I'm just stupid).

Anyway, here's a conf that I kinda butchered up based on a bunch of different sources:

<VirtualHost *:80>
    ServerName slash
    DocumentRoot /var/www/slash/
    Redirect 404 /favicon.ico
    <Directory />
        Order Deny,Allow
        Deny from all
        Options None
        AllowOverride None
    </Directory>
    <Directory /var/www/slash/>
        SetHandler perl-script
        PerlResponseHandler ModPerl::Registry
        PerlOptions +ParseHeaders
        Options +ExecCGI
        Order Allow,Deny
        Allow from all
    </Directory>
    LogLevel warn
    ErrorLog  /var/www/log/slash/error.log
    CustomLog /var/www/log/slash/access.log combined
</VirtualHost>

By the way, this is for Debian Squeeze.

My first hello world script was also a bit more of an adventure than expected. Most tutorials leave the header out of their examples.

/var/www/slash/test.pl

#!/usr/bin/perl
print "Content-Type: text/html\n\n";
use strict;
use warnings;
print "Hello world.\n";

I could (probably should) have used a text/plain mime header, but it worked nonetheless.
Also, I can apparently use the following to add a path to @INC:

use lib "/var/www/slash/Slash";

I downloaded the soylent/slashcode master branch from https://github.com/SoylentNews/slashcode/archive/master.zip so that I could have a squiz and see if I could be of any help with debugging etc, but although I can read some of it, I need to go to perl school before I can contribute.

My bread and butter programming languages are Delphi and PHP.

This explains a lot about the beginning of slashcode functions that aren't familiar to me:

http://stackoverflow.com/questions/17151441/perl-function-declaration
Perl does not have type signatures or formal parameters, unlike other languages like C:

// C code
int add(int, int);

int sum = add(1, 2);

int add(int x, int y) {
  return x + y;
}

Instead, the arguments are just passed as a flat list. Any type validation happens inside your code; you'll have to write this manually. You have to unpack the arglist into named variables yourself. And you don't usually predeclare your subroutines:
my $sum = add(1, 2);

sub add {
  my ($x, $y) = @_; # unpack arguments
  return $x + $y;
}

Is it possible to do pass by reference in Perl?
http://www.perlmonks.org/?node_id=6758

Subroutines:
http://perldoc.perl.org/perlsub.html

SNQ*: Squadron of Circus Chickens or Barrel of Rabid Geese?

Posted by Yog-Yogguth on Friday February 21 2014, @08:02PM (#66)
0 Comments
Answers

Betteridge be damned: which is better?

I think I'll keep the CC Squad as protection and use the BRG as a grenade! I also have a spare Barrel of Uber Robots but I'm not sure what their stats are.

* SNQ is short for "Soylent News Question"

http relay

Posted by crutchy on Thursday February 20 2014, @09:58AM (#58)
3 Comments
Code

Lately I've been building a little tool to allow remote access to some intranet applications I've been working on. It would be interesting to see what others here think about the concept.
The applications are normally only accessible on a LAN, with the usual NAT router to the internet.
The aim is to be able to access the applications from the internet without port forwarding in the router.
I've heard of things like BOSH (http://en.wikipedia.org/wiki/BOSH) but haven't found much in the way of specifics and I'm not sure if it does what I want.
The general idea I've been working on is to use a publicly accessible host as a relay between the client (connected to the internet) and the application server (connected to a LAN).
This is kinda how it works at the moment:
To allow remote access, a workstation on the LAN must have a browser open to a URL that uses iframe RPC to periodically poll the relay server. I've set this interval to 3 seconds, which seems OK for testing purposes (it would need to be reduced for production). Every 3 seconds the LAN server sends an HTTP request (using php's fsockopen/fwrite/fgets/fclose) and the relay server responds with a list of remote client requests. Most of these responses are empty unless a remote client has requested something.
From the remote client's perspective, if a user opens their browser to a URL on the relay server, they would normally be presented with some kind of authentication process (I've neglected that for testing purposes) and then be able to click a link to access an application that would normally be restricted to the LAN. When they click that link, the relay server creates an empty request file. To respond to the LAN server with a list of requests, the relay server reads the filenames from a directory and constructs the request list from files following a certain filename convention (for testing I'm just using "request__0.0.0.0_blah", where 0.0.0.0 is the IP address of the remote client and blah is the raw url-encoded request, with special chars replaced by % codes).
So one job of the relay server is to maintain a list of remote client request files (including deleting them when the requests have been fulfilled). It would probably be best to use a simple mysql table for this, but for testing I've just used a simple text file in a location that apache can write to.
After saving the request, the relay server script instance initiated by the remote client doesn't die, but loops until the request file is no longer empty. So while the following is going on, this instance is just looping (although it has a timeout of 5 secs).
After a remote client requests an application from the relay server, and the LAN client picks up the remote client requests from the relay server (asynchronously, hence the need for a file or database), the LAN server (through the LAN client iframe and a bit of js) constructs an HTTP request and sends it to the application server (for testing purposes the RPC stub sends the request to its own server, where it is processed by the application through a dispatch handler). The application response is returned by an fgets call and is processed to rewrite hyperlinks, img sources etc to suit the relay server instead of the LAN server (still working on this bit), and the LAN server then posts another request to the relay server with the application page content.
The relay server then takes the page content and saves it to a text file.
The relay server script instance mentioned earlier, which is busy looping away, checks for the presence of this page content in the request file. I tried doing this check with a call to php's filesize function, but it didn't seem to work (possibly php's stat cache, which clearstatcache() might have fixed, or something to do with the write and the check being asynchronous; I don't know). Reading the file using file_get_contents and checking whether the content length is greater than zero did seem to work (though not very efficiently, I'll admit).
So if the LAN server HTTP request to the relay server containing the application page content gets written to the remote client request file on the relay server, the remote client process on the relay server will read it and output it to the remote client.
If the application page content is output, or the content checking loop times out, the request file is deleted.
Except for link/img targets everything works in testing; I can request a page and it renders on the remote client browser as it would on the LAN (minus images).
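
To make the polling flow concrete, here's a rough bash/curl rendition of the LAN-side loop (the real thing is PHP driven by the iframe; the endpoint names here are made up):

#!/bin/bash
# Sketch of the LAN-side polling loop (endpoints are made up).

relay="http://relay.example.com"
appserver="http://192.168.1.10"

while true; do
    # ask the relay for pending remote client requests (one per line)
    requests=$(curl -fsS "$relay/poll.php")
    while read -r req; do
        [[ -z "$req" ]] && continue
        # replay the request against the LAN application server...
        page=$(curl -fsS "$appserver/$req")
        # ...and post the page content back to the relay for the remote client
        curl -fsS --data-urlencode "request=$req" \
             --data-urlencode "content=$page" "$relay/respond.php" > /dev/null
    done <<< "$requests"
    sleep 3
done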

Does anyone have any thoughts on this?
The code is fairly simple and short; there's a single routine on the relay server with 150-odd lines of very sparse code, and a single routine on the LAN server with about 100 lines (it will grow a bit when I get the link/img replacement and get/post param forwarding working, but not much). The application that generates the page content being relayed is thousands of lines of code, but I've kept the remote stuff separate.
I'm pretty sure there are dedicated appliances that do this kind of stuff, but does anyone have any experience with them?
There are no doubt other ways to skin this cat, but I'm interested in security, simplicity and of course cost. The aspects I liked about this approach were that I didn't have to punch a hole in the router, and that the process is controllable and monitorable from the client within the LAN (every poll outputs a request status summary).
Would be interesting to find out if you think the idea is good or shit, or if there are aspects that could be improved (no doubt there are plenty). Feel free to comment or not.

Thanks to all those who made SoylentNews a reality!

edit: the setup in this case is a little different from the usual dmz/port forwarding case in that there aren't any ports exposed on the LAN router; i get through because the relay server only ever responds to requests originating from the LAN server. the relay server never initiates a connection itself