
SoylentNews is people

posted by NCommander on Wednesday April 09 2014, @07:26PM
from the seething-with-anger dept.
I've pushed an emergency fix to production to close bug #142 on the tracker. For those unaware, Slashcode portscans every user when they log in or post a comment. We knew there was some code involved in checking for open proxies, but I thought it had been disabled, and the relevant settings in the database all default to off. The fact of the matter, though, is that the backend was ignoring every disable check in the database and scanning every IP on ports 80, 3128, 8000, and 8080 to see whether it was a proxy.
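To make the missing check concrete, here is a minimal sketch, in Python rather than Slash's Perl, of a guard that consults the setting before anything is probed. It is not the actual patch: the function name `should_probe` and the setting key `portscan_enabled` are hypothetical, and the sketch assumes settings arrive from the database as the strings "0" and "1".

```python
from typing import Mapping, Optional


def should_probe(settings: Mapping[str, str], ip: Optional[str]) -> bool:
    """Return True only when probing is explicitly enabled and an IP is known."""
    # "portscan_enabled" is a hypothetical key, not a real Slash variable.
    # A missing value is treated as "0", so the default is off.
    enabled = str(settings.get("portscan_enabled", "0")).strip() == "1"
    if not enabled:
        return False
    # Without an address there is nothing to probe.
    return bool(ip)
```

The point of the design is that the check runs first and defaults to off, so a missing or mistyped setting can never cause a probe.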

I'm f****** seething; this is unacceptable for any site, and this behaviour isn't documented anywhere. We've been portscanning since day one and were completely unaware of it. My guess is that almost everyone here was unaware of this "feature" as well. Our submitter reports that Slashdot did this too. There is no notification or link in the FAQ saying that this is done; unless you were checking your firewall rules religiously, it would have gone completely unnoticed.

I'm seething and furious at the moment. How on earth is this acceptable behaviour? I understand proxy scanning; most IRC networks do it, but they notify you that they are doing so. Furthermore, a basic web application should not be probing its end users. I'm absolutely flabbergasted that this exists, as were most of the staff when it was brought to our attention. On behalf of the site, I want to offer a formal apology for this clusterf***.

Addendum: Since writing this, I've written a follow-up in my journal on why this got me so upset. I've got journal replies set to on, and will respond to anyone both here and there.

Here's the relevant bit of code from Slash/DB/MySQL/MySQL.pm (yes, it lives in the DB API; no, I don't know why):
sub checkForOpenProxy {
    my($self, $ip) = @_;
    # If we weren't passed an IP address, default to whatever
    # the current IP address is.
    if (!$ip && $ENV{GATEWAY_INTERFACE}) {
        my $r = Apache->request;
        $ip = $r->connection->remote_ip if $r;
    }

    # If we don't have an IP address, it can't be an open proxy.
    return 0 if !$ip;
    # Known secure IPs also don't count as open proxies.
    my $constants = getCurrentStatic();
    my $gSkin = getCurrentSkin();

    my $secure_ip_regex = $constants->{admin_secure_ip_regex};
    return 0 if $secure_ip_regex && $ip =~ /$secure_ip_regex/;

    # If the IP address is already one we have listed, use the
    # existing listing.
    my $port = $self->getKnownOpenProxy($ip);
    if (defined $port) {
        #print STDERR scalar(localtime) . " cfop no need to check ip '$ip', port is '$port'\n";
        return $port;
    }
    #print STDERR scalar(localtime) . " cfop ip '$ip' not known, checking\n";

    # No known answer; probe the IP address and get an answer.
    my $ports = $constants->{comments_portscan_ports} || '80 8080 8000 3128';
    my @ports = grep /^\d+$/, split / /, $ports;
    return 0 if !@ports;
    my $timeout = $constants->{comments_portscan_timeout} || 5;
    my $connect_timeout = int($timeout/scalar(@ports)+0.2);
    my $ok_url = "$gSkin->{absolutedir}/ok.txt";

    my $pua = Slash::Custom::ParUserAgent->new();
    $pua->redirect(1);
    $pua->max_redirect(3);
    $pua->max_hosts(scalar(@ports));
    $pua->max_req(scalar(@ports));
    $pua->timeout($connect_timeout);

    #use LWP::Debug;
    #use Data::Dumper;
    #LWP::Debug::level("+trace"); LWP::Debug::level("+debug");

    my $start_time = Time::HiRes::time;

    local $_proxy_port = undef;
    sub _cfop_callback {
        my($data, $response, $protocol) = @_;
        #print STDERR scalar(localtime) . " _cfop_callback protocol '$protocol' port '$_proxy_port' succ '" . ($response->is_success()) . "' data '$data' content '" . ($response->is_success() ? $response->content() : "(fail)") . "'\n";
        if ($response->is_success() && $data eq "ok\n") {
            # We got a success, so the IP is a proxy.
            # We should know the proxy's port at this
            # point; if not, that's remarkable, so
            # print an error.
            my $orig_req = $response->request();
            $_proxy_port = $orig_req->{_slash_proxytest_port};
            if (!$_proxy_port) {
                print STDERR scalar(localtime) . " _cfop_callback got data but no port, protocol '$protocol' port '$_proxy_port' succ '" . ($response->is_success()) . "' data '$data' content '" . $response->content() . "'\n";
            }
            $_proxy_port ||= 1;
            # We can quit listening on any of the
            # other ports that may have connected,
            # returning immediately from the wait().
            # So we want to return C_ENDALL. Except
            # C_ENDALL doesn't seem to _work_, it
            # crashes in _remove_current_connection.
            # Argh. So we use C_LASTCON.
            return LWP::Parallel::UserAgent::C_LASTCON;
        }
        #print STDERR scalar(localtime) . " _cfop_callback protocol '$protocol' succ '0'\n";
    }

    #print STDERR scalar(localtime) . " cfop beginning registering\n";
    for my $port (@ports) {
        # We switch to a new proxy every time thru.
        $pua->proxy('http', "http://$ip:$port/");
        my $req = HTTP::Request->new(GET => $ok_url);
        $req->{_slash_proxytest_port} = $port;
        #print STDERR scalar(localtime) . " cfop registering for proxy '$pua->{proxy}{http}'\n";
        $pua->register($req, \&_cfop_callback);
    }
    #print STDERR scalar(localtime) . "pua: " . Dumper($pua);
    my $elapsed = Time::HiRes::time - $start_time;
    my $wait_timeout = int($timeout - $elapsed + 0.5);
    $wait_timeout = 1 if $wait_timeout < 1;
    $pua->wait($wait_timeout);
    #print STDERR scalar(localtime) . " cfop done with wait, returning " . (defined $_proxy_port ? 'undef' : "'$port'") . "\n";
    $_proxy_port = 0 if !$_proxy_port;
    $elapsed = Time::HiRes::time - $start_time;

    # Store this value so we don't keep probing the IP.
    $self->setKnownOpenProxy($ip, $_proxy_port, $elapsed);

    return $_proxy_port;
}
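
For readers who don't speak Perl, the control flow of the listing above can be sketched as follows. This is a simplified Python illustration, not a port: it tries ports one after another instead of in parallel, it leaves out the secure-IP regex check, and the `probe` callable and the plain dict used as the cache are hypothetical stand-ins for the LWP-based request and for getKnownOpenProxy/setKnownOpenProxy. Only the per-connection timeout arithmetic is copied directly from the Perl.

```python
from typing import Callable, Dict, Optional, Sequence


def per_port_timeout(total_timeout: float, ports: Sequence[int]) -> int:
    """Mirror int($timeout / scalar(@ports) + 0.2) from the listing."""
    return int(total_timeout / len(ports) + 0.2)


def check_for_open_proxy(
    ip: Optional[str],
    cache: Dict[str, int],
    probe: Callable[[str, int], bool],
    ports: Sequence[int] = (80, 8080, 8000, 3128),
) -> int:
    """Return the port an open proxy answered on, or 0 if none did."""
    # No address means it cannot be an open proxy.
    if not ip:
        return 0
    # Reuse a stored answer so the same IP is not probed again.
    if ip in cache:
        return cache[ip]
    found = 0
    for port in ports:
        # probe() is a hypothetical stand-in for fetching ok.txt via ip:port.
        if probe(ip, port):
            found = port
            break
    # Store the result, including 0, so later calls skip the probe.
    cache[ip] = found
    return found
```

With the defaults in the listing (a 5 second budget and four ports), `per_port_timeout(5, [80, 8080, 8000, 3128])` works out to int(1.25 + 0.2), which is 1 second per connection.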


Leave your comments below; I want to know how others feel about this "feature".

Update: We've confirmed that slashdot.jp and Barrapunto predate this feature being added to the codebase; according to the git log, it was added in commit 177e2213 at 2008-04-16 19:07:46 +0000.
 
This discussion has been archived. No new comments can be posted.
  • (Score: 2, Insightful) by urza9814 on Wednesday April 09 2014, @08:24PM

    by urza9814 (3954) on Wednesday April 09 2014, @08:24PM (#29079) Journal

    I recall hearing about this a couple times in the past on Dice's site. And I agree with some of the other comments saying this isn't a huge deal -- seems there was a legitimate (if misguided) reason behind it: spam prevention. I don't think it was GOOD reasoning, and discriminating against proxies annoys me to no end (I browse exclusively through Tor on my phone, which makes many sites totally inaccessible) but I'd hardly call it malicious in any way.

    But your reaction to this is what really made an impression. You've just cemented my loyalty to this site. Great to see you're cleaning this crap up and keeping everyone well informed regardless of what you find. Wish all -- or hell, ANY -- other sites were so respectful and responsible regarding their users! :)

  • (Score: 1) by datapharmer on Wednesday April 09 2014, @11:36PM

    by datapharmer (2702) on Wednesday April 09 2014, @11:36PM (#29149)

    What is inexcusable isn't that the port scanning was done; why it was done is clear (even if the reasoning was poor). The real tragedy is how awful that code is... they've got a friggin' variable to disable this function and it STILL runs the portscan when disabled. I've dealt with code (at least) this bad before and it isn't fun. Here's to these folks getting the spaghetti mess cleaned up!

    • (Score: 2) by NCommander on Wednesday April 09 2014, @11:46PM

      by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Wednesday April 09 2014, @11:46PM (#29153) Homepage Journal

      The code quality varies from decent, to ok, to crap. A lot of the later stuff falls into the crap category; Firehose and D2 were so badly implemented that even if we had the complete code for them, I would have scrapped it and re-implemented. The biggest saving point is that slash scales nicely (and is known to scale well), *and* at the very least, it's got a decent architecture and sane data storage models.

      --
      Still always moving