I've been approached about working on a new privacy policy for SoylentNews and have agreed to do so. This journal is the first step in that process.
SN currently runs on Rehash, which is written in Perl and dates back to Slash 2.0. Many privacy-related considerations in Rehash are dictated by decisions made by the Slashdot admins nearly 25 years ago when they wrote the original code. The age of this code and its dependencies on tools like mod_perl make it nearly unmaintainable, meaning that SN may implement a new code base sooner rather than later. This is a pivotal time to discuss a new privacy policy for SN, and the decisions made now will likely influence the implementation of whichever new code base powers SN in the future.
SN has three primary stakeholders, which are 1) the ownership, 2) the staff, and 3) the community. To be successful, any site policy needs the support of all three of these stakeholders. That means the community needs to be actively engaged in the process.
My first steps will be to solicit input from the SN community and to spend most of my time listening. There are three important questions to discuss:
1) Problems: What privacy-related considerations are important to you, the members of the SN community? What are your concerns? As long as the issues are reasonably relevant to privacy, anything should be on the table here. This includes things like what user data gets stored, how long it is retained, who has access to it, the right to be forgotten, anonymous commenting, and anything that can reasonably be construed as a privacy issue.
2) Process: All three stakeholders must be supportive of any privacy policy for it to be effective. Therefore, once a privacy policy is drafted, we need a process for all three stakeholders to approve it. I anticipate the biggest questions here will be how you, the members of the SN community, get to voice your support or to request amendments to the policy. What process would the community like us to follow for enacting policy? Do all logged-in users get to vote? Does the community elect representatives?
3) Potential Solutions: Once you, the members of the SN community, make your privacy concerns heard, we need potential solutions for those concerns. These solutions will be limited by a few constraints. To allow for robust discussions and make SN a welcoming community, we need the ability to track abuse of the site (e.g., spam comments, sock puppet account creation, gaming the moderation system, etc.) to prevent disruption of the discussions. SN is required to comply with the laws in relevant jurisdictions such as the United States and the state of Delaware. Any solutions have to be practical, given the limited financial and human resources. Working within those constraints, SN policy should go above and beyond what is merely required by law and maximize the privacy of the members of the community.
I'll start by posting three journals at least 7-10 days apart to discuss each of these issues. For this journal, I want to focus on the first point, which is what privacy concerns you have. What is important to you, as members of the SN community, and what do we need to address in the new privacy policy? While any discussion of privacy matters is on-topic in this journal, I'd like to try to keep the discussion focused as much as possible on privacy-related problems that we need to address.
There are a few ground rules in this discussion:
1) If you're giving examples of specific privacy concerns, please don't include actual user names or people. Please use hypothetical terms, or use generic names like "person A" and "person B."
2) The new privacy policy is forward-looking, meaning that the discussion should focus on how we can be better in the future, and not on holding people responsible for past mistakes or on how the existing code is written.
3) Please keep the discussion civil and welcoming. Everyone deserves a chance to participate in this discussion and to be heard. Please keep the discussion constructive and refrain from posting personal attacks. Privacy is for everyone, and that means everyone deserves to be heard. I ask that you please don't try to dominate the discussion or shout other people down, and instead let everyone make their opinions known.
4) Please keep the discussion on-topic. Any privacy-related matters are on-topic, but issues like story selection are beyond the scope of this policy. Let's keep issues like politics out of this discussion, too.
5) Please don't moderate people down unless they're off-topic, trying to dominate the discussion, shouting people down, or posting personal attacks. Even if you disagree with someone else, please don't moderate them down unless they're violating the ground rules for this discussion. I want everyone to be heard.
I pledge that I'll read every comment that you post. My direct input to this discussion will be minimal, and I probably won't post at all except maybe to answer questions or ask for more detail if appropriate. I'm not here to debate with people. I just want to listen to your concerns. Anonymous Cowards are welcome in this discussion, but all comments that I post will be from the dalek account. I have unchecked the "willing to moderate" box in my user preferences, which means that I am not moderating any comments in this discussion. I am just here to listen.
I want to make these discussions as inclusive as possible. That means I intend to allow Anonymous Coward input to all of these journals. In exchange for keeping these discussions open, I ask that you please keep these discussions on track. I will post future journals, but for now, I want to know what your privacy concerns are, and what topics we need to address in the new privacy policy.
(Score: 2) by janrinok on Wednesday May 31, @06:22AM (10 children)
They are not linked. That does not mean that they cannot be determined.
Technically it is expired. It lasts on the server for about 2 weeks but I am told this is not a hard limit but one that is dependent upon usage. So knowing that somebody used an IP address several weeks, months or even years ago is nothing that we could use to identify them. Perhaps LE can - but we cannot.
(Score: -1, Troll) by Anonymous Coward on Wednesday May 31, @10:04PM (9 children)
You already admitted that logged in users have their IP stored indefinitely, and that any AC comments they make are tied to their account. You should stop misleading people, and you should stop abusing your privilege. Seeing the weird mind games you staffers play is a real education on levels of trust and transparency. You fail pretty hard on both counts, and khallow and runaway appear happy, so you've got that going for you.
(Score: 2) by janrinok on Thursday June 01, @08:02AM (8 children)
No, they have a hash stored indefinitely. The site has no software for reversing the hash - there are no rainbow tables etc. In any case, VPNs, TOR and IPv6 have negated their effectiveness. There are discussions on the web about it.
Wrong again - nowhere have I stated that we can do that in all cases. We cannot link some AC accounts to the actual user. Under certain conditions we can. I am not going to discuss a potential security vulnerability here.
(Score: 0) by Anonymous Coward on Thursday June 01, @09:36PM (7 children)
"Under certain conditions we can. I am not going to discuss a potential security vulnerability here."
Certain conditions: the user is logged in and selects Post as AC when making a comment, or the user is not logged in but using a unique IP. The unique IP bit you have a real problem with since you make incorrect assumptions.
Potential security vulnerability: bullshit, you don't want registered users to realize their AC comments are anything but anonymous to site staff.
(Score: 1) by dalek on Thursday June 01, @10:07PM (6 children)
Directing antagonistic comments toward staff members isn't furthering the discussion. What specific concerns do you want me to raise in further discussions? Here are a few that come to mind:
1) Should hashed IP addresses be used to distinguish users from each other? Are there better identifiers than hashed IP addresses?
2) How long are identifiers stored in the database? Should they be purged after a certain amount of time? If so, how long?
3) Who can see this information? Is it automatically displayed, or does the person viewing it have to click through to see it? If it's only displayed upon specifically requesting it, is that request logged? If the user is logged in, does the user get a notification that this information was accessed by a staff member?
4) If the identifier is the same between an AC comment and a logged-in comment, it suggests but does not guarantee that the same person posted both comments. If staff are aware of which logged-in user posted an AC comment, what information are they allowed to post publicly about this? Are they allowed to say that they know who posted a comment, or would that intimidate the person who posted said comment? Are they allowed to initiate private communication (e.g., email) with the person they believe posted the comment? Are they allowed to discuss any details of the comment history, such as suggesting that a comment may have been posted in bad faith?
5) How are staff held accountable if they improperly share information? How are these policies enforced?
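To make questions 1 and 2 a bit more concrete, here is a minimal sketch of one possible alternative identifier: a keyed hash (HMAC) whose secret key is rotated periodically. This is purely an illustration for discussion, not a description of how Rehash works or a decision about any future code base. The function name make_identifier, the weekly rotation, and the example address are all hypothetical.

```python
import hashlib
import hmac
import secrets


def make_identifier(ip: str, period_key: bytes) -> str:
    """Return a keyed hash of an IP address (hypothetical scheme).

    Without the secret key, the identifier cannot be recomputed by
    simply hashing every possible address.
    """
    return hmac.new(period_key, ip.encode("ascii"), hashlib.sha256).hexdigest()


# Assumption: a fresh random key per retention period (e.g., weekly).
# Discarding an old key makes that period's identifiers unlinkable
# to identifiers from any other period.
key_week_1 = secrets.token_bytes(32)
key_week_2 = secrets.token_bytes(32)

ip = "192.0.2.10"  # documentation-range address, not a real user

id_a = make_identifier(ip, key_week_1)
id_b = make_identifier(ip, key_week_1)
id_c = make_identifier(ip, key_week_2)

# Same address, same period: identifiers match, so abuse can be tracked.
print(id_a == id_b)
# Same address, different period: identifiers differ.
print(id_a == id_c)
```

The trade-off in a scheme like this is that abuse tracking only works within one key period, so the rotation interval would itself be a policy decision.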
These are all things I'm willing to discuss in a forward-looking context. There's nothing we can do to change what's happened in the past, so dwelling on that doesn't help anything. Antagonizing staff, regardless of your opinions of specific staff members, does not help either. What topics do you want discussed with respect to the new privacy policy? Are the questions I listed things that you want to discuss? Do you have different questions that you want discussed?
I certainly support asking the tough questions and discussing all of these issues with respect to a future privacy policy. Let's not argue with staff members and instead focus on what issues need to be raised to better respect privacy going forward. I want to continue the discussion but in a way that's productive rather than dwelling on the past.
As Mike Ditka once said, "The past is for cowards... you live in the past, you die in the past." Let's focus on the future.
EXTERMINATE
(Score: -1, Troll) by Anonymous Coward on Thursday June 01, @11:27PM (1 child)
Since when is antagonism opposed to robust privacy? It appears that you are a janrinok sockpuppet, dalek. Prove us wrong.
(Score: 1) by dalek on Friday June 02, @03:39AM
I didn't say it's in opposition. It's tangential. I'm trying to keep the discussion on track.
I replied to an AC's comment and said that we should focus on the privacy issues instead of discussing issues with staff. I have no doubt that the AC is very sincere in having concerns about user privacy on SN. I'm just trying to keep the discussion focused on those privacy issues and away from getting distracted on tangents about staff. That's why I suggested many potential privacy issues that might arise, asked if they were concerns the AC had, and asked if they had anything else I should add to my list. I'm trying to get a list of privacy concerns that we can address later.
Do you have any concerns that you'd like to add to my list? This is not about specific people. It's about future policy for the site.
For example, let's say that I set up my home router to use a VPN for all outgoing connections. If I post a comment to SN, the hash that's recorded will be that of one of the VPN's servers. If I post from an account, that hash is linked to my account. The account might also have an email address that identifies exactly who I am. Let's say that the comment I post is completely legal and absolutely harmless. But if someone else posts an AC comment using the same VPN and server, the same hash will also be recorded. Let's say the other user posts instructions for decrypting content that uses Microsoft's PlayReady DRM and how to download videos from some sites that use PlayReady DRM. That could cause some issues for me if the lawyers for that site decide they want to sue the person who posted those instructions. The situation I'm describing is fairly similar to lawsuits about DeCSS around 20 years ago. If you were around Slashdot in that era, you probably remember that DeCSS was a frequent topic of discussion. Even if staff consider hashes unreliable for linking AC comments to accounts, it doesn't mean that lawyers, courts, or juries would reach the same conclusion.
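The shared-VPN scenario above can be sketched in a few lines. This is only an illustration: it assumes the identifier is an unsalted MD5 of the textual IP address, which may not match what the site actually does, and the addresses and the ip_hash helper are hypothetical.

```python
import hashlib


def ip_hash(ip: str) -> str:
    # Assumption: unsalted MD5 of the address text; the real scheme may differ.
    return hashlib.md5(ip.encode("ascii")).hexdigest()


vpn_exit = "203.0.113.7"  # documentation-range address standing in for a VPN server

comments = [
    {"author": "person A (logged in)", "ip": vpn_exit},
    {"author": "Anonymous Coward (person B)", "ip": vpn_exit},
    {"author": "person C (logged in)", "ip": "198.51.100.23"},
]

# Record a hash for each comment, as the site would at posting time.
for c in comments:
    c["ipid"] = ip_hash(c["ip"])

# Person A and person B share a VPN server, so their hashes are identical
# even though they are different people.
same_hash = comments[0]["ipid"] == comments[1]["ipid"]
# Person C posts from a different address and gets a different hash.
different_hash = comments[0]["ipid"] != comments[2]["ipid"]

print(same_hash, different_hash)
```

The point is that a matching hash only says two comments came through the same address. It says nothing about whether the same person wrote them.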
I support discussing things like hashed IP addresses or other identifiers and how long that information is retained. I just want to keep the discussion focused on privacy concerns and not on the staff.
EXTERMINATE
(Score: 0) by Anonymous Coward on Friday June 02, @02:46AM (3 children)
> 2) How long are identifiers stored in the database? Should they be purged after a certain amount of time? If so, how long?
I'll chime in on this. I'm assuming there is some utility in keeping the identifiers around when a story & comments are live--for example something must be used to block adding more than one mod point (per user) to a post?
However, after an article ages, at some point the comments & mod points are locked, which seems like a good thing, prevents future meddling. Anyone wanting to continue that topic can submit a new story or start a journal.
When the article is locked, then delete the identifiers from the database. What possible future use could they have?
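A rough sketch of what that could look like, assuming a much simpler schema than the real one: the table names, column names, and the purge_locked_identifiers function below are all hypothetical, and SQLite is used only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stories (sid INTEGER PRIMARY KEY, locked INTEGER NOT NULL);
CREATE TABLE comments (
    cid  INTEGER PRIMARY KEY,
    sid  INTEGER NOT NULL REFERENCES stories(sid),
    body TEXT NOT NULL,
    ipid TEXT              -- hashed identifier; NULL once purged
);
""")

# Story 1 is locked (archived); story 2 is still open for discussion.
conn.executemany("INSERT INTO stories VALUES (?, ?)", [(1, 1), (2, 0)])
conn.executemany(
    "INSERT INTO comments (sid, body, ipid) VALUES (?, ?, ?)",
    [(1, "old comment", "abc123"), (2, "live comment", "def456")],
)


def purge_locked_identifiers(db: sqlite3.Connection) -> int:
    """Blank the identifier on every comment whose story is locked."""
    cur = db.execute(
        "UPDATE comments SET ipid = NULL "
        "WHERE ipid IS NOT NULL "
        "AND sid IN (SELECT sid FROM stories WHERE locked = 1)"
    )
    db.commit()
    return cur.rowcount  # number of comments that were purged


purged = purge_locked_identifiers(conn)
remaining = conn.execute("SELECT sid, ipid FROM comments ORDER BY sid").fetchall()
print(purged, remaining)
```

The comment text stays in place; only the identifier column is cleared, and comments on stories that are still open keep theirs.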
(Score: 2) by janrinok on Friday June 02, @05:15AM
The IPs are not saved any longer than approximately 2 weeks, depending on memory availability in the server I think. In many cases they will point to a VPN, TOR exit node or some other system of privacy/security redirection. However, IP addresses are essential. They are the key to the whole internet, and are vital when blocking some types of attack on the site. They will still exist and be used by this site.
A design decision made over 25 years ago initially linked comments, submissions etc directly with the IP address. Cmdr Taco did not want to work with IP addresses - they were cumbersome in software terms and used more computer processing. The state of computing 25 years ago was very different from today. He decided to use hashes derived from the IP addresses which could then be used directly in the database. He writes about this in the code which you can download and access freely. It was not intended to be a security measure. It is not a tracking measure. It is a form of indexing within the database itself.
Relational databases thrive on hashes. Data that is in the database is linked using those hashes. This cannot be undone simply. Much of the Perl code is written around those hashes, as far as I can tell.
Undoing this design decision would require, I imagine, a complete rewrite of all the Perl code (and we can't even do bug fixes!) along with a reformatting of the data in the database, or scrapping it altogether. If the code is to be rewritten it will NOT be in Perl.
(Score: -1, Troll) by Anonymous Coward on Friday June 02, @08:05PM (1 child)
Janrinok lies or at least attempts to deceive once again!
So either the hashes are tossed after a few weeks or they are kept forever because the relational database needs them. Which is it?
Previous statements were that the IP is always hashed, never stored fully as implied in the first sentences, and that true not-logged-in AC comments were the only ones where the hashed IP gets dropped after two weeks. Every action by a registered user is tracked. Janrinok's intentions do not matter as they could shift, like they did many months ago when he started calling every critical AC aristarchus.
He is still downplaying the tracking done by the site, though I accept it was not added maliciously. Why not be up front about these technical details? Only piecing together the info, then repeatedly pointing out the problems, finally dragged enough details from frustrated staff. One staffer admitted there were some strange database flags causing moderation bans, and there were some admissions about how hashed IPs are stored, including that the hash hadn't changed, so IP hashes since site launch are available for every registered user.
Why does janrinok try and downplay that fact?
(Score: 0) by Anonymous Coward on Monday June 05, @02:07AM
They are playing word games with IP and hashed IP. HONEYPOT?