SoylentNews is people

posted by mrpg on Saturday December 15 2018, @10:02AM
from the echo-chambers-R-us dept.

Measuring the "Filter Bubble": How Google is influencing what you click

Over the years, there has been considerable discussion of Google's "filter bubble" problem. Put simply, it's the manipulation of your search results based on your personal data. In practice this means links are moved up or down or added to your Google search results, necessitating the filtering of other search results altogether. These editorialized results are informed by the personal information Google has on you (like your search, browsing, and purchase history), and put you in a bubble based on what Google's algorithms think you're most likely to click on.

The filter bubble is particularly pernicious when searching for political topics. That's because undecided and inquisitive voters turn to search engines to conduct basic research on candidates and issues in the critical time when they are forming their opinions on them. If they’re getting information that is swayed to one side because of their personal filter bubbles, then this can have a significant effect on political outcomes in aggregate.

This is a moderately long read, as web pages go. IMO, it's well worth the time.


Original Submission

The code that we wrote to analyze the data is open source and available on our GitHub repository.

https://github.com/duckduckgo/filter-bubble-study

duckduckgo-filter-bubble-study-2018_participants.xls contains the instructions we sent to each participant, as well as basic anonymized data for each participant.

https://duckduckgo.com/download/duckduckgo-filter-bubble-study-2018_participants.xls

duckduckgo-filter-bubble-study-2018_raw-search-results.xls contains a separate sheet for search results per query and per mode (private and non-private). The results are listed as they appeared on the screen for each participant, showing both organic domains and infoboxes such as Top Stories (news), Videos, etc.

https://duckduckgo.com/download/duckduckgo-filter-bubble-study-2018_raw-search-results.xls
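For readers who want to poke at the raw results themselves, here is a minimal sketch of one way to quantify how much the result lists differ between participants for a single query. This is not the study's own analysis code (that lives in the GitHub repository above); the function names, the in-memory data layout, and the sample domains are all assumptions made for illustration, and the Jaccard overlap is just one plausible measure.

```python
from collections import Counter
from itertools import combinations


def distinct_orderings(results_by_participant):
    """Count how many participants saw each exact ordering of domains."""
    return Counter(tuple(domains) for domains in results_by_participant.values())


def jaccard(a, b):
    """Overlap of two result lists as sets, ignoring order (1.0 = same domains)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def mean_pairwise_overlap(results_by_participant):
    """Average Jaccard overlap over every pair of participants."""
    lists = list(results_by_participant.values())
    pairs = list(combinations(lists, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical sample: participant id -> ordered list of result domains
# for one query in one mode. Real data would come from one sheet of the
# raw-search-results file.
sample = {
    "p1": ["a.example", "b.example", "c.example"],
    "p2": ["a.example", "b.example", "c.example"],
    "p3": ["b.example", "a.example", "d.example"],
}

print(len(distinct_orderings(sample)))          # number of unique orderings
print(round(mean_pairwise_overlap(sample), 3))  # average domain overlap
```

A low count of unique orderings and an overlap near 1.0 would suggest participants saw much the same thing; many unique orderings and a low overlap would suggest the opposite.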

 
This discussion has been archived. No new comments can be posted.
  • (Score: 1, Interesting) by Anonymous Coward on Saturday December 15 2018, @04:12PM (#774790) (1 child)

    Google has a highly distributed server architecture, and they probably do some hinky stuff with DNS and IP to get faster response times. When a lot of people talk about search order, what they are really talking about is probably caching. Their distributed servers probably don't operate off the same database.

    My point here is that each machine probably has a combination of general knowledge and learned knowledge. Which is to say, the different machines will learn to respond differently based on what the individual machine has been asked previously. This means that results will tend to trend not just personally, but regionally.

    Note that this is probably true with DuckDuckGo as well. So while they may be pissing over the neighbor's fence, they aren't exactly saints. Which should be obvious from the fact that the article doesn't compare Google and everybody else, just Google and Google.

    What causes this? Cost, mostly. If you home equipment closer to the end user in network distance, you get faster response times and happier users. Note that I say "network distance" because part of the game here is getting around the shenanigans the ISPs play with peering policy.

    So business decisions drive engineering decisions. Engineering decisions drive confirmation bias. This is to be expected if you have some understanding of how the system works, in either a technical or economic way.

    Now the question you should be asking is: "Why isn't there a feedback control?" And there probably is, to some degree. But what there isn't is anything that might end up being turned into an integrated third-party RBL (real-time blacklist). While they assuredly do some heavy-duty filtering for the bad stuff, they don't make that accessible via an API. Why?

    The answer to that should be pretty obvious. While I, as a user, would love to just drop a global block on all of CNN/MSNBC/Fox, the inevitable response to that is corporations whispering across the pillow to the backs of their congressmen's heads. Which is of course unjust, considering the contemptible garbage these multinational conglomerates pump into people's heads. But the point is, if you create a filtration toolkit, while most people would use it, the state would quickly appropriate it on behalf of their cronies. Which could reasonably describe what they're doing now, if you're paying attention.

    This is why ABP (Adblock Plus) and other such tools have to be client-side hooks instead of server-side hooks. While server-side hooks would be way better, and result in vastly superior and more dynamic personal control over data, the state can't be relied on to keep its fucking mitts off it. And so Google is, to some degree, suffering on our behalf due to all this criticism.

    Is there a better way? Yes. It starts with full end-to-end adoption of IPv6, redesigning all of OSI layer 4 to provide for ubiquitous cryptographic anonymous transport of all upper-layer traffic, and stripping the shit out of web browsers, eliminating all unsolicited client-side processing. There have been some efforts in this direction to date, though none yet are integrated enough to be consumer friendly.

    That is a HUGE amount of work. There has been a lot of movement in that direction, but there is very little coordination. The truth is, the next step to fixing the Internet is a moonshot thing. The current political beef (like most political beefs) is just bitches squabbling over what's on the table, instead of going out and creating new resources.

    Yes it is broke. I know what I'm doing to help fix it. Do you?

  • (Score: 0) by Anonymous Coward on Saturday December 15 2018, @09:40PM (#774939)

    This isn't about how fast stuff comes to you, but how they hoodwinked you into ordering the wrong thing to begin with.

    Also, using Adblock against the internet in 2019 is like using a fly swatter to fend off a B-52 Stratofortress bomber.