SoylentNews is people

posted by mrpg on Saturday December 15 2018, @10:02AM
from the echo-chambers-R-us dept.

Measuring the "Filter Bubble": How Google is influencing what you click

Over the years, there has been considerable discussion of Google's "filter bubble" problem. Put simply, it's the manipulation of your search results based on your personal data. In practice this means links are moved up or down or added to your Google search results, necessitating the filtering of other search results altogether. These editorialized results are informed by the personal information Google has on you (like your search, browsing, and purchase history) and put you in a bubble based on what Google's algorithms think you're most likely to click on.

The filter bubble is particularly pernicious when searching for political topics. That's because undecided and inquisitive voters turn to search engines to conduct basic research on candidates and issues in the critical time when they are forming their opinions on them. If they’re getting information that is swayed to one side because of their personal filter bubbles, then this can have a significant effect on political outcomes in aggregate.

This is a moderately long read, as web pages go. IMO, it's well worth the time.


Original Submission

The code that we wrote to analyze the data is open source and available on our GitHub repository.

https://github.com/duckduckgo/filter-bubble-study

duckduckgo-filter-bubble-study-2018_participants.xls contains the instructions we sent to each participant, as well as basic anonymized data for each participant.

https://duckduckgo.com/download/duckduckgo-filter-bubble-study-2018_participants.xls

duckduckgo-filter-bubble-study-2018_raw-search-results.xls contains a separate sheet for search results per query and per mode (private and non-private). The results are listed as they appeared on the screen for each participant, showing both organic domains and infoboxes such as Top Stories (news), Videos, etc.

https://duckduckgo.com/download/duckduckgo-filter-bubble-study-2018_raw-search-results.xls
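
As a rough illustration of how per-participant result lists like these could be compared, here is a minimal Python sketch. It is not the code from the repository linked above, and the study may well measure variation differently. The function names, the use of Jaccard overlap, and the example domains are all assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations


def count_variations(result_lists):
    """Count how many distinct ordered result lists appear across participants."""
    return len(Counter(tuple(results) for results in result_lists))


def jaccard(a, b):
    """Jaccard overlap of two result lists, ignoring order (1.0 = same domains)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def mean_pairwise_jaccard(result_lists):
    """Average Jaccard overlap over every pair of participants."""
    pairs = list(combinations(result_lists, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical placeholder data, NOT taken from the study's spreadsheets:
# one ordered list of domains per participant for a single query.
participants = [
    ["a.example", "b.example", "c.example"],
    ["a.example", "b.example", "c.example"],
    ["b.example", "a.example", "d.example"],
]

print(count_variations(participants))
print(mean_pairwise_jaccard(participants))
```

A low number of distinct lists and a high average overlap would suggest participants saw roughly the same results; many distinct lists and low overlap would suggest the opposite.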

 
This discussion has been archived. No new comments can be posted.
  • (Score: 2, Insightful) by Anonymous Coward on Saturday December 15 2018, @03:05PM (1 child)

    by Anonymous Coward on Saturday December 15 2018, @03:05PM (#774773)

    No, they will just make it worse somehow. If you don't want a filter bubble, just use DuckDuckGo or Startpage.

  • (Score: 3, Interesting) by fyngyrz on Saturday December 15 2018, @04:59PM

    by fyngyrz (6567) on Saturday December 15 2018, @04:59PM (#774818) Journal

    If you don't want a filter bubble, just use DuckDuckGo or Startpage.

    Well, but there are actually two levels of this, and that will remain so until/unless a search engine arises that can actually determine quality of content.

    The first, the one TFS is referring to (I did not read TFA, of course), is the one you make over time with your choices. The search engine learns what content you are likely to want, and eventually provides it. If DuckDuckGo doesn't do this, then we can put it aside for them.

    But the second is pretty much a given at this point in time: All of these engines use popularity as a key metric in deciding where in the search results a listing goes. (Well, and money... paid links, etc.)

    The problem is, popularity is strongly moderated by marketing, trendiness, current events, and the appeal to the broad population rather than to a critical population. Or in other words, not by actual quality and depth, but by mediocrity, text bites, and easy-to-digest summaries.

    The net result is that the typical listing doesn't give you "best" results. For that, you have to triage the results, often for pages - and even then, you're still dealing with popularity rather than quality.

    I don't think this is likely to be solved in the short term, but if you really want the best results you can get, you need to keep it firmly in mind with every search you make.

    A human-moderated search engine doesn't seem to be in the cards. It's been tried, and in its various forms, has failed to gain long-term traction every time: Yahoo's original classed listings of web sites; DMOZ [wikipedia.org]; etc. The problem there is again a human one. If the moderator(s) don't have your views, then their selections can fall pretty far away from your interests. They can also miss things, and be the source of some fairly extreme bias.

    No good solution. Just work the results and don't accept the top results as necessarily being actually... top.

    Personally, I keep a bunch of links around on a web page on my server that I have found to be actually high quality, and I tend to look there first. I've been doing that since the 1990s, so I have a pretty good collection. The Wayback Machine helps here as well from time to time, as some of the good stuff has simply gone away.

    --
    Ignorance is weakness.