
SoylentNews is people

posted by Fnord666 on Tuesday October 15 2019, @10:46AM
from the it's-a-secret dept.

A few weeks ago, we tested out three phone platforms (iOS on an iPhone XR, Android on a Nexus 6P, and Linux on the Librem 5 dev kit) to see which one leaks the most data -- and, just as importantly, which leaks the least data.

To do this we connected all three devices to a dedicated wireless router running OpenWrt and monitored all connections. The phones were then left to sit idle with no applications launched. Both the total number of connections and the amount of data transmitted were logged. This initial testing was done with the Librem 5 Development Kit, but the results are expected to be the same in the final shipping Librem 5 smartphone.
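As a rough illustration of the tallying step described above, here is a minimal Python sketch. It assumes the router's connection log has already been reduced to (device, remote host, bytes) records; that record layout, the host names, and the sample values are all hypothetical and are not the format or the results of the original test.

```python
from collections import defaultdict

# Hypothetical connection records: (device, remote_host, bytes_transferred).
# The layout and the values are illustrative assumptions only.
records = [
    ("android", "host-a.example", 1200),
    ("android", "host-b.example", 800),
    ("ios", "host-c.example", 500),
]


def summarize(records):
    """Return {device: (connection_count, total_bytes)}."""
    totals = defaultdict(lambda: [0, 0])
    for device, _host, nbytes in records:
        totals[device][0] += 1       # one more connection for this device
        totals[device][1] += nbytes  # add this connection's byte count
    return {device: (count, size) for device, (count, size) in totals.items()}


summary = summarize(records)
for device, (count, size) in sorted(summary.items()):
    print(f"{device}: {count} connections / {size / 1000:.1f} KB")
```

A device that made no connections simply never appears in the records, so a real comparison would need to list the monitored devices explicitly in order to report a zero.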

All three phones were loaded only with stock applications and system settings; the results will likely change depending on which applications are added (such as Facebook, Twitter, etc.).

Inspired by research done by Professor Douglas C. Schmidt, Professor of Computer Science at Vanderbilt University, and his team.

PDF here https://digitalcontentnext.org/wp-content/uploads/2018/08/DCN-Google-Data-Collection-Paper.pdf

I. EXECUTIVE SUMMARY
1. Google is the world's largest digital advertising company.1 It also provides the #1 web browser,2 the
#1 mobile platform,3 and the #1 search engine4 worldwide. Google's video platform, email service, and map
application have over 1 billion monthly active users each.5 Google utilizes the tremendous reach of its products
to collect detailed information about people's online and real-world behaviors, which it then uses to target them
with paid advertising. Google's revenues increase significantly as the targeting technology and data are refined.
2. Google collects user data in a variety of ways. The most obvious are "active," with the user directly
and consciously communicating information to Google, as for example by signing in to any of its widely used
applications such as YouTube, Gmail, Search etc. Less obvious ways for Google to collect data are "passive"
means, whereby an application is instrumented to gather information while it's running, possibly without the
user's knowledge. Google's passive data gathering methods arise from platforms (e.g. Android and Chrome),
applications (e.g. Search, YouTube, Maps), publisher tools (e.g. Google Analytics, AdSense) and advertiser tools
(e.g. AdMob, AdWords). The extent and magnitude of Google's passive data collection has largely been
overlooked by past studies on this topic.6
3. To understand what data Google collects, this study draws on four key sources:
a. Google's My Activity7 and Takeout8 tools, which describe information collected during the use of
Google's user-facing products;
b. Data intercepted as it is sent to Google server domains while Google or 3rd-party products are used;
c. Google's privacy policies (both general and product-specific); and
d. Other 3rd-party research that has examined Google's data collection efforts.
4. Through the combined use of above resources, this study provides a unique and comprehensive view
of Google's data collection approaches and delves deeper into specific types of information it collects from
users. This study highlights the following key findings:

a. Google learns a great deal about a user's personal interests during even a single day of typical internet
usage. In an example "day in the life" scenario, where a real user with a new Google account and an
Android phone (with new SIM card) goes through her daily routine, Google collected data at numerous
activity touchpoints, such as user location, routes taken, items purchased, and music listened to.
Surprisingly, Google collected or inferred over two-thirds of the information through passive means.
At the end of the day, Google identified user interests with remarkable accuracy.
b. Android is a key enabler of data collection for Google, with over 2 billion monthly active users
worldwide.9 While the Android OS is used by Original Equipment Manufacturers (OEMs) around the
world, it is tightly connected with Google's ecosystem through Google Play Services. Android helps
Google collect personal user information (e.g. name, mobile phone number, birthdate, zip code, and
in many cases, credit card number), activity on the mobile phone (e.g. apps used, websites visited), and
location coordinates. In the background, Android frequently sends Google user location and device-related
information, such as apps usage, crash reports, device configuration, backups, and various
device-related identifiers.
c. The Chrome browser helps Google collect user data from both mobile and desktop devices, with over
2 billion active installs worldwide.10 The Chrome browser collects personal information (e.g. when a
user completes online forms) and sends it to Google as part of the data synchronization process. It
also tracks webpage visits and sends user location coordinates to Google.
d. Both Android and Chrome send data to Google even in the absence of any user interaction. Our
experiments show that a dormant, stationary Android phone (with Chrome active in the background)
communicated location information to Google 340 times during a 24-hour period, or at an average of
14 data communications per hour. In fact, location information constituted 35% of all the data samples
sent to Google. In contrast, a similar experiment showed that on an iOS Apple device with Safari
(where neither Android nor Chrome were used), Google could not collect any appreciable data
(location or otherwise) in the absence of a user interaction with the device.
e. After a user starts interacting with an Android phone (e.g. moves around, visits webpages, uses apps),
passive communications to Google server domains increase significantly, even in cases where the user
did not use any prominent Google applications (i.e. no Google Search, no YouTube, no Gmail, and
no Google Maps). This increase is driven largely by data activity from Google's publisher and advertiser
products (e.g. Google Analytics, DoubleClick, AdWords)11. Such data constituted 46% of all requests
to Google servers from the Android phone. Google collected location at a 1.4x higher rate compared
to the stationary phone experiment with no user interaction. Magnitude wise, Google's servers
communicated 11.6 MB of data per day (or 0.35 GB/month) with the Android device. This experiment
suggests that even if a user does not interact with any key Google applications, Google is still able to
collect considerable information through its advertiser and publisher products.
f. While using an iOS device, if a user decides to forgo the use of any Google product (i.e. no Android,
no Chrome, no Google applications), and visits only non-Google webpages, the number of times data
is communicated to Google servers still remains surprisingly high. This communication is driven purely
by advertiser/publisher services. The number of times such Google services are called from an iOS
device is similar to an Android device. In this experiment, the total magnitude of data communicated
to Google servers from an iOS device is found to be approximately half of that from the Android
device.
g. Advertising identifiers (which are purportedly "user anonymous" and collect activity data on apps and
3rd-party webpage visits) can get connected with a user's Google identity. This happens via passing of
device-level identification information to Google servers by an Android device. Likewise, the
DoubleClick cookie ID (which tracks a user's activity on the 3rd-party webpages) is another
purportedly "user anonymous" identifier that Google can connect to a user's Google Account if a user
accesses a Google application in the same browser in which a 3rd-party webpage was previously
accessed. Overall, our findings indicate that Google has the ability to connect the anonymous data
collected through passive means with the personal information of the user.
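The per-hour and per-month figures quoted in findings (d) and (e) can be checked with a few lines of Python. This is only a sanity check on the arithmetic; the 30-day month and the 1 GB = 1000 MB conversion are assumptions, since the excerpt does not say which the authors used.

```python
# Figures quoted in findings (d) and (e) of the executive summary.
location_reports_per_day = 340
mb_per_day = 11.6

# Assumptions: a 30-day month and 1 GB = 1000 MB.
DAYS_PER_MONTH = 30
MB_PER_GB = 1000

reports_per_hour = location_reports_per_day / 24
gb_per_month = mb_per_day * DAYS_PER_MONTH / MB_PER_GB

print(f"{reports_per_hour:.1f} location reports per hour")
print(f"{gb_per_month:.2f} GB per month")
```

Under those assumptions the results round to the paper's 14 communications per hour and 0.35 GB per month.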

A video graphically displays part of the same information - https://www.youtube.com/watch?v=yHcHi0TBFv4


  • (Score: 4, Informative) by pkrasimirov on Tuesday October 15 2019, @11:14AM (5 children)

    by pkrasimirov (3358) Subscriber Badge on Tuesday October 15 2019, @11:14AM (#907312)

    The video link is broken, should be https://www.youtube.com/watch?v=yHcHi0TBFv4 [youtube.com]

    • (Score: 2) by takyon on Tuesday October 15 2019, @11:18AM

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Tuesday October 15 2019, @11:18AM (#907314) Journal

      Thanks.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 5, Informative) by rigrig on Tuesday October 15 2019, @12:18PM (2 children)

      by rigrig (5129) <soylentnews@tubul.net> on Tuesday October 15 2019, @12:18PM (#907328) Homepage

      It's also not the same information.

      Video: how much data do stock Android/iOS/Librem devices leak in one hour
      Android: 6 connections / 26 KB
      iOS: 39 connections / 175 KB
      Librem: 0 connections / 0 KB

      PDF: a one-year old paper examining Android+iOS, which concludes that

      Google counts a large percentage of the world’s population as its direct customers, with multiple
      products leading their markets globally and many surpassing 1 billion monthly active users. These products are
      able to collect user data through a variety of techniques that may not be easily graspable by a general user. A
      major part of Google’s data collection occurs while a user is not directly engaged with any of its products. The
      magnitude of such collection is significant, especially on Android mobile devices. And while such information
      is typically collected without identifying a unique user, Google distinctively possesses the ability to utilize data
      collected from other sources to de-anonymize such a collection.

      --
      No one remembers the singer.
      • (Score: 0) by Anonymous Coward on Wednesday October 16 2019, @07:32AM (1 child)

        by Anonymous Coward on Wednesday October 16 2019, @07:32AM (#907770)

        This is why all mobile devices need to come with a firewall built in and enabled that can't be bypassed by the manufacturer.

        • (Score: 0) by Anonymous Coward on Wednesday October 16 2019, @07:36AM

          by Anonymous Coward on Wednesday October 16 2019, @07:36AM (#907773)

          Blokada or DNS66

    • (Score: 2) by Runaway1956 on Tuesday October 15 2019, @02:25PM

      by Runaway1956 (2926) Subscriber Badge on Tuesday October 15 2019, @02:25PM (#907368) Journal

      Sorry, and thank you!

  • (Score: 0) by Anonymous Coward on Tuesday October 15 2019, @01:20PM (2 children)

    by Anonymous Coward on Tuesday October 15 2019, @01:20PM (#907350)

    The summary only mentions number of connections and someone else posted a comment about the volume of data (MB). But I'm wondering what was actually in those messages? How many of them are leaking actual user data vs checking for new messages/news/etc.

    They said these were just the default apps, but both Android and Apple set up an email account for you whether you want their e-mail or not (at least it's very hard to avoid their respective e-mail systems). Apple defaults to including news stories in one of the built-in apps. These update periodically.

    but what was actually sent?

    • (Score: 3, Insightful) by hemocyanin on Tuesday October 15 2019, @03:21PM (1 child)

      by hemocyanin (186) on Tuesday October 15 2019, @03:21PM (#907402) Journal

      I would suspect that the transmissions are encrypted so it would probably be pretty tough to know more than the fact of the connection.

      On my phone I've "disabled" (whatever that means, I can't delete them) almost all of the google programs (except for play store and play store services (or something like that)) -- I keep GPS turned off (in software of course so who knows), I use the DDG browser, and I have almost no apps (a calculator I paid for, a PDF viewer I paid for, and a couple other things of an open source nature). All in all, I view phones as poison devices at this point and I do little more with it than make calls and send texts (via Signal). My phone is a $600 brick from which I get $50 of value. I'll probably end up getting a Librem at some point but I've been put off by some of the marketing gobbledygook they've put out which leaves me in an unclear state about what its actual capabilities are regarding turning off sensors. I could get a so-called dumb phone, but the reality is, it's still a computer and it is probably even less possible to manipulate the device to my preferences. Maybe I should just quit the damn things altogether.

      • (Score: 1, Informative) by Anonymous Coward on Wednesday October 16 2019, @07:30AM

        by Anonymous Coward on Wednesday October 16 2019, @07:30AM (#907768)

        F-Droid has some good programs. They are trustworthy.

  • (Score: 2) by Phoenix666 on Tuesday October 15 2019, @03:23PM

    by Phoenix666 (552) on Tuesday October 15 2019, @03:23PM (#907405) Journal

    The academic paper linked to really only covers Google's data collection. It's useful and interesting to see, but it's not the side-by-side-by-side comparison alluded to in TFS. The video with that comparison is put out by the Librem guys themselves and constitutes more of a promotional piece than a third-party, academic study.

    That said, I'm not discounting Librem's claims out of hand. I hope it is what they say it is, because we have a good idea how invasive Google and Apple are and need an alternative. But trust is in short supply these days.

    I would like to download PureOS first and run it on an old phone and see if it's as good as they say before I plunk down $700 for the Librem 5. Does anyone know if anyone has done that yet?

    --
    Washington DC delenda est.
  • (Score: 0) by Anonymous Coward on Tuesday October 15 2019, @03:55PM

    by Anonymous Coward on Tuesday October 15 2019, @03:55PM (#907414)

    methinks the only thing that makes google special is the "magic" that keeps them from being infected by the endless mountains upon mountains of mundane and useless data.
    one can only hope that even "magic" experiences wear-and-tear and will eventually wear off and then humankind will be left with a gross, foul and oily-gray entity that lives a
    life under a dark and damp bridge -aka- something worse than all "superfunds" combined and compressed into the size of a thimble.

  • (Score: 2) by stretch611 on Wednesday October 16 2019, @12:06AM

    by stretch611 (6199) on Wednesday October 16 2019, @12:06AM (#907640)

    What I hope is not a surprise to anyone with half a brain or more on this site...

    But if you take the easy way (and avoid reading) and use the video link you end up on YouTube, a video site run by the same company named in the spoiler as the worst offender.

    And hopefully another obvious point to most is that the video site will also track you and your habits as much as it possibly can.

    --
    Now with 5 covid vaccine shots/boosters altering my DNA :P