SoylentNews is people

posted by hubie on Wednesday June 01 2022, @11:23PM

Excellent Utilities: Whoogle Search - self-hosted metasearch engine:

Google has a firm grip on the desktop. Their products and services are ubiquitous. Don't get us wrong, we're long-standing admirers of many of Google's products and services. They are often high quality, easy to use, and 'free', but there can be downsides of over-reliance on a specific company. For example, there are concerns about their privacy policies, business practices, and an almost insatiable desire to control all of our data, all of the time.

What if you are looking to move away from Google and embark on a new world of online freedom, where you are not constantly tracked, monetised and attached to Google's ecosystem?

Whoogle Search is a privacy-focused search engine. It displays the same results as Google Search but without ads/sponsored content, JavaScript, cookies, or tracking.

[...] You can deploy it to PaaS hosting solutions such as Heroku, Fly.io, or Repl.it and lots of other platforms. Or you may choose to install it to a local machine on your network.

Website: github.com/benbusby/whoogle-search
Developer: Ben Busby
License: MIT License

At the bottom of the GitHub page is a short list of public instances where you can try it out.

Anyone use this or searX for web searching, and if so, what are your experiences/recommendations?


  • (Score: 5, Informative) by Runaway1956 on Thursday June 02 2022, @12:17AM (7 children)

    by Runaway1956 (2926) on Thursday June 02 2022, @12:17AM (#1249568)

    This is a "more private" Google search. Google still knows your IP address, and, presumably, can extrapolate your identity from it. But, my testing shows 0 ads. That's a big fat Zero.

    With all my ad blocking, etc, I typically see links marked "advertisement" at the head of the list of hits. I don't see ads anywhere else, only in the list of links. With Whoogle, those links are gone, and the links list starts with more relevant results.

    I've not done any actual testing, so I can't promise that it doesn't call home or anything like that. But without a doubt, the search is cleaner, marginally faster*, and I don't see any obvious sign of tracking, javascript, or any of the other bad stuff the author wanted to get rid of.

    It looks like Whoogle is what Google was meant to be, years ago. But, make no mistake, you are still using Google's search engine.

    For those interested, I did the manual install on a local Linux machine, which is working great for all the machines on the network.

    F) Manual
    Note: Content-Security-Policy headers can be sent by Whoogle if you set WHOOGLE_CSP.

    Clone the repo and run the following commands to start the app in a local-only environment:

    git clone https://github.com/benbusby/whoogle-search.git
    cd whoogle-search
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    ./run

    Pip installed a list of requirements - you can check them out in the requirements.txt file.

    * "marginally faster" because less data is downloaded when the ads and javascript etc are stripped away - the search is probably not one nanosecond faster
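Once the manual install above is running, other machines on the network only need a URL pointing at it. The following is a minimal sketch of building such a URL, not something taken from the Whoogle source: the port (5000), the `/search` path, and the `q` parameter are assumptions about a default setup, and `whoogle_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def whoogle_search_url(query, host="localhost", port=5000):
    """Build a search URL for a self-hosted instance.

    Assumptions (not confirmed by the thread): the instance listens on
    port 5000 and accepts searches at /search with a "q" parameter.
    """
    return f"http://{host}:{port}/search?{urlencode({'q': query})}"


# Point a browser (or a custom search-engine entry) at the result.
print(whoogle_search_url("self-hosted metasearch"))
```

Replace `host` with the LAN address of the machine running the instance if you are searching from another computer.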

    --
    Abortion is the number one killed of children in the United States.
    • (Score: 2, Interesting) by Anonymous Coward on Thursday June 02 2022, @12:47AM (1 child)

      by Anonymous Coward on Thursday June 02 2022, @12:47AM (#1249576)

      could it be installed directly on an openwrt router ?

      • (Score: 2) by lentilla on Thursday June 02 2022, @08:01AM

        by lentilla (1770) on Thursday June 02 2022, @08:01AM (#1249659)

        Assuming you have enough resources (disk space, CPU and RAM), yes it could be installed directly on an OpenWRT installation.

        Realistically, if your OpenWRT device is a re-purposed Internet router it won't be fun. You will also have a fight on your hands with the dependencies - OpenWRT ships with minimal software to reduce its attack surface.

    • (Score: 3, Interesting) by Anonymous Coward on Thursday June 02 2022, @12:59AM (1 child)

      by Anonymous Coward on Thursday June 02 2022, @12:59AM (#1249578)

      Google still knows your IP address, and, presumably, can extrapolate your identity from it.

      if you run the docker image (like heck am I just running that code on a bare box), it appears as if it's routing requests to google via Tor so I don't think that searching reveals your IP to Google


      [notice] Tor 0.4.6.9 running on Linux with Libevent 2.1.12-stable, OpenSSL 1.1.1o, Zlib 1.2.12, Liblzma 5.2.5, Libzstd 1.5.0 and Unknown N/A as libc.
      [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning [torproject.org]
      [notice] Read configuration file "/etc/tor/torrc".
      [notice] Opening Socks listener on 127.0.0.1:9050
      [notice] Opened Socks listener connection (ready) on 127.0.0.1:9050
      [notice] Opening Control listener on 127.0.0.1:9051
      [notice] Opened Control listener connection (ready) on 127.0.0.1:9051
      [warn] Fixing permissions on directory /var/lib/tor
      [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
      [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
      [notice] Bootstrapped 0% (starting): Starting
      [notice] Starting with guard context "default"
      [notice] New control connection opened from 127.0.0.1.
      [notice] Heartbeat: Tor's uptime is 0:00 hours, with 0 circuits open. I've sent 0 kB and received 0 kB. I've received 1 connections on IPv4 and 0 on IPv6. I've made 0 connections with IPv4 and 0 with IPv6.
      [notice] Bootstrapped 5% (conn): Connecting to a relay
      [notice] Bootstrapped 10% (conn_done): Connected to a relay
      [notice] Bootstrapped 14% (handshake): Handshaking with a relay
      [notice] Bootstrapped 15% (handshake_done): Handshake with a relay done
      [notice] Bootstrapped 20% (onehop_create): Establishing an encrypted directory connection
      [notice] Bootstrapped 25% (requesting_status): Asking for networkstatus consensus
      [notice] Bootstrapped 30% (loading_status): Loading networkstatus consensus
      [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
      [notice] Bootstrapped 40% (loading_keys): Loading authority key certs
      [notice] The current consensus has no exit nodes. Tor can only build internal paths, such as paths to onion services.
      [notice] Bootstrapped 45% (requesting_descriptors): Asking for relay descriptors
      [notice] I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/7174, and can only build 0% of likely paths. (We have 0% of guards bw, 0% of midpoint bw, and 0% of end bw (no exits in consensus, using mid) = 0% of path bw.)
      ddg bangs json
      [notice] Bootstrapped 50% (loading_descriptors): Loading relay descriptors
      [notice] The current consensus contains exit nodes. Tor can build exit and internal paths.
      [notice] Bootstrapped 55% (loading_descriptors): Loading relay descriptors
      [notice] Bootstrapped 61% (loading_descriptors): Loading relay descriptors
      [notice] Bootstrapped 69% (loading_descriptors): Loading relay descriptors
      [notice] Bootstrapped 75% (enough_dirinfo): Loaded enough directory info to build circuits
      [notice] Bootstrapped 80% (ap_conn): Connecting to a relay to build circuits
      [notice] Bootstrapped 85% (ap_conn_done): Connected to a relay to build circuits
      [notice] Bootstrapped 89% (ap_handshake): Finishing handshake with a relay to build circuits
      [notice] Bootstrapped 90% (ap_handshake_done): Handshake finished with a relay to build circuits
      [notice] Bootstrapped 95% (circuit_create): Establishing a Tor circuit
      [notice] Bootstrapped 100% (done): Done
      [notice] New control connection opened from 127.0.0.1.
      [notice] Heartbeat: Tor's uptime is 0:00 hours, with 9 circuits open. I've sent 482 kB and received 4.39 MB. I've received 2 connections on IPv4 and 0 on IPv6. I've made 6 connections with IPv4 and 0 with IPv6.

      I haven't dug into it more and so this may be wrong but it at least appears as if _some_ tor stuff is happening...
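The log above shows a SOCKS listener on 127.0.0.1:9050. A quick way to confirm that the listener is actually accepting connections is a plain TCP connect, sketched below under the assumption that you run it inside the same container or host as Tor. `port_open` is a hypothetical helper name, and a successful connect only shows that something is listening; it does not prove that searches are routed through Tor.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not open".
        return False


# 9050 is the SOCKS port reported in the log above.
print("SOCKS listener reachable:", port_open("127.0.0.1", 9050))
```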

      • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @06:06AM

        by Anonymous Coward on Thursday June 02 2022, @06:06AM (#1249639)

        Looks like it's running a Tor server.

        A Tor server can route DNS requests over Tor as well.

        Edit torrc to add/enable:

        DNSPort 53
        ServerDNSResolvConfFile /etc/tor/resolv.conf

        Have that resolv.conf contain: nameserver 127.0.0.1

        Small downside is Tor only supports UDP DNS requests.

    • (Score: 3, Interesting) by Booga1 on Thursday June 02 2022, @02:03AM (2 children)

      by Booga1 (6333) on Thursday June 02 2022, @02:03AM (#1249583)

      My question is: Does it disable all the tracking links?
      I'm talking about the links that show you the target URL when you hover over a link but as soon as you click or right click on it, the link is changed to the Google tracking link that redirects your traffic so Google can monitor who clicked it.
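For illustration, the kind of redirect described above can be unwrapped mechanically. The sketch below assumes the tracking link has the form `/url?q=<target>` (or `url=<target>`); that format is an assumption rather than something stated in the thread, and this is not Whoogle's own code. Whether Whoogle does the equivalent is exactly the open question here.

```python
from urllib.parse import urlparse, parse_qs

# Hosts treated as the redirector; "" covers relative links like "/url?q=...".
REDIRECT_HOSTS = {"", "google.com", "www.google.com"}


def unwrap_redirect(link):
    """Return the target of a "/url?q=..." style redirect, else the link unchanged."""
    parsed = urlparse(link)
    if parsed.netloc in REDIRECT_HOSTS and parsed.path == "/url":
        params = parse_qs(parsed.query)
        for key in ("q", "url"):  # assumed parameter names
            if key in params:
                return params[key][0]
    return link


print(unwrap_redirect("https://www.google.com/url?q=https%3A%2F%2Fexample.org%2Fpage&sa=U"))
```

A link that does not match the assumed pattern is returned untouched, so the function is safe to apply to every result link.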

  • (Score: 2, Disagree) by Rosco P. Coltrane on Thursday June 02 2022, @01:57AM (1 child)

    by Rosco P. Coltrane (4757) on Thursday June 02 2022, @01:57AM (#1249581)

    Yes? Then they track you too.

    When it's free, you are the product. How else would they turn a profit?

    • (Score: 3, Insightful) by Runaway1956 on Thursday June 02 2022, @02:33AM

      by Runaway1956 (2926) on Thursday June 02 2022, @02:33AM (#1249592)

      Whoogle runs on my own machine - so - who is tracking me? Downloaded from github, the source code is available. I'm pretty illiterate when it comes to code, but, anyone who is literate can read it. If it is tracking, I'd like to know it, so that I can recommend that people don't use it.

      --
      Abortion is the number one killed of children in the United States.
  • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @03:36AM (3 children)

    by Anonymous Coward on Thursday June 02 2022, @03:36AM (#1249610)

    The main reason I go to Google for search is that, with JavaScript and cookies disabled, Google shows tabs for news, images, video, etc., whereas DDG with JavaScript and cookies disabled only shows the composite search results. If DDG remedied this deficiency, I would rarely go to Google.

    • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @03:59AM (2 children)

      by Anonymous Coward on Thursday June 02 2022, @03:59AM (#1249617)

      https://www.searchenginejournal.com/duckduckgo-microsoft-trackers/452006/ [searchenginejournal.com]

      DuckDuckGo died. Use Metager, Brave Search, or something else.

      • (Score: 3, Informative) by number11 on Thursday June 02 2022, @06:14PM (1 child)

        by number11 (1170) on Thursday June 02 2022, @06:14PM (#1249866)

        DDG browser leaks to MS. Yes. I had stopped using it because I liked a different browser better. There are a bunch of other browsers available.

        But is there any evidence that the DDG search engine leaks? The linked article says

        The company continues to promise protection from data trackers when conducting search queries on DuckDuckGo.com.

        In any case, that seems irrelevant to this Whoogle thing.

        • (Score: 2) by bart9h on Friday June 03 2022, @08:25PM

          by bart9h (767) on Friday June 03 2022, @08:25PM (#1250339)

          DDG browser leaks, not the DDG search page.

          Right?

  • (Score: 4, Insightful) by Anonymous Coward on Thursday June 02 2022, @03:57AM (5 children)

    by Anonymous Coward on Thursday June 02 2022, @03:57AM (#1249615)

    > Docker

    Stopped reading there. It's the VM equivalent of the 'Java Trap', and their tech is only being given away relatively-freely for now as they build up to be the dominant 'GitHub' of VM. The company's thumbscrews will get turned the more monopoly the Freedom Fighters hand them. ffs, it's like RMS was never born.

    • (Score: 2) by maxwell demon on Thursday June 02 2022, @08:11AM (4 children)

      by maxwell demon (1608) on Thursday June 02 2022, @08:11AM (#1249662)

      Could you please elaborate? I know next to nothing about Docker, but my impression was that it essentially gives you a clean environment where the software is installed. So what stops you from just moving the software from the Docker image to either a traditional VM or a separate machine if you want to get away from Docker?

      --
      The Tao of math: The numbers you can count are not the real numbers.
      • (Score: -1, Troll) by Anonymous Coward on Thursday June 02 2022, @08:45AM (1 child)

        by Anonymous Coward on Thursday June 02 2022, @08:45AM (#1249668)

        Why? If you cared, you'd be motivated to find out for yourself, and compare my statements of how Docker-the-Company operates against how RMS framed the original 'Java Trap' discussion, and base an informed opinion from it.

        There's no such thing as zero-effort McEnlightenment.

        • (Score: 1, Insightful) by Anonymous Coward on Thursday June 02 2022, @12:25PM

          by Anonymous Coward on Thursday June 02 2022, @12:25PM (#1249713)

          Ok, fine, you don't know.

      • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @02:17PM

        by Anonymous Coward on Thursday June 02 2022, @02:17PM (#1249753)

        I think there are a number of Docker alternatives: Podman, LXD, ...

      • (Score: 3, Interesting) by richtopia on Thursday June 02 2022, @09:31PM

        by richtopia (3160) on Thursday June 02 2022, @09:31PM (#1249962)

        I'm not the original poster but I can comment on the distaste for Docker: they've been reining in the free use of their proprietary services. The GitHub comparison is appropriate; they don't violate anything open source, but they are moving toward requiring a login on their website for more and more activities.

        I still use docker - it is well architected and works well. I am interested in migrating away if I could find something easier to use, however docker is so popular you can find a container for most services pre-built.

  • (Score: 2) by lentilla on Thursday June 02 2022, @08:11AM (8 children)

    by lentilla (1770) on Thursday June 02 2022, @08:11AM (#1249661)

    I understand the intense dislike that people have for advertisements and tracking - but isn't that the deal? I search using Google, Google tracks me and advertises to me.

    Now, I personally restrict my web browser. That's me - just one person. All the geeks in the world add up to a tiny percentage of the world's browsing population. Now it's one thing for me to "backdoor" the system - I'm a geek and that's my playground. Mechanics get awesome cars, accountants can identify good investments, and workers in chocolate factories get free candy bars. It's quite another thing for me to write software that provides the backdoor to all and sundry.

    • (Score: 5, Interesting) by maxwell demon on Thursday June 02 2022, @08:20AM (2 children)

      by maxwell demon (1608) on Thursday June 02 2022, @08:20AM (#1249663)

      Actually the ads on search are purely textual and clearly marked, therefore despite usually being quite annoyed by ads, I wouldn't mind those (well, unless Google has changed those ads since I last had seen them, which is quite some time ago). However the tracking is not acceptable to me, and the main reason I avoid Google search.

      And no, nobody can convince me that tracking is necessary for ads. Ads worked fine on TV already back when TVs were pure display devices that were technically incapable of sending back any data. Ads worked fine for magazines and newspapers without tracking ability. Ads worked fine on billboards without tracking. Ads in cinemas didn't track the viewers either. So why should tracking be necessary online?

      --
      The Tao of math: The numbers you can count are not the real numbers.
      • (Score: 3, Interesting) by lentilla on Thursday June 02 2022, @09:00AM (1 child)

        by lentilla (1770) on Thursday June 02 2022, @09:00AM (#1249672)

        I agree with you, but you didn't address my ethical conundrum.

        What we have here is a tool that (I guess) scrapes Google web searches, and supplies the results without the "special sauce". This allows the great unwashed to use Google resources without giving Google what it asks in return. It is (much as I hate the word) - stealing.

        As I argued above, it might be one thing for tech-savvy people to bypass the controls, but quite another thing to provide a tool that does it for everybody else.

        If everyone starts using this tool, then Google will up the ante. Ethically speaking, if we don't want to be tracked, then we collectively need to say no to tracking - which means we must deprive ourselves of the benefits that come from being tracked. If we don't agree with being tracked, the correct response is to eschew Google, not to bypass the tracking.

        • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @11:03AM

          by Anonymous Coward on Thursday June 02 2022, @11:03AM (#1249687)

          In the real world, that business model was made illegal. You can't give people things and then later on demand payment.

    • (Score: 3, Interesting) by Runaway1956 on Thursday June 02 2022, @08:34AM (4 children)

      by Runaway1956 (2926) on Thursday June 02 2022, @08:34AM (#1249666)

      but isn't that the deal?

      Few of us ever consented to that deal. Those who consent are probably making less than informed consent. That is the deal that has been imposed upon us. A deal isn't really a deal, unless all parties give their informed consent.

      Some history, from one man's perspective:

      Remember television? I can recall a fairly long lifetime of TV commercials. Perhaps 6 minutes of advertising per hour, when I was a small child. That resulted in 1/2 hour time slots showing about 27 minutes of TV show introduction, show, and credits, and 1 hour time slots showing about 54 minutes of the same. As time passed, the advertising exceeded 15 minutes every hour. I could go searching for the actual times, and how the advertising time grew over the years. I'm too lazy to do so.

      Radio programming, IMO, was always worse - it seems that after every 3-minute song, you listened to 3 minutes of doublemint jingles, etc.

      Advertisers moved onto the internet, demanding the same sort of accommodations. For every minute of internet time, they demand that you be exposed to multiple advertisements from multiple sources.

      As I alluded to in a previous post, I may be browsing news sites. The news, in text form, may be a few hundred kb, but to read that news, I also have to download megabytes of crap. The bandwidth used for advertising exceeds the bandwidth of my desired content by orders of magnitude.

      If that is the deal, then the deal sucks.

      Millions of us seek to alter that deal.

      --
      Abortion is the number one killed of children in the United States.
      • (Score: 3, Insightful) by lentilla on Thursday June 02 2022, @09:10AM (1 child)

        by lentilla (1770) on Thursday June 02 2022, @09:10AM (#1249674)

        As in my reply to maxwell demon above [soylentnews.org], I agree with you. Like I suggested above: "If we don't agree with being tracked, the correct response is to eschew Google, not to bypass the tracking."

        Those who consent are probably making less than informed consent.

        Sadly, you are spot on the money. I know what is happening, I avoid those sites or bypass the tracking - I'm lucky, I have the skills to identify and avoid. Others don't have these skills (although I often think ignorance would make my world seem a more rosy place). The question thus becomes: how do we (the people that know) help people that don't know understand the problem and make an informed ethical choice to avoid these services, no matter how shiny they are?

        • (Score: 5, Interesting) by deimtee on Thursday June 02 2022, @10:59AM

          by deimtee (3272) on Thursday June 02 2022, @10:59AM (#1249685)

          I respectfully disagree.
          Your position assumes that the other side is just as ethical as you, and that if you don't use their services they won't track you. That has been shown to be false, from 0,0 pixels and supercookies to Facebook's "ghost" profiles of people who won't sign up for an account.

          Showing everyone how to bypass the ads and tracking while still using the "shiny" is the best way to instigate change. Make the tracking both expensive and useless and eventually they'll stop doing it.

          --
          No problem is insoluble, but at Ksp = 2.943×10^-25 Mercury Sulphide comes close.
      • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @05:00PM (1 child)

        by Anonymous Coward on Thursday June 02 2022, @05:00PM (#1249829)

        Few of us ever consented to that deal.

        Almost all of us have consented to that deal, and continue to do so every day. If you use a website or app that tracks you and serves you ads, you can stop. Sure, then you wouldn't get the benefits of use. But you could stop. You may whine about how many ads there are or how much tracking there is, but unless you stop, you are consenting. You may be poorly informed about what you are consenting to, but you could read the terms of service.

  • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @12:23PM (2 children)

    by Anonymous Coward on Thursday June 02 2022, @12:23PM (#1249711)

    what DuckDuckGo is to Bing.

    Whores all the way down.

    • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @05:47PM

      by Anonymous Coward on Thursday June 02 2022, @05:47PM (#1249851)

      I'm guessing you didn't do too well on those analogies and comparisons part of the SAT, did you?

    • (Score: 0) by Anonymous Coward on Thursday June 02 2022, @06:22PM

      by Anonymous Coward on Thursday June 02 2022, @06:22PM (#1249869)

      This is an impersonator trying to discredit similarly worded posts by another author. This poster is purposely wrong while using the same words as another poster who is right. In this case acting like Whoogle is in any way equivalent to DDG.
