SoylentNews Comments | Search Engine Findx Shuts Down

Search Engine Findx Shuts Down

posted by takyon on Wednesday November 21 2018, @04:00PM

from the found-and-lost dept.

isj writes:

The privacy-oriented search engine Findx has shut down: https://privacore.github.io/

The reasons cited are:

- While people are starting to understand the importance of privacy, it is a major hurdle to get them to select a different search engine.
- Search engines eat resources like crazy, so operating costs are non-negligible.
- Some sites (including e.g. GitHub) use a whitelist in robots.txt, blocking new crawlers.
- The amount of spam, link-farms, referrer-linking, etc. is beyond your worst nightmare.
- Returning good results takes a long time to fine-tune.
- Monetizing is nearly impossible because advertising networks want to know everything about the users, going against privacy concerns.
- Buying search results from other search engines is impossible until you have at least x million searches/month. Getting x million searches/month is impossible unless you buy search results from other search engines (or sink a lot of cash into making it yourself).

So what do you soylentils think can be done to increase privacy for ordinary users, search-engine-wise?

Disclaimer: I worked at Findx.

Original Submission

This discussion has been archived. No new comments can be posted.


Re:Let's have a techie discussion. (Score: 4, Interesting) by isj on Wednesday November 21 2018, @07:14PM

by isj (5249) on Wednesday November 21 2018, @07:14PM (#764902)

No, you definitely don't want to store it in a standard SQL database, even with full-text search.

Look for "inverted indexes".
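For anyone who hasn't met the term: an inverted index maps each term to the documents (and optionally the positions) where it occurs, so a query only touches the posting lists of its own terms. A minimal in-memory sketch follows. The names `tokenize`, `build_inverted_index` and `search_all` and the sample documents are made up for illustration, and this is not how Findx stored its index.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions of the term in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search_all(index, query):
    """Return the ids of documents that contain every query term (AND query)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [set(index.get(term, {})) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents, only for illustration.
docs = {
    1: "Privacy-oriented search engine shuts down",
    2: "Search engines eat resources like crazy",
    3: "Crawlers respect robots.txt",
}
index = build_inverted_index(docs)
print(search_all(index, "search engine"))
```

Note that "engine" and "engines" are different terms here, so the two-word query only matches document 1. A real engine would keep the posting lists compressed on disk rather than in Python dicts.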

Depending on the goal of your search engine you may be able to reduce the index size with:
- lemmatization
- stemming
- if word order doesn't matter then store occurrences only once per document
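A rough sketch of how two of these reductions shrink an index: normalizing terms merges several word forms into one key, and dropping positions stores each term only once per document. `crude_stem` is a deliberately naive suffix stripper invented for this example, not a real stemmer or lemmatizer (Porter-style stemmers are far more careful), and the other names and documents are likewise hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(term):
    """Toy suffix stripper for illustration only, NOT a real stemmer."""
    for suffix in ("ing", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def build_index(docs, normalize=lambda term: term, keep_positions=True):
    """Build term -> {doc_id: [positions]}, or term -> {doc_ids} without positions."""
    index = {}
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            key = normalize(term)
            if keep_positions:
                index.setdefault(key, {}).setdefault(doc_id, []).append(pos)
            else:
                index.setdefault(key, set()).add(doc_id)
    return index


def posting_entries(index):
    """Count stored entries: positions if they are kept, otherwise doc ids."""
    total = 0
    for postings in index.values():
        if isinstance(postings, dict):
            total += sum(len(positions) for positions in postings.values())
        else:
            total += len(postings)
    return total


# Hypothetical sample documents, only for illustration.
docs = {
    1: "crawling crawlers crawl the crawl",
    2: "search engines search engine",
}
full = build_index(docs)
small = build_index(docs, normalize=crude_stem, keep_positions=False)
print(len(full), posting_entries(full))
print(len(small), posting_entries(small))
```

The trade-off is the one stated in the list: without positions you can no longer answer phrase or proximity queries, so this only works when word order doesn't matter.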

If the document set is relatively uniform (say, a set of scientific papers, or a set of children's books, but not a mix of both) then you can use the BM25 ranking algorithm to get reasonably good results.
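To make BM25 a bit more concrete, here is a small sketch of the commonly cited Okapi BM25 formula. It uses the "+1 inside the log" IDF variant, which keeps IDF non-negative, and the usual default-ish parameters k1=1.2 and b=0.75; both are assumptions on my part, as are the function names and sample documents. It assumes a non-empty document set and recomputes everything per query, which a real engine would never do.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(docs, query, k1=1.2, b=0.75):
    """Return (doc_id, score) pairs sorted by descending BM25 score."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(terms) for terms in tokenized.values()) / n_docs

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for terms in tokenized.values():
        df.update(set(terms))

    query_terms = tokenize(query)
    scores = {}
    for doc_id, terms in tokenized.items():
        tf = Counter(terms)
        score = 0.0
        for q in query_terms:
            if tf[q] == 0:
                continue
            idf = math.log((n_docs - df[q] + 0.5) / (df[q] + 0.5) + 1.0)
            # Length normalization: long documents get less credit per occurrence.
            norm = tf[q] + k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf[q] * (k1 + 1) / norm
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample documents, only for illustration.
docs = {
    1: "inverted index for search engines",
    2: "children's book about a cat",
    3: "search search search ranking",
}
ranked = bm25_rank(docs, "search")
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

The length normalization is one reason the uniformity caveat matters: BM25 compares every document against the average length and term statistics of the whole collection, so a mix of very different document types skews those statistics.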


Starting Score:	1		point
Moderation		+2
Interesting=2, Total=2
Extra 'Interesting' Modifier		0
Karma-Bonus Modifier		+1

Total Score:		4
