Software engineers go crazy for the most ridiculous things. We like to think that we're hyper-rational, but when we have to choose a technology, we end up in a kind of frenzy — bouncing from one person's Hacker News comment to another's blog post until, in a stupor, we float helplessly toward the brightest light and lie prone in front of it, oblivious to what we were looking for in the first place.
This is not how rational people make decisions, but it is how software engineers decide to use MapReduce.
As Joe Hellerstein side-ranted to his undergrad databases class (54 min in):
The thing is there's like 5 companies in the world that run jobs that big. For everybody else... you're doing all this I/O for fault tolerance that you didn't really need. People got kinda Google mania in the 2000s: "we'll do everything the way Google does because we also run the world's largest internet data service" [tilts head sideways and waits for laughter]
Having more fault tolerance than you need might sound fine, but consider the cost: not only would you be doing much more I/O, you might be switching from a mature system—with stuff like transactions, indexes, and query optimizers—to something relatively threadbare. What a major step backwards. How many Hadoop users make these tradeoffs consciously? How many of those users make these tradeoffs wisely?
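To make the tradeoff concrete, here is a minimal sketch (hypothetical toy data, not from the original post) contrasting the two routes on the same aggregation: a mature database handles it with one declarative query, transactions, and an optional index, while the MapReduce style makes you write the map, shuffle, and reduce phases yourself and gives you none of that for free.

```python
import sqlite3
from collections import defaultdict

# Toy dataset (hypothetical): (user, amount) purchase records.
rows = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 3), ("bob", 2)]

# Mature-database route: one declarative query. The engine picks the
# execution plan, can use the index, and wraps writes in a transaction.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE purchases (user TEXT, amount INTEGER)")
db.execute("CREATE INDEX idx_user ON purchases(user)")
db.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
sql_totals = dict(
    db.execute("SELECT user, SUM(amount) FROM purchases GROUP BY user")
)

# MapReduce-style route: we hand-code each phase ourselves.
def map_phase(records):
    # Emit (key, value) pairs, one per input record.
    for user, amount in records:
        yield user, amount

def shuffle(pairs):
    # Group all values by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate each key's values.
    return {key: sum(values) for key, values in groups.items()}

mr_totals = reduce_phase(shuffle(map_phase(rows)))
assert mr_totals == sql_totals  # same answer, very different machinery
```

The single-process version obviously skips the distributed I/O entirely; the point is only to show how much query-engine machinery you end up reimplementing by hand once you leave it behind.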
Source: https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb
(Score: 2) by Nerdfest on Saturday June 10 2017, @09:08PM
That's pretty funny, I was going to pick Facebook as a bad example as well. They overcame it, but it took quite a lot of effort. As someone else mentioned, imagine if they used proprietary tech. I think StackOverflow did for software (and I'm not sure how much they get boned on licences), but they went cheap on hardware, which was good. I've worked for places where they "buy" lameframe processing from IBM and what would run on a few PCs costs millions.