
SoylentNews is people

posted by n1 on Saturday June 10 2017, @11:07AM
from the i-am-spartacus dept.

Software engineers go crazy for the most ridiculous things. We like to think that we're hyper-rational, but when we have to choose a technology, we end up in a kind of frenzy — bouncing from one person's Hacker News comment to another's blog post until, in a stupor, we float helplessly toward the brightest light and lay prone in front of it, oblivious to what we were looking for in the first place.

This is not how rational people make decisions, but it is how software engineers decide to use MapReduce.

As Joe Hellerstein sideranted to his undergrad databases class (54 min in):

The thing is there's like 5 companies in the world that run jobs that big. For everybody else... you're doing all this I/O for fault tolerance that you didn't really need. People got kinda Google mania in the 2000s: "we'll do everything the way Google does because we also run the world's largest internet data service" [tilts head sideways and waits for laughter]

Having more fault tolerance than you need might sound fine, but consider the cost: not only would you be doing much more I/O, you might be switching from a mature system—with stuff like transactions, indexes, and query optimizers—to something relatively threadbare. What a major step backwards. How many Hadoop users make these tradeoffs consciously? How many of those users make these tradeoffs wisely?

Source: https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb


  • (Score: 2, Interesting) by isj on Saturday June 10 2017, @03:02PM (1 child)

    by isj (5249) on Saturday June 10 2017, @03:02PM (#523497)

    But I still feel the lure of new interesting systems and technologies. I'm just suffering from the illusion that my experience makes me able to step back and think about it.

    The blog post is spot on.

    In a project I was fooled into using NoSQL databases (couchdb and riak in this case). It turned out to be a bad idea because most of the data was not easily sharded or would give very unbalanced shards. Add the lack of referential integrity, which meant the application would have to deal with inconsistencies, and I ended up scrapping it for a more traditional rdbms for that part of the data. I was able to do so because I'm the main developer and it's a small company. Some of the data stayed in riak because it was easily sharded. It wasn't fun anyway because the minimum installation requires 3 instances and sometimes it didn't recover from rebalancing. Another developer was struggling with the data retrieval and after months ended up with something that mostly worked (he wasn't as experienced as me so I don't blame him). I ended up scrapping that part too and replacing it with flat text files (for restore and later analysis) and pushing aggregates into a fancy statistics system, grafana+opentsdb, so the users can see graphs, which is really what they needed.

    Yes, I know that OpenTSDB uses HBase which is essentially Hadoop. I'm fine with that because we only use one instance and I don't have to deal with the actual read+write to that.

    I think a useful mindset is: All systems are shit in some area. Choose the one that is least shitty.

  • (Score: 0) by Anonymous Coward on Saturday June 10 2017, @05:52PM

    by Anonymous Coward on Saturday June 10 2017, @05:52PM (#523531)

    I ended up scrapping it for a more traditional rdbms

    I've been telling people this for years:
    There are two sets of people: those who think they need a NoSQL database and are wrong on one side, and on the other side those that use an RDBMS.