
posted by n1 on Thursday July 28 2016, @06:30AM
from the they-forgot-about-mssql dept.

[redacted] Coward writes:

https://eng.uber.com/mysql-migration/

The early architecture of Uber consisted of a monolithic backend application written in Python that used Postgres for data persistence. Since that time, the architecture of Uber has changed significantly, to a model of microservices and new data platforms. Specifically, in many of the cases where we previously used Postgres, we now use Schemaless, a novel database sharding layer built on top of MySQL. In this article, we’ll explore some of the drawbacks we found with Postgres and explain the decision to build Schemaless and other backend services on top of MySQL.

[...] We encountered many Postgres limitations:

Inefficient architecture for writes
Inefficient data replication
Issues with table corruption
Poor replica MVCC support
Difficulty upgrading to newer releases



 
  • (Score: 2, Informative) by Bronster (356) on Friday July 29 2016, @02:52AM (#381407)

    I'd be happy to correct you here, as others already have.

    InnoDB (the MySQL backend engine that Uber are using) is fully ACID compliant. Your continual harping on that incorrect data point already disqualifies your opinion.

    I've used both engines, but only MySQL with replication. Cross-datacentre traffic is a real issue, even at much lower scale than Uber. We ship compressed SQL statements at FastMail, and every time I've considered switching to Postgres I would have built or required the same replication strategy - both for protection against replicating corruption, and for efficiency. So far we haven't found enough advantages to Postgres to justify the switching cost.
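To make "ship compressed SQL statements" concrete, here is a minimal sketch of the general idea. It is not FastMail's actual wire format: the function names, the JSON framing and the sample statements are all assumptions made for illustration.

```python
import json
import zlib


def pack_statements(statements):
    """Serialise a batch of SQL statements and compress it for shipping.

    Hypothetical framing: a JSON list, UTF-8 encoded, then zlib at level 9.
    """
    payload = json.dumps(statements).encode("utf-8")
    return zlib.compress(payload, 9)


def unpack_statements(blob):
    """Reverse pack_statements on the receiving side."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Statement logs tend to be repetitive, so they usually compress well, which is what makes this attractive for cross-datacentre links.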

    We've been running MySQL for over 10 years with no data loss or crash that can be attributed to MySQL. We've upgraded multiple versions with no downtime due to replication being forwards compatible. That's pretty amazing really. You can't just "shut down" something like Uber while you perform a database upgrade.

    Our time tracking system when we were part of Opera had to get upgraded. I'm pretty sure it was Oracle under the hood. They took a _1 week_ maintenance window to upgrade it. Can you imagine taking Uber offline for a week? Insanity. Upgradability without downtime or massive performance hits isn't "nice to have", it's business critical. And being able to bring up replicas with the new code first and then pivot - yep, that's what we do too - it's how you keep the lights on while making major changes.
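The "bring up replicas with the new code first and then pivot" approach can be sketched as a toy model. Nothing here is a real MySQL API: the `Node` class, the integer version numbers and the rule that a replica may be newer than its primary but never older are assumptions standing in for "replication being forwards compatible".

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: int  # hypothetical version number, e.g. 56 or 57
    role: str     # "primary", "replica" or "offline"


def replication_ok(nodes):
    """True if there is exactly one primary and no replica is older than it."""
    primaries = [n for n in nodes if n.role == "primary"]
    if len(primaries) != 1:
        return False
    primary = primaries[0]
    return all(n.version >= primary.version for n in nodes if n.role == "replica")


def rolling_upgrade(nodes, new_version):
    """Upgrade replicas first, pivot to one of them, then upgrade the old primary.

    Returns a log of the steps taken; raises if any step breaks replication.
    """
    log = []
    old_primary = next(n for n in nodes if n.role == "primary")
    replicas = [n for n in nodes if n.role == "replica"]
    if not replicas:
        raise ValueError("need at least one replica to pivot to")

    def check(step):
        if not replication_ok(nodes):
            raise RuntimeError("replication broken after: " + step)
        log.append(step)

    # 1. Upgrade the replicas while the old primary keeps serving traffic.
    for replica in replicas:
        replica.version = new_version
        check("upgraded replica " + replica.name)

    # 2. Pivot: take the old primary out and promote an upgraded replica.
    new_primary = replicas[0]
    old_primary.role = "offline"
    new_primary.role = "primary"
    check("promoted " + new_primary.name)

    # 3. Upgrade the old primary and let it rejoin as a replica.
    old_primary.version = new_version
    old_primary.role = "replica"
    check("rejoined " + old_primary.name)
    return log
```

The point of the ordering is that a primary is always available and no replica ever has to read a stream from a newer primary than itself.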

    In summary - your points against MySQL are FUD, pure and simple. You don't actually know anything about it and you're talking shit based on hearsay. Shame on you. I read the Uber article and I learned valuable things to check in any new replication system we might consider to replace the current MySQL that we're using, or any redesign work we might be doing to the Cyrus IMAP server where I am one of the authors of the replication system.

  • (Score: 3, Interesting) by Common Joe (33) on Friday July 29 2016, @05:54AM (#381441)

    Allow me 1) to say thank you for teaching me something and 2) to correct you. It took some digging, but I know where our misunderstanding comes from.

    > In summary - your points against MySQL are FUD, pure and simple. You don't actually know anything about it and you're talking shit based on hearsay. Shame on you.

    First of all, I was not spreading "Fear, Uncertainty, and Doubt". And you, as a MySQL fanboy, should understand MySQL's history, because I am not talking shit or repeating hearsay. As a matter of fact, it is still entirely possible that MySQL is not ACID compliant with certain settings even today.

    With that said, you have taught me that MySQL can be ACID compliant. Let's get into specifics of our misunderstanding so that you may grow and correct others properly.

    First of all, I did a quick Google search for "Is MySQL ACID Compliant?" [google.de]. The answer that came back was no. (This is how bad MySQL's reputation is.) Looking at the answer again, it seems to be pulled from 2001, which is quite unfair to MySQL.

    This article [ronaldbradford.com] seems to have the best answers out of anything I've found. Apparently, MySQL can have different database engines. The MyISAM engine is not ACID compliant while the InnoDB engine is ACID compliant. Version 5.5 came out in December 2010 [wikipedia.org] and was the first version to default to the InnoDB engine. So, by default, MySQL was not ACID compliant before December 2010. To further quote the article:

    > But the damage to the ecosystem that uses MySQL, that is many thousands of open source projects, and the resources that work with MySQL has been done. Recently working on a MySQL 5.5 production system in 2016, the default engine was specifically defined in the configuration defined as MyISAM, and some (but not all tables) were defined using MyISAM.

    In a LAMP setup, who knew what you were getting? That is where MySQL got a well-deserved bad reputation that lingers even today. It is also one of a thousand reasons why the MySQL developers left to create a fork called MariaDB, which I understand was (is?) much better than MySQL. Interestingly enough, this guy [rdx.com] insisted that MySQL 5.5 was still not ACID compliant, although that was in November 2010, before general release, at about version 5.5.6. (I don't know what engine he was using and I'm not going to dig into the particulars of his blog. I just thought it was interesting.)

    This article [rackspace.com] specifically gets into the differences between MyISAM and InnoDB. What scares me is that, if I'm understanding correctly, specific tables can use one engine or the other. Holy shit. In my opinion, that is messed up and you're asking for a hell of a lot of trouble if you do anything like that.
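For anyone wanting to check whether a schema mixes engines per table, a rough sketch follows. The query uses the standard `information_schema.tables` columns; the `audit_engines` helper and the choice to treat only InnoDB as transactional are assumptions for illustration, and the helper works on rows already fetched by whatever driver is in use.

```python
# Query that could be run through any MySQL driver to produce the input rows.
ENGINE_QUERY = (
    "SELECT table_name, engine FROM information_schema.tables "
    "WHERE table_schema = %s AND table_type = 'BASE TABLE'"
)

# Assumption for this sketch: only InnoDB counts as transactional.
TRANSACTIONAL_ENGINES = {"innodb"}


def audit_engines(rows):
    """Summarise (table_name, engine) rows.

    Returns the tables grouped by engine, the tables on non-transactional
    engines, and whether the schema mixes more than one engine.
    """
    by_engine = {}
    for table, engine in rows:
        by_engine.setdefault(engine, []).append(table)
    risky = sorted(
        table for table, engine in rows
        if (engine or "").lower() not in TRANSACTIONAL_ENGINES
    )
    return {
        "by_engine": by_engine,
        "non_transactional": risky,
        "mixed": len(by_engine) > 1,
    }
```

A transaction that touches both kinds of table can roll back only the InnoDB part, which is the sort of trouble mixing engines invites.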

    So, in short, you may correct me, but don't tell me I'm talking shit. I've been around the block with databases long enough to know to give MySQL a wide berth. Hell, because MySQL is owned by Oracle, I would still rather use MariaDB than MySQL for that one reason alone. I mean, what can you say when Oracle doesn't care enough about MySQL to fix the reputation and confusion they've generated?

    > I've used both engines, but only mysql with replication.

    You've used MyISAM? That one is not ACID compliant. I'm not saying it's the wrong choice. I just hope you understand the pros and cons.

    > So far we haven't found enough advantages to Postgres to justify the switching cost.

    Fair enough. Changing databases is often expensive. And I will admit that MySQL had a lot better tools than PostgreSQL for many, many years -- especially when it came to replication.

    > We've been running mysql for over 10 years with no dataloss or crash that can be attributed to mysql.

    Lucky you. Others weren't so lucky because they used the defaults from 10 years ago.

    > We've upgraded multiple versions with no downtime due to replication being forwards compatible. That's pretty amazing really. You can't just "shut down" something like Uber while you perform a database upgrade.

    I'm no DBA, so I don't understand what you did that would be different from PostgreSQL. The MySQL upgrade documentation [mysql.com] says:

    > mysql_upgrade processes all tables in all databases, which might take a long time to complete. Each table is locked and therefore unavailable to other sessions while it is being processed. Check and repair operations can be time-consuming, particularly for large tables.

    So I don't know what you did differently from PostgreSQL. I mean, frankly, if there is a single database running in production that needs to be up 24/7, something is wrong. With APIs and interfaces, I would imagine an application should be able to handle different versions of databases without taking out production if it were critical to keep it up and going.

    > Our time tracking system when we were part of Opera had to get upgraded. I'm pretty sure it was Oracle under the hood. They took a _1 week_ maintenance window to upgrade it. Can you imagine taking Uber offline for a week? Insanity

    A time tracking system took a week to update? It sounds like there was a problem with the programmers who made the time tracking system. I worked on an Oracle system where the DBAs did the updates in stages for this reason. (The entire update took the entire weekend.) The application has since been updated to better handle very large tables during the upgrade process.

    I don't know what makes MySQL special or faster. If you have a reason why it is faster on upgrades, I'd be interested to see it. Otherwise, I would be very careful that a future upgrade doesn't cost you dearly in time. You may be sitting on a time bomb.