Stories
Slash Boxes
Comments

SoylentNews is people

posted by n1 on Thursday July 28 2016, @06:30AM   Printer-friendly
from the they-forgot-about-mssql dept.

[redacted] Coward writes:

https://eng.uber.com/mysql-migration/

The early architecture of Uber consisted of a monolithic backend application written in Python that used Postgres for data persistence. Since that time, the architecture of Uber has changed significantly, to a model of microservices and new data platforms. Specifically, in many of the cases where we previously used Postgres, we now use Schemaless, a novel database sharding layer built on top of MySQL. In this article, we’ll explore some of the drawbacks we found with Postgres and explain the decision to build Schemaless and other backend services on top of MySQL.

[...] We encountered many Postgres limitations:

Inefficient architecture for writes
Inefficient data replication
Issues with table corruption
Poor replica MVCC support
Difficulty upgrading to newer releases


Original Submission

 
This discussion has been archived. No new comments can be posted.
Display Options Threshold/Breakthrough Mark All as Read Mark All as Unread
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 1, Disagree) by Anonymous Coward on Thursday July 28 2016, @06:42AM

    by Anonymous Coward on Thursday July 28 2016, @06:42AM (#381080)

    The real news here is trendy Uber appers were using dinosaur tech in the first place. MySQL has been the darling of lampers for, like, ever.

    • (Score: 3, Interesting) by frojack on Thursday July 28 2016, @07:30AM

      by frojack (1554) Subscriber Badge on Thursday July 28 2016, @07:30AM (#381094) Journal

      The real news here is trendy Uber appers were using dinosaur tech in the first place. MySQL has been the darling of lampers for, like, ever.

      Not so fast.

      MySQL and Postgres have been dukeing it out for decades, and there was a loooong period of time when MySQL was just the upstart usurper, with a bunch of missing features and fragile releases. By the time Oracle stepped in it was pretty solid and had added most of the features it needed. Good thing too, because lots of devs ran away to MariaDB.

      Probably when Uber hacked what ever they converted to start their system they should have started from scratch, but chances are they had something working when they started, or the only tool they had was a hammer so all the problems looked like nails.

      --
      No, you are mistaken. I've always had this sig.
    • (Score: 5, Touché) by wonkey_monkey on Thursday July 28 2016, @07:44AM

      by wonkey_monkey (279) on Thursday July 28 2016, @07:44AM (#381095) Homepage

      MySQL has been the darling of lampers for, like, ever.

      Well... yeah. Kinda by definition.

      --
      systemd is Roko's Basilisk
    • (Score: 2) by driverless on Friday July 29 2016, @09:29AM

      by driverless (4770) on Friday July 29 2016, @09:29AM (#381461)

      The real news here is trendy Uber appers were using dinosaur tech in the first place.

      Yeah, but it's not terribly interesting stuff, I mean switching to shemales is pretty passe. Now if I got picked up by a car driven by some hot futa, that'd be another thing.

  • (Score: 3, Funny) by Anonymous Coward on Thursday July 28 2016, @07:25AM

    by Anonymous Coward on Thursday July 28 2016, @07:25AM (#381091)

    [...] We encountered many Postgres limitations:

    Inefficient architecture for writes
    Inefficient data replication
    Issues with table corruption
    Poor replica MVCC support
    Difficulty upgrading to newer releases
    Inability to screw over drivers
    Excessive accuracy on actual drive times
    No way to overvalue tips to drivers
    Inability to determine sex-offender status of driver, or passengers
    Quantum instability on stock price

    Fixed some of that for you, bro!

  • (Score: 3, Insightful) by mtrycz on Thursday July 28 2016, @07:50AM

    by mtrycz (60) on Thursday July 28 2016, @07:50AM (#381096)

    Postgres, OTOH, has other (functional) advantages over MySQL/Maria. I've just skimmed thru TFA and it looks to me that Uber needed architectural advantages, not functional; namely scaling-out (because you can scale-up just so much) (insert obligatory mongodb joke here).

    There are tradeoffs to be made, when you're a big company, or a company that needs to grow so fast.

    --
    In capitalist America, ads view YOU!
    • (Score: 5, Interesting) by Common Joe on Thursday July 28 2016, @09:45AM

      by Common Joe (33) <common.joe.0101NO@SPAMgmail.com> on Thursday July 28 2016, @09:45AM (#381117) Journal

      Fully agree.

      I didn't read the whole thing, but I read portions. The only relevant complaint that jumped out at me was the 9.2 bug that corrupted... data? I don't know. He was complaining about his indexes being corrupted in the second half of that section. I didn't fully follow what he was trying to say. Corrupted indexes are very bad but easily fixable. Corrupted data is a lot worse, though.

      And I completely agree that this is an architectural problem. Some of things the author is complaining about are safety mechanisms in PostgreSQL to ensure ACID compliance. And a quick Google search says MySQL is still not ACID compliant. On this one point alone, I say PostgreSQL is superior. Yes, of course you'll take a speed hit for ACID compliance. I'm quite sure many MySQL people don't understand how important this is... the like author maybe? He's kind of bitching about the bug that corrupted data. Well, you have to expect that sometimes in a non-ACID environment. This doesn't let PostgreSQL off the hook for the bug, but it sort of nullifies his argument for using MySQL.

      In the section on write amplification, he says

      For instance, if we have a table with a dozen indexes defined on it, an update to a field that is only covered by a single index must be propagated into all 12 indexes to reflect the ctid for the new row.

      And? The reason PostgreSQL writes a complete new record (and corresponding 12 indexes) is to ensure that the user running a long running query isn't adversely affected by another user updating the data. I seriously question the use of 12 indexes on a table with a lot of writes, though. Of course it's going to be slow for large tables. That's an architectural issue, not a PostgreSQL problem.

      Yes everything is case dependent and PostgreSQL is not perfect for everything, but I have yet to see any argument where MySQL is superior to PostgreSQL with high volume usage that requires data to be absolutely pristine. (And if anyone has anything, I'd be happy to read it.)

      I suppose on the plus side, this article will keep the PostgreSQL guys on their toes. Competition is good to keep the project healthy.

      • (Score: 3, Touché) by sendafiolorkar on Thursday July 28 2016, @12:37PM

        by sendafiolorkar (6300) on Thursday July 28 2016, @12:37PM (#381152)

        Folks, this is why i visit soylentnews!

        • (Score: -1, Troll) by Anonymous Coward on Thursday July 28 2016, @01:05PM

          by Anonymous Coward on Thursday July 28 2016, @01:05PM (#381159)
          To get misinformed opinions from people who don't read stuff properly nor understand it? You can get a lot of that from Slashdot too.
      • (Score: 4, Informative) by Anonymous Coward on Thursday July 28 2016, @01:02PM

        by Anonymous Coward on Thursday July 28 2016, @01:02PM (#381158)

        I didn't read the whole thing, but I read portions.

        The reason PostgreSQL writes a complete new record (and corresponding 12 indexes) is to ensure that the user running a long running query isn't adversely affected by another user updating the data

        You should read and understand more first before shooting your mouth off. MySQL InnoDB handles the scenario you mention with MVCC (just like Postgres), but they implement it differently. Read this bit in particular:

        This design means that InnoDB is at a slight disadvantage to Postgres when doing a secondary key lookup, since two indexes must be searched with InnoDB compared to just one for Postgres. However, because the data is normalized, row updates only need to update index records that are actually changed by the row update. Additionally, InnoDB typically does row updates in place. If old transactions need to reference a row for the purposes of MVCC MySQL copies the old row into a special area called the rollback segment.

        Let’s follow what happens when we update al-Khwārizmī’s birth year. If there is space, the birth year field in the row with id 4 is updated in place (in fact, this update always happens in place, as the birth year is an integer that occupies a fixed amount of space). The birth year index is also updated in place to reflect the new date. The old row data is copied to the rollback segment. The primary key index does not need to be updated, nor does the (first, last) name index. If we have a large number of indexes on this table, we still only have to update the indexes that actually index over the birth_year field. So say we have indexes over fields like signup_date, last_login_time, etc. We don’t need to update these indexes, whereas Postgres would have to.

        Perhaps there are other scenarios where Postgresql's design would work better than MySQL InnoDB but for Uber's case MySQL works better.

        • (Score: 4, Interesting) by Common Joe on Thursday July 28 2016, @01:56PM

          by Common Joe (33) <common.joe.0101NO@SPAMgmail.com> on Thursday July 28 2016, @01:56PM (#381184) Journal

          I gave you an "informative point" for clarification on your rebuttal.

          Maybe MySQL is the better choice for Uber. However, this design only buys you a certain amount of power in this particular circumstance. As mtrycz said at the top of this thread "it looks to me that Uber needed architectural advantages, not functional; namely scaling-out (because you can scale-up just so much)". They are going to bump into a scale-up issue even with MySQL. In the end, with both databases, architectural issues will have to be addressed. In my experience, every major program has weaknesses that have to be worked around; I think they'll find something equally as frustrating in MySQL. I suppose this point (heavy writing with many indexes) can be viewed as a weakness in PostgreSQL, but I still question changing the database over this one issue. But they listed several, so let's explore.

          From their list of grievances at the top: 1) Inefficient architecture for writes; I disagree. There are some pretty hefty databases that use PostgreSQL with heavy writing. 2) Inefficient data replication and poor MVCC support; I will grant you that MySQL was better than PostgreSQL for a long time. I don't know how they measure up now since PostgreSQL has made very significant strides in the past three to five years concerning replication. It's hard to make apples-to-apples comparisons with databases. MySQL may still have the edge with speed and ease here. 3) Issues with table corruption; definitely a ding against PostgreSQL, but MySQL is going to have table corruption also because of ACID compliance 4) Difficulty upgrading to newer release; This to me says they have architectural issues. On large tables, everyone would love updates to go quickly, but on a table with millions and millions of rows, that just isn't going to happen in any database. The application will have to adapt to it. According to Wikipedia [wikipedia.org], this kind of thing happens about once a year for PostgreSQL.

          My biggest problem is still the ACID stuff. They want perfect data, but they don't want to go ACID.

          Am I a DBA? No. Am I a database expert? I wouldn't consider myself such. (And so, if you find something wrong with what I say, please continue to correct me.) I just found it hard to digest many of their arguments, though. And I still find it hard to digest their argument about "write speeds".

          • (Score: 2) by Common Joe on Thursday July 28 2016, @02:09PM

            by Common Joe (33) <common.joe.0101NO@SPAMgmail.com> on Thursday July 28 2016, @02:09PM (#381190) Journal

            One important point of clarification for those who find this thread interesting (which many seem to). This whole discussion about the 12 indexes and MySQL vs PostgreSQL is only valid for update statements applied to existing rows. Every insert statement by either database will face the same ugly write penalty. This is why having many indexes on write-heavy tables are generally frowned upon.

            • (Score: 1, Interesting) by Anonymous Coward on Thursday July 28 2016, @03:13PM

              by Anonymous Coward on Thursday July 28 2016, @03:13PM (#381219)

              FWIW I prefer Postgresql to MySQL. I don't need to go Uber scale anytime soon, so hopefully by the time I might encounter such an issue the Postgres bunch would have worked out a way around it.

              They seem to acknowledge there is a problem: https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg290116.html [mail-archive.com]

          • (Score: 2, Funny) by Anonymous Coward on Thursday July 28 2016, @02:39PM

            by Anonymous Coward on Thursday July 28 2016, @02:39PM (#381203)

            Will you two just stop disagreeing with each other in a civil manner? This is the internet and one of you is an AC! Fight! Fight! Fight!

            • (Score: 1, Touché) by Anonymous Coward on Thursday July 28 2016, @02:50PM

              by Anonymous Coward on Thursday July 28 2016, @02:50PM (#381208)
              I wasn't very civil. In my earlier post I told him not to shoot his mouth off. Does that make you feel better?
              • (Score: 3, Touché) by Common Joe on Thursday July 28 2016, @03:14PM

                by Common Joe (33) <common.joe.0101NO@SPAMgmail.com> on Thursday July 28 2016, @03:14PM (#381220) Journal

                Sounds like you're the AC who was "uncivil" to me. I'm glad you're reading my followup stuff.

                What I'm most happy about is you had a good and informative reply. (Thank you.) Yeah, you were a bit uncivil, but you're an AC, this is the Internet, and I tried to pay attention more to learning from your response.

              • (Score: 0) by Anonymous Coward on Thursday July 28 2016, @03:46PM

                by Anonymous Coward on Thursday July 28 2016, @03:46PM (#381225)

                Does that make you feel better?

                Not exactly. There was no name calling or personal attacks. Not even mention of his insensitiveness or his clodiness. Ugh, I'm going over to 4chan to balance the internet's scales.

                • (Score: 2) by bzipitidoo on Thursday July 28 2016, @04:31PM

                  by bzipitidoo (4388) Subscriber Badge on Thursday July 28 2016, @04:31PM (#381244) Journal

                  You want a fight? Ask these:

                  Which P in LAMP is best, PHP, Python, or Perl?

                  And for that matter, Apache, nginx, lighttpd, or something else?

                  vi or emacs?

                  Is systemd Lucifer in the flesh?

                  SoylentNews or Slashdot?

                  • (Score: 2) by mtrycz on Thursday July 28 2016, @07:51PM

                    by mtrycz (60) on Thursday July 28 2016, @07:51PM (#381299)

                    The obvious response to each end every one is:

                    To each their own.

                    Except Slashdot. Obviously :3

                    --
                    In capitalist America, ads view YOU!
          • (Score: 2, Informative) by Bronster on Friday July 29 2016, @02:52AM

            by Bronster (356) on Friday July 29 2016, @02:52AM (#381407) Homepage

            I'd be happy to correct you here, as others already have.

            InnoDB (the mysql backend engine that Uber are using) is fully ACID. Your continual harping on that incorrect data point already disqualifies your opinion.

            I've used both engines, but only mysql with replication. Cross-datacentre traffic is a real issue, even at much lower scale than Uber. We ship compressed SQL statements at FastMail, and every time I've considered switching to Postgres I would have built or required the same replication strategy - both for protection against replicating corruption, and for efficiency. So far we haven't found enough advantages to Postgres to justify the switching cost.

            We've been running mysql for over 10 years with no dataloss or crash that can be attributed to mysql. We've upgraded multiple versions with no downtime due to replication being forwards compatible. That's pretty amazing really. You can't just "shut down" something like Uber while you perform a database upgrade.

            Our time tracking system when we were part of Opera had to get upgraded. I'm pretty sure it was Oracle under the hood. They took a _1 week_ maintenance window to upgrade it. Can you imagine taking Uber offline for a week? Insanity. Upgradability without downtime or massive performance hits isn't "nice to have", it's business critical. And being able to bring up replicas with the new code first and then pivot - yep, that's what we do too - it's how you keep the lights on while making major changes.

            In summary - your points against MySQL are FUD, pure and simple. You don't actually know anything about it and you're talking shit based on hearsay. Shame on you. I read the Uber article and I learned valuable things to check in any new replication system we might consider to replace the current MySQL that we're using, or any redesign work we might be doing to the Cyrus IMAP server where I am one of the authors of the replication system.

            • (Score: 3, Interesting) by Common Joe on Friday July 29 2016, @05:54AM

              by Common Joe (33) <common.joe.0101NO@SPAMgmail.com> on Friday July 29 2016, @05:54AM (#381441) Journal

              Allow me to 1) say thank you for teaching me something and 2) to correct you. It took some digging, but I know where our misunderstanding comes from.

              In summary - your points against MySQL are FUD, pure and simple. You don't actually know anything about it and you're talking shit based on hearsay. Shame on you.

              First of all, I was not spreading "Fear, Uncertainty, and Doubt". And you, as a MySQL fanboy, should understand MySQL's history because I am not talking shit nor hearsay. As a matter of fact, it is still entirely possible that MySQL is not ACID complaint with certain settings even today.

              With that said, you have taught me that MySQL can be ACID compliant. Let's get into specifics of our misunderstanding so that you may grow and correct others properly.

              First of all, I did a quick Google search "Is MySQL ACID Compliant? [google.de]" The answer it came back was no. (This is how bad MySQL's reputation is.) Looking at the answer again, it seems to be pulling an answer from 2001 which is quite unfair to MySQL.

              This article [ronaldbradford.com] seems to have the best answers out of anything I've found. Apparently, MySQL can have different database engines. The MyISAM engine is not ACID compliant while the InnoDB engine is ACID compliant. Version 5.5 came out in December 2010 [wikipedia.org] and was the first version to default to the InnoDB engine. So, by default, MySQL was not ACID compliant before December 2010. To further quote the article:

              But the damage to the ecosystem that uses MySQL, that is many thousands of open source projects, and the resources that work with MySQL has been done. Recently working on a MySQL 5.5 production system in 2016, the default engine was specifically defined in the configuration defined as MyISAM, and some (but not all tables) were defined using MyISAM.

              In a LAMP setup, who knew what you were getting? And that is where MySQL got a well deserved, bad reputation that lingers even to today. And one of a thousand reasons why the MySQL developers left to create a fork called MariaDB, which I understand was (is?) much better than MySQL. Interestingly enough, this guy [rdx.com] insisted that MySQL 5.5 was still not ACID compliant, although that was November 2010 before general release and about 5.5.6. (I don't know what engine he was using and I'm not going to dig into the particulars of his blog. I just thought it interesting.)

              This article [rackspace.com] specifically gets into the differences between MyISAM and InnoDB. What scares me is that, if I'm understanding correctly, specific tables can use one engine or the other. Holy shit. In my opinion, that is messed up and you're asking for a hell of a lot of trouble if you do anything like that.

              So, in short, you may correct me, but don't tell me I'm talking shit. I've been around the block with databases long enough to know to rightly give MySQL a wide berth. Hell, because MySQL is owned by Oracle, I would still rather use MariaDB over MySQL over that one fact alone. I mean, what can you say when Oracle doesn't care enough about MySQL to fix the reputation and confusion they've generated?

              I've used both engines, but only mysql with replication.

              You've used MyISAM? That one is not ACID compliant. I'm not saying it's the wrong choice. I just hope you understand the pros and cons.

              So far we haven't found enough advantages to Postgres to justify the switching cost.

              Fair enough. Changing databases is often expensive. And I will admit that MySQL had a lot better tools than PostgreSQL for many, many years -- especially when it came to replication.

              We've been running mysql for over 10 years with no dataloss or crash that can be attributed to mysql.

              Lucky you. Others weren't so lucky because they used the defaults from 10 years ago.

              We've upgraded multiple versions with no downtime due to replication being forwards compatible. That's pretty amazing really. You can't just "shut down" something like Uber while you perform a database upgrade.

              I'm no DBA so I don't understand what you did that would be different than PostgreSQL. MySQL upgrade documentation [mysql.com] says

              mysql_upgrade processes all tables in all databases, which might take a long time to complete. Each table is locked and therefore unavailable to other sessions while it is being processed. Check and repair operations can be time-consuming, particularly for large tables.

              So I don't know what you did different from PostgreSQL. I mean, frankly, if there is a single database running in production that needs to be up 24/7, something is wrong. With APIs and interfaces, I would imagine an application should be able to handle different versions of databases without taking out production if it were critical to keep it up and going.

              Our time tracking system when we were part of Opera had to get upgraded. I'm pretty sure it was Oracle under the hood. They took a _1 week_ maintenance window to upgrade it. Can you imagine taking Uber offline for a week? Insanity

              A time tracking system took a week to update? It sounds like there was a problem with the programmers who made the time tracking system. I worked on an Oracle system where the DBAs did the updates in stages for this reason. (The entire update took the entire weekend.) The application has since been updated to better handle very large tables better during the upgrade process.

              I don't know what makes MySQL special or faster. If you have a reason why it is faster on updates, I'd be interested to see it. Otherwise, I would be very careful that a future doesn't cost you dearly in time. You may be sitting on a time bomb.

      • (Score: 0) by Anonymous Coward on Thursday July 28 2016, @02:57PM

        by Anonymous Coward on Thursday July 28 2016, @02:57PM (#381212)

        12 indexes on 1 table shows a need for a different schema usually. With that many I would venture a guess and say it was a very wide table or badly designed indexes. It could be the redesign took care of that.

        I have seen a few systems like this in the past. It seems to be a slow moving SQL anti pattern. Basically some key table starts picking up columns. They get wider and wider. Then some perf issue shows up and the table picks up a new index.

        I have no 'skin' in the game on this one. I am a MSSQL guy. But I have seen the same sorts of decisions made before. Pretty much every time it is 'while we are doing this lets fix...'.

        With a table like that it probably needed a star topology. It really just depends on do you want to tune for reads or writes. In this case they tuned for reads. But writes were a bigger issue so they needed to tune for that. Another way to fix this would be to have a 2 stage commit where you blast the data into the DB then have another service come by and fix it up. But at that point you should start looking into a NoSQL sort of solution.

        • (Score: 2, Funny) by Anonymous Coward on Thursday July 28 2016, @04:04PM

          by Anonymous Coward on Thursday July 28 2016, @04:04PM (#381230)

          12 indexes on 1 table shows a need for a different schema usually.

          Maybe you didn't read TFS carefully, but those 12 indexes are part of an Über schema ;-)

        • (Score: 0) by Anonymous Coward on Thursday July 28 2016, @08:05PM

          by Anonymous Coward on Thursday July 28 2016, @08:05PM (#381304)

          I am a MSSQL guy

          i love it when microsoft whores act like their "choice" is legitimate. stupid @#%$

          • (Score: 0) by Anonymous Coward on Thursday July 28 2016, @09:23PM

            by Anonymous Coward on Thursday July 28 2016, @09:23PM (#381332)

            Dont know its done alright by me. By my last estimate I would say ~1.5 million in earned income.

            I did not disparage mysql or Postgres. MSSQL is a very competent SQL implementation. When the boss (who is cutting the checks) says we are using MSSQL you figure it out....

            I have used just about every SQL out there. I am most proficient in MSSQL. You wooshed my point to make fun of me. Sad. I will restate it simply for your small troll mind. My point was the redesign probably did more good than switching databases.

    • (Score: 4, Interesting) by VLM on Thursday July 28 2016, @02:23PM

      by VLM (445) on Thursday July 28 2016, @02:23PM (#381199)

      The "real" difference between mysql and postgresql is merely philosophical. With about 95% accuracy you can predict the design and architectural and implementation differences using philosophy. Seriously.

      Almost always mysql is cub scout "try your best". If thats not perfect, well, at least we tried our reasonable best. At least 99% of our users are really happy and if you're in that 1% well really sorry, but at least we're fast.

      Almost always postgresql is perfectionist or bust. If thats not perfect, crash out with error and return failure codes and stuff. We might even give you an error message that is not misleading and written in Klingon, or maybe not, and we might be slower than molasses in January but at least we're perfect.

      Needless to say one outlook on life is enormously faster, another is more "trustworthy" by accuracy metrics (although much worse by uptime and ship date metrics)

      So the story is their app went from monolithic where you gotta choose one or the other whichever fits the best, however poorly, to microservices where you can always pick the "best" tool for the philosophical job. Unsurprisingly that lead to a mix. I'll take a wild guess that something like marketing survey stuff uses lightning fast mysql and something reliability oriented like financial charging register probably uses postgresql.

      I've tried stuff like this at work (not uber) and its an ungodly pain in the ass that somehow Great Cthulhu's Flipper magically inspires non-technical users to demand a business critical report like five minutes after I implement the separate servers that could have been done in a minute with a simple JOIN but now they're separate servers and separate architectures its gonna be a survivable PITA... Like most problems there are multiple solutions and after much heavy drinking I have played with things like a mysql "sorta mirror" of the postgres server accessed from the real mysql server as a federated engine. Mostly because I'd never done anything interesting with federated engines on mysql and after using it I'm pretty comfortable never peering into that abyss again. Although it was weirdly fun.

      Something interesting to point out about "slow" and "fast" from an old timer, is unless you're the worlds biggest whatever, everything is fast on commodity hardware. Also everything thats "slow" today on commodity hardware is "fast" on commodity hardware in about five years, so unless you're in a hurry it don't matter. So murphy's law naturally is if you can't find your business process bottlenecks, just use a slow DB or whatever and the users and trouble tickets will isolate it for you "for free".

  • (Score: 5, Insightful) by stormreaver on Thursday July 28 2016, @02:02PM

    by stormreaver (5101) on Thursday July 28 2016, @02:02PM (#381187)

    I stopped reading at this point:

    The bug we ran into only affected certain releases of Postgres 9.2 and has been fixed for a long time now. However, we still find it worrisome that this class of bug can happen at all. A new version of Postgres could be released at any time that has a bug of this nature, and because of the way replication works, this issue has the potential to spread into all of the databases in a replication hierarchy.

    This is where the author lost all credibility for impartiality. He is clearly a MySQL fan, and is taking the long road to pretending to not be so. The reasons I make this conclusion are two-fold:

    1) Bugs appear everywhere, even on usually rock-solid platforms such as PostgreSQL. Re-architecting your entire system due to a long-since-fixed bug is absurd.

    2) MySQL is the king of data-destroying bugs (and data-destroying mis-features), so it is disingenuous at best to use the possibility of data corruption as a reason for using MySQL over PostgreSQL. With MySQL, you will corrupt your database. It's just a matter of time. And in the case of MySQL's data-destroying mis-features, there will be no way to fix it retroactively.

    • (Score: 2) by edIII on Thursday July 28 2016, @05:55PM

      by edIII (791) on Thursday July 28 2016, @05:55PM (#381268)

      With MySQL, you will corrupt your database. It's just a matter of time. And in the case of MySQL's data-destroying mis-features, there will be no way to fix it retroactively.

      Was laughing my ass off as well. MySQL started out as a free toy database, quickly designed with, and poorly integrated into many open source projects. In its early days, you would hear horror stories for MySQL murdering databases and destroying projects . Meanwhile, I was playing around with Interbase and Firebird... you know... with ACID :) I never actually lost a Firebird database in over 10 years, before eventually switching to PostgreSQL.

      MySQL has got a lot better in the last 10 years, but that doesn't erase it's rocky and spotty history with accuracy and data integrity. It was Hitler complaining about how many people were killed last year by rogue lawnmowers.

      On occasion I'll work with MySQL, but I really don't like how it handles stored procedures and code. Not to mention views, queries, etc. I can get in there and perform maintenance, and tweaks, but I'll never design an entire platform with it on purpose. I've spent the time deciding between MySQL and PostgreSQL as both work with Asterisk 13. Going with PostgreSQL, in particular, for its excellent support on OpenBSD, especially with Perl scripts built-in to the database. Yeah, I want something where the project developers actually love, understand, and demand ACID.

      --
      Technically, lunchtime is at any moment. It's just a wave function.
      • (Score: 2) by Thexalon on Thursday July 28 2016, @06:01PM

        by Thexalon (636) Subscriber Badge on Thursday July 28 2016, @06:01PM (#381270)

        MySQL has got a lot better in the last 10 years, but that doesn't erase it's rocky and spotty history

        Which is one reason it makes a perfect pairing with PHP, which has the exact same problem! And in both cases, one of the problems is that they have to deal with legacy code the relies on the stupid decisions they made years ago before they knew better.

        Also, apropos of nothing else, I agree that the Linux Standard Database, or "LSD" for short, ought to be ACID-compliant.

        --
        The only thing that stops a bad guy with a compiler is a good guy with a compiler.
    • (Score: 1) by Bronster on Friday July 29 2016, @02:56AM

      by Bronster (356) on Friday July 29 2016, @02:56AM (#381408) Homepage

      Wow. I didn't read that at all.

      I run Cyrus IMAP with its application level replication protocol rather than DRBD under the hood because I know that I can screw up and break the internals in various ways - and I'd rather be shipping descriptive changes with protocol level invariant checking rather than shipping raw disk blocks and replicating any corruption without being able to apply additional sanity checks at the replication layer.

      (I've also run MySQL in production, purely InnoDB, for over 10 years without data being destroyed, so I'm amused at your FUD about how it will corrupt your database)

      • (Score: 2) by stormreaver on Friday July 29 2016, @01:05PM

        by stormreaver (5101) on Friday July 29 2016, @01:05PM (#381510)

        I've also run MySQL in production, purely InnoDB, for over 10 years without data being destroyed, so I'm amused at your FUD about how it will corrupt your database

        I was being a bit hyperbolic about MySQL's data corruption, but the basic idea is true. MySQL changes your data to fit its constraints, rather than telling you that you passed bad data (which is once of its frequent data corrupting mis-features). And MySQL has a long and steady history of spontaneous database corruption after the fact, for no apparent reason other than bad internals.

        I've watched MySQL self-destruct enough times to no longer even consider it an option for anything even remotely important.

  • (Score: 4, Interesting) by mtrycz on Thursday July 28 2016, @08:47PM

    by mtrycz (60) on Thursday July 28 2016, @08:47PM (#381318)

    Found thins on the Hacker News discussion, thought it might be useful.

    Uber's 2013 MySQL to PostgreSQL migration

    https://www.yumpu.com/en/document/view/53683323/migrating-uber-from-mysql-to-postgresql [yumpu.com]

    --
    In capitalist America, ads view YOU!
    • (Score: 3, Touché) by NCommander on Friday July 29 2016, @01:01AM

      by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Friday July 29 2016, @01:01AM (#381385) Homepage Journal

      Just reading that shows the argument on migrating to MySQL for reasons of data corruption is complete bull, given they had plenty due to MySQLisms beforehand. I've wanted to gut the site and change out our database from MySQL cluster to PG for awhile but lack of time and energy has prevented it.

      --
      Still always moving
      • (Score: 2) by TheLink on Friday July 29 2016, @07:18PM

        by TheLink (332) on Friday July 29 2016, @07:18PM (#381677) Journal

        What I find interesting was they couldn't upgrade from old Postgresql to new Postgresql but they could migrate from old Postgresql to Schemaless+MySQL/MySQL.

        Is it because Schemaless/MySQL was a lot more hardware-efficient? e.g. migrating from 1000 servers to 100 new servers is a lot easier on the budget than migrating from 1000 servers to another 1000 servers (assuming newer versions of Postgresql still have the same performance problems for their use-case - which seems likely).

        See also: http://planet.mysql.com/entry/?id=5997307 [mysql.com]