Ruby Paulson at BlogVault reports
GitLab, the online tech hub, is facing issues as a result of an accidental database deletion that happened in the wee hours of last night. A tired, frustrated system administrator thought that deleting a database would solve the lag-related issues that had cropped up... only to discover too late that he'd executed the command for the wrong database.
[...] It's certainly freaky that all five of the backup solutions GitLab had were ineffective, but this incident demonstrates that a number of things can go wrong with backups. The real aim of any backup solution is to be able to restore data with ease... but simple oversights can render backup solutions useless.
Computer Business Review adds
The data loss took place when a system administrator accidentally deleted a directory on the wrong server during a database replication process. A folder containing 300GB of live production data was completely wiped.
[...] The last potentially useful backup was taken six hours before the issue occurred.
However, this was of limited help: snapshots were normally taken only every 24 hours, and the deletion occurred six hours after the previous snapshot, so roughly six hours of data were lost.
David Mytton, founder and CEO of Server Density, said: "This unfortunate incident at GitLab highlights the urgent need for businesses to review and refresh their backup and incident handling processes to ensure data loss is recoverable, and teams know how to handle the procedure."
GitLab has been updating a Google Doc with info on the ongoing incident.
Additional coverage at:
TechCrunch
The Register
(Score: 4, Informative) by The Mighty Buzzard on Friday February 03 2017, @11:39AM
If we somehow managed to destroy our production database we would be in exactly the same position being as we don't back up any more often than they did. Maybe we should look at changing that for the production server at least. Then again, we're talking comments rather than code. What do you lot think?
My rights don't end where your fear begins.
(Score: 2) by pkrasimirov on Friday February 03 2017, @03:05PM
It's called a Disaster recovery plan [wikipedia.org]. Also, one easy prevention measure is to ... change the terminal PS1 format/colours to make it clear whether you're using production or staging (red for production, yellow for staging). [gitlab.com]
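The PS1 trick could be sketched like this, a minimal ~/.bashrc fragment, assuming a "prod-*" / "staging-*" hostname convention (the hostnames and colour choices here are illustrative, not GitLab's actual setup):

```shell
# Pick an ANSI background colour code based on the hostname, so the prompt
# itself warns you which environment you're in.
env_color() {
  case "$1" in
    prod-*)    echo 41 ;;   # red background: production, tread carefully
    staging-*) echo 43 ;;   # yellow background: staging
    *)         echo 0  ;;   # default colours everywhere else
  esac
}

# Embed the colour in the prompt; \[ and \] keep bash's cursor accounting right.
PS1="\[\e[$(env_color "$(hostname)")m\]\u@\h\[\e[0m\]:\w\$ "
```

The colour survives copy-paste of the prompt line, which is exactly when you want the reminder of where that command actually ran.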
(Score: 2) by The Mighty Buzzard on Friday February 03 2017, @03:35PM
Eh, we do already have plans in place as well as daily local and offsite db dumps going back a good ways. I was asking more for specific ideas like the PS1 setting but more for the backups end of things. I mean we got a lot of smart folks here, might as well use them.
My rights don't end where your fear begins.
(Score: 0) by Anonymous Coward on Friday February 03 2017, @04:24PM
IIRC, you use MySQL. In that case, turn on https://dev.mysql.com/doc/refman/5.7/en/binary-log.html [mysql.com] with a basename that points to a different directory. The logs will rotate automatically when they hit a certain size or the daemon is restarted. You can then restore, if necessary, from the backup dump to get you close, and then replay the binary logs in concatenated form. A tip, though: before blindly replaying the transaction log, you should change it to human-readable form and edit out the unneeded stuff. Also, don't forget to do a FULL vacuum, analyse and reindex of the databases as well.
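For reference, enabling the binary log as described might look like this in my.cnf; the paths, sizes, and retention below are placeholder values, not recommendations from the linked manual page:

```ini
# Hypothetical my.cnf fragment: binlogs under a basename on a separate
# directory/volume so they survive mishaps on the data volume.
[mysqld]
server-id        = 1                                # required with log-bin in 5.7
log-bin          = /var/log/mysql-binlog/mysql-bin  # basename in another directory
max_binlog_size  = 100M                             # rotate when a log hits this size
expire_logs_days = 7                                # prune old binlogs automatically
```

After a restore from the dump, the binlogs covering the gap can be converted and replayed with the mysqlbinlog utility, which is also where the "edit out the unneeded stuff" step happens.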
(Score: 2) by pkrasimirov on Friday February 03 2017, @04:50PM
Good link, thanks. That "binary log" is essentially a journal. But it should be usable for quick replay of changes on top of last backup, otherwise it is not of much value. Restore (as well as backup) should be fully automated bullet-proof task exactly for the reason in the story: sometimes humans err and stress does not help. Also mind all that would be during outage, hardly a good time to "change it to human-readable form and edit out the unneeded stuff. [...] do a FULL vacuum, analyse and reindex of the databases".
(Score: 0) by Anonymous Coward on Friday February 03 2017, @08:34PM
Forgot to mention: https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html#option_mysqldump_flush-logs [mysql.com] can be helpful to cut down on redundant replays.
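Concretely, a nightly dump that flushes the logs might be scheduled like this (a crontab sketch; the schedule, paths, and the options other than --flush-logs are assumptions):

```
# Hypothetical crontab entry: nightly 03:00 dump to a backup volume.
0 3 * * * mysqldump --all-databases --single-transaction --flush-logs | gzip > /backup/nightly.sql.gz
```

Because --flush-logs closes the current binary log and starts a new one at dump time, point-in-time replay after restoring the dump can begin with the first binlog created after the dump, instead of replaying transactions the dump already contains.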
(Score: 0) by Anonymous Coward on Friday February 03 2017, @03:43PM
I'm not an expert, but wouldn't best practice be to keep a log file with all SQL transactions where delete commands can be manually undone?
(Score: 2) by pkrasimirov on Friday February 03 2017, @04:22PM
It was rm -Rvf, not DELETE FROM.
(Score: 1) by hopp on Friday February 03 2017, @05:38PM
Loss is a part of life. You can't take it with you, data included.
Take reasonable steps to protect the data knowing that guaranteed complete recovery is improbable or prohibitively expensive.