Claude Code deletes developer's production setup, including its database and snapshots — 2.5 years of records were nuked in an instant
Story has a happy ending of sorts, but should serve as a cautionary tale.
Everyone loves a good story about agent bots gone wrong, and those often come with a bit of schadenfreude towards our virtual companions. Sometimes, though, the errors can be attributed to improper supervision, as was the case with Alexey Grigorev, who was brave enough to detail how he got Claude Code to wipe years' worth of records on a website, including the recovery snapshots.
The story begins when Grigorev wanted to move his website, AI Shipping Labs, to AWS and have it share the same infrastructure as DataTalks.Club. Claude itself advised against that option, but Grigorev decided it wasn't worth the hassle or cost of maintaining two separate setups.
Grigorev uses Terraform, an infrastructure management utility that can create (or destroy) entire setups, including networks, load balancers, databases, and, naturally, the servers themselves. He had Claude run a Terraform plan to set up the new website, but forgot to upload a vital state file that contains a full description of the setup as it exists at any moment in time.
[Source]: Tom's Hardware
Have any of you been in a similar situation? And if so, how did you recover your data?
(Score: 4, Insightful) by DiarrhoeaChaChaCha on Monday March 09, @11:38AM (2 children)
No, it can always be attributed to improper supervision. LLMs don't do things unprompted.
If they delete code bases, websites, databases, etc., they do so because someone gave them the access to do this in the first place.
Knowing that LLMs can do stupid shit makes these issues 100% user error.
(Score: 0) by Anonymous Coward on Monday March 09, @12:11PM
Would you like them to?
(Score: -1, Redundant) by Anonymous Coward on Monday March 09, @10:55PM
AKA Officer Barney Fife
(Score: 4, Insightful) by BsAtHome on Monday March 09, @01:07PM
Yeah, a clueless guy gave the keys to his computer to a program with a terrible track record and which he doesn't know, doesn't understand, doesn't control and doesn't own. That will make the world a better place and make him a meeeelionair, right.
Stupid see stupid does...
Sigh.
(Score: 4, Insightful) by JoeMerchant on Monday March 09, @01:38PM (9 children)
In the 90s I used to lose work. Backups went to floppy diskette stored in a fireproof box in a cabinet, they took significant time and effort, so... I didn't do them too often.
Once, it had been about a week since my last backup and a hardware fault in the hard drive ate that week of work. Of course, that was a week of exploratory experimentation and recreating it only took about a day and a half, because I had already charted the dead-ends and knew to avoid them the next time around, but still... a day and a half is a pretty significant loss (although, daily backups would probably cost a day's worth of productivity across a year, so... pick your poison.)
Through the years, backups got cheaper, tools got better, I lost less and less work at-a-shot. Lately, the most I tend to lose is when I'm typing in a web-edit form and it wipes out whatever I typed before I hit Submit or similar.
Enter Claude.
Within a month of working with Claude, Claude had built up a day's worth of work (and a day's worth of Claude output looks like about six months of what I used to produce in the 90s). Somehow - trusting Claude to write its own commit messages and commit the proper files to git - Claude managed to screw it up and nuke the whole thing. Nothing left: not in git, not in git stash, not in the Recycle Bin, nowhere. Just gone. Impressive. My fault for trusting it to do the commit work, and also a bit my fault for asking it to do a complex commit process, but that's the kind of thing these LLM models excel at: yeah, there's a fancy command where you can commit just a part of the changes in the working tree, but who can remember all that? I just commit everything in the tree, every time, and I don't think that in 15+ years of working with git I ever screwed up so spectacularly as Claude managed to, in terms of total annihilation of a significant chunk of work.
Of course, that day of work with Claude also explored a lot of dead-ends, we reproduced the valuable parts of the lost work within about 90 minutes.
New working style:
For every LLM prompt which results in a successful passing of the associated unit tests: commit with a brief description of the net changes and a copy of the prompt history that led to those changes. It works well - as intended about 80% of the time - and no catastrophic losses, lately.
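That loop is easy to automate. Here's a hypothetical Python sketch of it - the function name, the pytest runner, and the message format are my own assumptions, not the poster's actual tooling:

```python
import subprocess

def commit_if_green(prompt, summary, test_cmd=("pytest", "-q"), run=subprocess.run):
    """If the test suite passes, commit the whole working tree with the
    LLM prompt recorded in the commit message body; otherwise do nothing."""
    if run(list(test_cmd)).returncode != 0:
        return False                    # red tests: leave the tree alone
    run(["git", "add", "-A"], check=True)
    # two -m flags produce a subject line plus a body paragraph
    run(["git", "commit", "-m", summary, "-m", "Prompt history:\n" + prompt],
        check=True)
    return True
```

The `run` parameter is injectable so the logic can be exercised without a live repo; in practice you'd just call `commit_if_green(prompt, "summary of net changes")` after each LLM round-trip.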
🌻🌻🌻🌻 [google.com]
(Score: 2) by DannyB on Monday March 09, @07:36PM (8 children)
Ah, memories of the 90's.
In 1982, I started as the 4th person in the company, and 2nd programmer. (That's what it was called in that millennium.)
After you lose things a few times, you figure out better ways.
By the 90's we had a nice central file server on our network. It was nightly backed up to tape. A person, usually me, rotated the tape to a fireproof box within a larger fireproof box. Another person (the company owner) rotated a tape weekly to/from the bank safety deposit box. There was a schedule of which tape was where, in a spreadsheet, printed out, created by the owner. That way it was easy to tell where any given tape should be and the date of its most recent backup.
We felt pretty safe. If you wanted something backed up, you put it in your own personal folder on the file server. The file server had a "People" folder with a sub folder for each person, and you could store whatever you wanted to have a backup locally on the network, and before too long in the bank safe deposit box.
Stupid people exist because nothing in the food chain eats them anymore.
(Score: 2) by JoeMerchant on Monday March 09, @08:15PM (7 children)
Did you ever actually restore things from that tape?
We tried using tape as a transport medium, sending "big data" from 5 sites around the country to a central site. It worked about 95% of the time. After their third failure, they moved to triple backups - three tapes per transport, two mailed, one retained on site. They still had one triple failure after that. I think they probably did about 1500 transfers in all, and only lost the 3 during the first 100 or so transfers, plus one later.
As rare as "hard drive crashed, need backup tape to restore" was, tape companies got away with those error rates, and apparently it was pretty similar across the industry. If you were moving 10GB on a 10GB tape, you were lucky to get 19 out of 20 fully restored intact.
🌻🌻🌻🌻 [google.com]
(Score: 1) by pTamok on Monday March 09, @10:51PM (3 children)
This is why someone invented Tornado codes [wikipedia.org], and subsequently 'fountain codes': similar problem, but with downloading set-top box updates over the (satellite) air.
You do not want a firmware update to fail because a block late in the delivery sequence fails, requiring a re-transmission of the whole thing from the start. So you put sufficient redundancy into the data that you can join the download at any point and get the complete dataset after receiving only very slightly more data than a non-fountain-encoded dataset would take. It seems a bit like magic to those of us who prayed the modem would stay up long enough to download some much-wanted files.
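The peeling idea behind LT/fountain codes can be sketched in a few lines of Python. This is strictly a toy: it uses a uniform degree distribution (real LT codes use a robust soliton distribution) and models blocks as small integers rather than byte strings:

```python
import random

def encode(blocks, n_symbols, seed=0):
    """Emit n_symbols encoded symbols, each the XOR of a random subset
    of the source blocks (blocks modeled as plain integers)."""
    rng = random.Random(seed)
    k = len(blocks)
    out = []
    for _ in range(n_symbols):
        degree = rng.randint(1, k)        # toy uniform degree, not a soliton
        idxs = set(rng.sample(range(k), degree))
        val = 0
        for i in idxs:
            val ^= blocks[i]
        out.append((idxs, val))
    return out

def decode(symbols, k):
    """Peeling decoder: find a symbol covering exactly one unknown block,
    XOR out its known blocks to recover it, repeat until done or stuck."""
    known = {}
    progress = True
    while len(known) < k and progress:
        progress = False
        for idxs, val in symbols:
            unknown = idxs - known.keys()
            if len(unknown) == 1:
                (i,) = unknown
                for j in idxs & known.keys():
                    val ^= known[j]
                known[i] = val
                progress = True
    return [known[i] for i in range(k)] if len(known) == k else None
```

The receiver never needs any particular symbol; any sufficiently large stream of them lets the peeling process complete, which is exactly the "join at any point" property.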
(Score: 3, Insightful) by JoeMerchant on Tuesday March 10, @12:59AM (2 children)
It's a good scheme, but this was 1992-3 and we were using COTS gear, nobody was that sophisticated and I bet the 386 with 640K of memory would have struggled to implement a Tornado encode / decode of 10 GB...
🌻🌻🌻🌻 [google.com]
(Score: 2, Insightful) by pTamok on Wednesday March 11, @05:50PM
Heh. I started using XMODEM [wikipedia.org] a loooong time ago, and manually reformatting and concatenating Usenet postings so I could get a uudecode facility [wikipedia.org] to accept them: developments since then have made things so easy. My old self is insanely envious. Having a Gigabit network to my home, and being able to saturate it with BitTorrent or a 'simple' https file transfer is just amazing. Times have changed.
(Score: 2) by DannyB on Wednesday March 11, @07:30PM
1993 and thereabout is the same time frame I am thinking of for the servers and tape backup system that I mentioned.
Stupid people exist because nothing in the food chain eats them anymore.
(Score: 2) by DannyB on Wednesday March 11, @07:28PM (2 children)
I do not recall ever trying to restore anything from the tape. That doesn't mean it didn't happen. But I don't remember such an incident.
The backup software would be able to retrieve individual files.
As I seem to recall, we would find reason to upgrade the server every two to three years. We were predominantly a classic Macintosh shop back then. We had Windows machines in order to build Windows versions of our cross-platform software.
Stupid people exist because nothing in the food chain eats them anymore.
(Score: 2) by JoeMerchant on Wednesday March 11, @07:56PM (1 child)
Over the decades I've seen more than one sad face after the backup system they invested so much in (money, time, effort) failed to restore their data when they actually needed it.
🌻🌻🌻🌻 [google.com]
(Score: 3, Insightful) by DannyB on Friday March 13, @04:18PM
I have heard those stories as well. That was a long time ago.
As I got older and heard the same stories as you, I came to believe that a backup system cannot be considered reliable unless you have proven that you can actually restore.
Stupid people exist because nothing in the food chain eats them anymore.
(Score: 5, Insightful) by ElizabethGreene on Monday March 09, @03:10PM (1 child)
"developers' production setup"
History doesn't repeat itself, but it likes to rhyme.
We spent a lot of time learning that developers should not have unsupervised production access, regardless of whether the dev is a 20-year veteran or an intern.
We're learning now that last bit should be "if the developer is a 20-year veteran, an intern, or a clever LLM".
(Score: 4, Interesting) by JoeMerchant on Monday March 09, @04:28PM
>We're learning now that last bit should be "if the developer is a 20-year veteran, an intern, or a clever LLM".
I think "we" knew that before LLMs even started to look clever.
What we know and what we do in practice don't always match, or even rhyme.
🌻🌻🌻🌻 [google.com]
(Score: 2) by bart on Monday March 09, @09:01PM
I've never had a problem with restoring the state file when it's missing. I've had to do this often enough. It's a non-issue.
(Score: 3, Informative) by jb on Wednesday March 11, @07:42AM (5 children)
Repeat after me: if it resides on the same network as production, or is in any way automatically accessible from production, it is not a backup!
Some things never change...
(Score: 2) by DannyB on Wednesday March 11, @07:34PM (2 children)
What if production has an APPEND-ONLY access to some sort of data store?
I'm talking hypothetically of course.
For example, Production would produce a zip backup (or tar.gz or similar) then send that file to a backup storage server. Of course the production server could also keep the last N dated copies of those backup files.
That is the principle of what I mean. Append only backups from production. Just like a log that has been routed to a line printer is definitely append-only and isn't affected by EMP or electrical storms.
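The producer side of that principle fits in a few lines. A hypothetical Python sketch (the naming scheme and last-N retention are my own assumptions; the remote backup store would then accept uploads but never deletes):

```python
import pathlib
import tarfile
import time

def rotate_backup(src_dir, dest_dir, keep=7, stamp=None):
    """Write a dated tar.gz of src_dir into dest_dir, then prune all but
    the newest `keep` archives. Nothing here can touch src_dir itself."""
    dest = pathlib.Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src_dir, arcname=pathlib.Path(src_dir).name)
    # timestamped names sort chronologically, so pruning is a slice
    for old in sorted(dest.glob("backup-*.tar.gz"))[:-keep]:
        old.unlink()
    return archive
```

Shipping the returned archive to the append-only store is then a plain upload; the store's only job is to refuse overwrites and deletes.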
Stupid people exist because nothing in the food chain eats them anymore.
(Score: 2) by jb on Thursday March 12, @07:17AM (1 child)
Okay, you got me. That's a valid nit to pick. Just s/access/write access/ then the statement holds true again.
Your idea can and has been implemented with a variety of different WORM media.
My favourite example (from a very long time ago) arose from a need to protect log file integrity. The simplest solution was to stream all log output to a line printer. Worked very well, until someone figured out how much money was being spent on paper...
(Score: 2) by DannyB on Friday March 13, @04:22PM
Paper is cheap. On balance, whether it is worth it depends on how valuable the logs are.
Stupid people exist because nothing in the food chain eats them anymore.
(Score: 2) by DannyB on Wednesday March 11, @07:37PM (1 child)
One more thing . . .
If something is good and works perfectly, then it is guaranteed that someone is going to change it.
If it is broken and works poorly, nobody will bother.
That would seem to suggest a strategy of how to minimize changes.
Stupid people exist because nothing in the food chain eats them anymore.
(Score: 2) by jb on Thursday March 12, @07:19AM
Indeed. Pretty sure there's a "law" to that effect, though I can't remember whose law it was.
And that's the strategy most of the big vendors follow ... which is why the garbage they ship should be avoided like the plague.