Microsoft AI team accidentally leaks 38TB of private company data:
AI researchers at Microsoft have made a huge mistake.
According to a new report from cloud security company Wiz, the Microsoft AI research team accidentally leaked 38TB of the company's private data.
38 terabytes. That's a lot of data.
The exposed data included full backups of two employees' computers. These backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and more than 30,000 internal Microsoft Teams messages from more than 350 Microsoft employees.
So, how did this happen? The report explains that Microsoft's AI team uploaded a bucket of training data containing open-source code and AI models for image recognition. Users who came across the GitHub repository were provided with a link from Azure, Microsoft's cloud storage service, in order to download the models.
One problem: The link that was provided by Microsoft's AI team gave visitors complete access to the entire Azure storage account. And not only could visitors view everything in the account, they could upload, overwrite, or delete files as well.
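To make the failure mode concrete, here is a minimal sketch of how a share link could be checked before it is published. It assumes the link is a URL whose query string carries shared-access-style fields: `sp` for permission letters, `srt` for account-level scope, and `se` for the expiry time. The summary above does not describe the link's format, so those field names, the `audit_share_link` helper, and the example URLs are all illustrative assumptions rather than what Microsoft's team actually used.

```python
from datetime import datetime, timezone
from typing import List
from urllib.parse import parse_qs, urlparse

# Permission letters assumed to appear in the `sp` field (hypothetical mapping;
# check the storage provider's documentation before relying on it).
WRITE_LIKE = {"w": "write", "d": "delete", "a": "add", "c": "create"}


def audit_share_link(url: str, now: datetime, max_days: int = 7) -> List[str]:
    """Return human-readable warnings for a storage share link."""
    query = parse_qs(urlparse(url).query)
    warnings = []

    # Anything beyond read/list lets visitors change or remove files.
    perms = query.get("sp", [""])[0]
    risky = [name for letter, name in WRITE_LIKE.items() if letter in perms]
    if risky:
        warnings.append("link allows: " + ", ".join(risky))

    # Assumption: an `srt` field marks a token scoped to the whole account.
    if "srt" in query:
        warnings.append("link is scoped to the whole account")

    # A missing or far-future expiry means the link stays live for a long time.
    expiry = query.get("se", [None])[0]
    if expiry is None:
        warnings.append("no expiry found")
    else:
        expires = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)  # treat naive times as UTC
        if (expires - now).days > max_days:
            warnings.append(f"expires in more than {max_days} days")
    return warnings


if __name__ == "__main__":
    # Hypothetical over-permissive link, not the one from the report.
    link = ("https://storage.example.com/models"
            "?srt=sco&sp=rwdlac&se=2050-01-01T00:00:00Z&sig=REDACTED")
    for warning in audit_share_link(link, datetime.now(timezone.utc)):
        print(warning)
```

A read-only link limited to a single container and expiring within a few days would produce no warnings under these assumptions. The link described in the report would presumably have tripped at least the first two checks.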
[martyb ed. update: My first hard disk drive was a Seagate ST-231. It could store so much data that I had to partition it into two "devices" under Microsoft DOS 3.2: 32MB and 8MB. It was so large that I thought that nobody would be able to use all that disk space! Over time, my newer drives have held: 80MB, 200MB, and 1TB. My current PC has a 2TB drive... and that is relatively "small" by today's standards. Microsoft lost 38TB?!]
How large were your drives over time?
Related Stories
Microsoft is working with media startup Semafor to use its artificial intelligence chatbot to help develop news stories—part of a journalistic outreach that comes as the tech giant faces a multibillion-dollar lawsuit from the New York Times.
As part of the agreement, Microsoft is paying an undisclosed sum of money to Semafor to sponsor a breaking news feed called "Signals." The companies would not share financial details, but the amount of money is "substantial" to Semafor's business, said a person familiar with the matter.
[...] The partnerships come as media companies have become increasingly concerned over generative AI and its potential threat to their businesses. News publishers are grappling with how to use AI to improve their work and stay ahead of technology, while also fearing that they could lose traffic, and therefore revenue, to AI chatbots—which can churn out humanlike text and information in seconds.
The New York Times in December filed a lawsuit against Microsoft and OpenAI, alleging the tech companies have taken a "free ride" on millions of its articles to build their artificial intelligence chatbots, and seeking billions of dollars in damages.
[...] Semafor, which is free to read, is funded by wealthy individuals, including 3G Capital founder Jorge Paulo Lemann and KKR co-founder Henry Kravis. The company made more than $10 million in revenue in 2023 and has more than 500,000 subscriptions to its free newsletters. Justin Smith said Semafor was "very close to a profit" in the fourth quarter of 2023.
Related stories on SoylentNews:
AI Threatens to Crush News Organizations. Lawmakers Signal Change Is Ahead - 20240112
New York Times Sues Microsoft, ChatGPT Maker OpenAI Over Copyright Infringement - 20231228
Microsoft Shamelessly Pumping Internet Full of Garbage AI-Generated "News" Articles - 20231104
Google, DOJ Still Blocking Public Access to Monopoly Trial Docs, NYT Says - 20231020
After ChatGPT Disruption, Stack Overflow Lays Off 28 Percent of Staff - 20231017
Security Risks Of Windows Copilot Are Unknowable - 20231011
Microsoft AI Team Accidentally Leaks 38TB of Private Company Data - 20230923
Microsoft Pulls AI-Generated Article Recommending Ottawa Food Bank to Tourists - 20230820
A Jargon-Free Explanation of How AI Large Language Models Work - 20230805
The Godfather of AI Leaves Google Amid Ethical Concerns - 20230502
The AI Doomers' Playbook - 20230418
Ads Are Coming for the Bing AI Chatbot, as They Come for All Microsoft Products - 20230404
Deepfakes, Synthetic Media: How Digital Propaganda Undermines Trust - 20230319
(Score: 3, Insightful) by looorg on Sunday September 24 2023, @02:05PM
They sort of left that line down towards the end. So there wasn't a one time accidental leak of 38TB? It's been trickling out for about three years. Without anyone apparently noticing or paying it much attention. Someone probably has, just not Microsoft.
(Score: 5, Funny) by kazzie on Sunday September 24 2023, @02:13PM (4 children)
If a cloud has a leak, is it raining?
(Score: 2) by BsAtHome on Sunday September 24 2023, @02:15PM (1 child)
No, it evaporates.
(Score: 2) by RS3 on Monday September 25 2023, @08:30PM
Like virga [metoffice.gov.uk].
(Score: 3, Funny) by crm114 on Sunday September 24 2023, @03:09PM (1 child)
I think this YouTube video explains the process well.
https://www.youtube.com/watch?v=ApQlMm39xr0 [youtube.com]
(Score: 0) by Anonymous Coward on Sunday September 24 2023, @10:31PM
That's funny. I'm surprised you haven't gotten downmodded for spreading "misinformation".
(Score: 4, Insightful) by Gaaark on Sunday September 24 2023, @02:27PM (2 children)
From the company that wants "Developers, developers, developers" but needs "Security, security, security"
--- Please remind me if I haven't been civil to you: I'm channeling MDC. ---Gaaark 2.0 ---
(Score: 3, Insightful) by Anonymous Coward on Sunday September 24 2023, @04:32PM
... but needs "competence, competence, competence"
FTFY
(Score: 2) by RS3 on Monday September 25 2023, @09:03PM
I'm one of the biggest MS detractors I know, but I blame the market more. It's the non-technical dolts, who have power and control over companies and spending, who buy MS OSes. MS's marketing knows this and they cater to the dolts' whims, features, gadgets, bells, and whistles. Sure, many features save time and steps, and increase productivity. But so does getting rid of locks, keys, and passwords.
The problem is a bit of a cat and mouse game. Company buyer gets all giddy over features and buys MS stuff. MS, like all companies, feels the need to satisfy customer, get product out as quickly as possible, bypass security, quality control, and testing, worry about that stuff later. Customer meanwhile assumes, if they even think about it, that MS has done good job of quality and security. Of course later security holes are revealed, and customer blames MS (as we all do).
But now customer is entrenched in MS, and moving to Linux (or anything else) is too costly (remember the dolts make these decisions), so the vicious cycle continues.
(Score: 1) by Runaway1956 on Sunday September 24 2023, @03:40PM (6 children)
My first hard drive was in a 386 which I picked up at an estate sale. 40 MB. Prior to that, everything I ever saved, loaded, or purchased was on cassette tape, or 3 1/2" floppy drives, or a cartridge that slid into a game console. It was a long time before I worried about more space. Computer also had a CD reader in it, which prompted me to purchase some games on CD. There weren't a whole lot of those available at the time, mostly bootleg stuff at the flea markets.
“I have become friends with many school shooters” - Tampon Tim Walz
(Score: 2) by looorg on Sunday September 24 2023, @04:05PM (3 children)
How large was your drive over time? Turns out it was never large enough. You connected it, formatted it and looked in awe at the first one. How was I ever going to fill that. Yet as soon as you got it then it rapidly started to fill up. Doesn't matter if it was megabytes, gigabytes or terabytes. As soon as it's connected it starts to fill up. The more storage you have, the more data is collected.
The first one was an A590 20MB SCSI drive, the next one was a 120MB IBM, recycled from a machine at work, then it was a 500MB, or was it 600MB, drive from Conner, then they just became a blur of drives and machines. Today the desktop machine has 3TB (1+2) plus about 2TB more of drives in the machine that are not powered/connected. About 1/3 of it's free.
(Score: 2) by Gaaark on Sunday September 24 2023, @07:01PM (2 children)
My plex server has, if memory serves, 1x2TB, 1x3TB and 1x8TB. All are pretty well full (the 8TB has, I believe, only 2TB of space left).
It's like home life: the more storage space you have, the more crap you collect.
*Note to self: clear out some crap...
(Score: 2) by Unixnut on Sunday September 24 2023, @10:37PM (1 child)
Oooh, my first hard drive was 200MB, I don't remember the manufacturer, but I remember it saying "Made in Scotland", which surprised me because that was the only piece of computer technology I had seen (and seen since) that was made in the UK.
Fast forward now, my largest drives are 10TB, and I got 4 of them in a raid-z2 array. I bought them in 2018 as I got tired of running out of space every 2 years and having to re-buy four drives each time to grow the array. It did the job, as the 20TB of usable space + ZFS compression means I am not even at 50% utilisation 5 years in (currently 9.69T on disk, including zfs snapshots).
My storage consumption seems to have plateaued, I always thought I could just collect crap until I filled up anything, but at some point I think the overhead of managing, curating or even knowing what you have renders accumulating more data moot.
I've dug down into my server and found files I had not touched in 10+ years and that I forgot I even had. Hell I've dug down and found files I had never even seen before and had no idea how they got there (well I've got a hint of how they got there: back in my student days I gave all my then friends ssh access and public samba shares so they could back up their data to the server. A lot of stuff was accumulated over time).
I guess 10TB is roughly my limit at this juncture, and I've been trying to sort out what I have and clear out what is not needed anymore. Just finding the time alone to do it is hard.
(Score: 2) by Gaaark on Sunday September 24 2023, @10:45PM
I hear this, loud and clear.
(Score: 3, Interesting) by Gaaark on Sunday September 24 2023, @04:37PM
My Acorn Atom had, from memory, 8k of ram and that was it. Never did get that tape drive to work. :(
:)
(Score: 2) by Opportunist on Sunday September 24 2023, @04:49PM
Same. I later dropped quite a bit of money on a 100mb drive. That lasted for what felt like an eternity and never came even close to filling.
And then the first Gigabyte drive hit the market. Fuuuuck, a whole GIG of space. You'll NEVER EVER fill that! Not in a million years!
(Score: 0) by Anonymous Coward on Sunday September 24 2023, @04:03PM
The last iterations I have recollection of went from 15 GB to 400 GB to >60 TB.
(Score: 4, Informative) by Opportunist on Sunday September 24 2023, @04:46PM
cloud, n, English, fluffy looking puff of vapor
klaut, verb, German, homonym to cloud, imperative plural to "klauen", a command to a group to steal things.
(Score: 0) by Anonymous Coward on Sunday September 24 2023, @05:10PM
The breach was responsibly disclosed to Microsoft instead of spread around the Internet as it should be.
(Score: 2) by progo on Sunday September 24 2023, @06:01PM
When I was a teenager my dad helped me put together a desktop computer from parts from a computer store. A 386DX motherboard and CPU, a Gravis Ultrasound, some kind of super-VGA card probably, and 2 used but good "full-height" 150MB hard drives.
The amount of data I'm holding onto, and never got around to properly tagging, has been increasing steadily ever since to fill approximately one comfortably priced consumer storage device.
(Score: 3, Interesting) by SomeGuy on Sunday September 24 2023, @07:53PM (3 children)
"internal Microsoft Teams messages"
So in other words, nothing of any value was leaked.
Since everyone is going on about hard drives, my first hard drive was only 5mb. It was pulled from a surplussed Apple II Corvus system and then stuffed in an IBM PC clone. Got a little more space out of it by using an RLL controller.
(Score: 5, Interesting) by RS3 on Sunday September 24 2023, @11:04PM (2 children)
Very interesting that you got a 5 MB drive to format RLL. Was it reliable?
I have a 5 MB drive but never used it and don't remember where I got it. I was quite a tinkerer in the very late 80s into 90s. My first computer was an AT&T 6300 with 20MB Seagate ST225, that failed within a week or so. Got it replaced, but had NO idea what "low-level format" was or how to do it. Got help and got it done.
Fairly soon got a copy of SpinRite. It was a lifesaver. Back then you could low-level format drives, and SpinRite would correctly find and mark bad sectors. I never had any drive data losses or problems when I ran SpinRite every now and then. BTW, it never agreed with the "defect map" on the drives.
Long history shortened- I discovered RLL and that some drives formatted RLL reliably, but some not at all. I was a determined tinkerer in those days. I had a full-height 40MB MFM drive (I forget the make, maybe Maxtor) that wouldn't go RLL. Being an EE, interested in magnetic storage, circuits, etc., I reverse-engineered the head amplifier circuits, discovered that the filter cut off high frequencies too soon, changed a couple of parts (maybe just one filter cap), and got the darned thing to format RLL 100% reliably. I got a 50% faster drive with a full 60MB, and was super happy.
Around then I got lucky and got 3 Priam 330MB ESDI drives cheap at a flea market. 2 worked off the bat. Third had a dent but was probably okay. They were very fast and HUGE. IDE drives were coming out but were super slow. It was many years before IDE drives could keep up with the big Priam ESDI drives.
(Score: 2) by SomeGuy on Monday September 25 2023, @12:31AM (1 child)
Surprisingly, yes, it was very reliable with RLL. Tested A-OK with Spinrite - that is the best way to be sure RLL is working. It was an IMI full height drive.
I'm not certain any vendor ever actually sold 5mb hard drives for IBM PCs. It seemed like the minimum anything supported was 10mb. Fortunately, that controller let me use dynamic configuration.
(Score: 2) by RS3 on Monday September 25 2023, @05:31PM
IMI- I "feel" like I have an IMI somewhere, but not looking for it now. You inspired me to grab another oldie- a CMI 5410-C 8 MB drive. I've never tried it, nor want to. Heads are free (not stuck), platters turn smoothly.
Yeah, I'm not sure what was available on early IBM PCs and ATs. Probably a pretty crazy time with 3rd-party stuff popping up.
I just stumbled onto this interesting site:
https://forum.vcfed.org/index.php?threads/computer-memories-inc-hard-disk-collection-incomplete.1241088/ [vcfed.org]
(Score: 5, Insightful) by stormreaver on Sunday September 24 2023, @10:07PM (1 child)
This must be fake news, as cloud computing is the most secure form of computing ever conceived. The amazing geniuses that run the cloud are so much smarter than we are, and are so skilled that data leaks will be a thing of the past.
Worry not, dear readers, as you will soon wake up and realize that you dreamt the whole thing.
(Score: 4, Funny) by krishnoid on Sunday September 24 2023, @10:37PM
But if they were properly protected, they should have been able to prevent conception in the first place.
(Score: 4, Funny) by dwilson98052 on Monday September 25 2023, @12:48AM (1 child)
...there are a LOT of really really really stupid people there that barely know how to click a mouse.
(Score: 4, Funny) by looorg on Monday September 25 2023, @10:28AM
That is what happens when you un-bundle MS Solitaire from the installation ...
(Score: 3, Funny) by hendrikboom on Monday September 25 2023, @01:13AM (3 children)
It wasn't me. I cannot steal that much data, having no place to put it. I have just a 4 terabyte disk drive in my computer. And it has only about 2T free.
(Score: 3, Funny) by turgid on Monday September 25 2023, @07:01AM (2 children)
You could send it to /nev/null?
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 2) by hendrikboom on Wednesday September 27 2023, @01:16AM (1 child)
/dev/null maybe.
/nev/null would likely take disk space.
(Score: 2) by turgid on Wednesday September 27 2023, @06:55AM
That would be one of my lysdexic tripos. Plus typing on a phone.
(Score: 4, Interesting) by Rosco P. Coltrane on Monday September 25 2023, @01:46AM (6 children)
when the news came out. They still don't care.
My small company decided to wallow in Microsoft cloud services a few months ago. So now instead of having our own email server and our own IRC server for internal communication like we did for the past 40 years, we have a microsoft email server and Teams. And we use Sharepoint to send sensitive files to customers in lieu of our internal staging area. Etc Etc...
Why you ask? Because one of the two IT guys in the company is retiring, the other one isn't far from retirement, and the bean counters have decided it was cheaper to outsource our IT than hire new IT managers. Nevermind that we've had all these services and more in place for the past 35 years, they work perfectly fine and they've never been compromised.
I've tried to warn the powers that be that going with Microsoft was bad news: not only is it not sensible to entrust our company's data - the company's crown jewels - to a US company whose business it is to profit from other people's data and share it freely with US three-letter agencies, Microsoft is also notoriously incompetent when it comes to security.
So I sent that link to our officers to prove my point once again: if Microsoft can't even secure their own data, what makes you think they'll take good care of ours?
No reaction. They don't give a shit.
Although that may not be completely true: I have a feeling our CEO is receptive to my argumentation because I had the feeling he did have second thoughts about the move to go with Microsoft when I spoke with him the other day at the coffee break. But I also had the feeling that we were now invested so deep with Microsoft that we're now in a state of vendor lock-in, and it would be way too much work to backpedal.
Really sad...
(Score: 2) by janrinok on Monday September 25 2023, @06:16AM
This! Bosses would rather deny that they might have made a bad decision so they will simply ignore the problem and suggest that the local 'experts' don't know as much as they do.
In most businesses the bosses employ their own staff because it is cheaper in the long term than giving all the skillsets that their company relies on to an outside source. Many western nations have given away their own skills and manufacturing capabilities to other nations - and they are now discovering that they are far more dependent on other countries than they should be. But as many have commented in the last few months it is all about short-term profits and not long term business growth. If you think about it they have already admitted their mistake. In your particular case they are now saying that employing people to manage IT many years ago was a bad decision and that they should have outsourced much sooner. So either they were wrong then or they are wrong now. And as this is all about a current security breach I think it is pretty clear which decision is the one they should be apologising for and trying to undo.
People seem much more reluctant nowadays to admit mistakes. We all make them - they are a part of life. Perhaps we are becoming more like oriental races where 'face' is considered far more important than it should be.
I am not interested in knowing who people are or where they live. My interest starts and stops at our servers.
(Score: 2) by PiMuNu on Monday September 25 2023, @08:23AM (1 child)
> Sharepoint
Sh1tpoint FTFY
(Score: 2) by coolgopher on Monday September 25 2023, @10:44AM
>> Sharepoint
> Sh1tpoint
Sharepain
(Score: 2) by bloodnok on Monday September 25 2023, @05:45PM (2 children)
I've been there and it sucks. Much management simply doesn't get it. The attitude seems to be: "it hasn't happened here". And there is often misplaced trust in big name corporations at the C level.
In fairness, at the C-level level there is much other stuff to worry about. Like covering their asses, counting their bonuses and ensuring that messages of doom do not affect those bonuses.
If it helps, you've done the right thing by warning them. They will think you're a doom-monger, but you've done the professional thing.
For me, I got to the point where all I cared about was that the message was heard. I stopped giving a shit about the company's security and future and only about my role in it. I knew I could live with the fact that I'd warned them, and I always made detailed notes that I'd done so.
__
The major
(Score: 2) by Rosco P. Coltrane on Monday September 25 2023, @08:24PM (1 child)
I don't give a rat's ass about the company's security. I'm doing this for me: when my company shares its data with Microsoft, it also shares a sizeable amount of data on me with Microsoft, because I work here and employers hold a surprising amount of very personal information about their employees.
For example, the company wanted an internal facebook, so people knew who was who. They hired a photographer to take everybody's photo and upload them to - guess where - Microsoft servers. I refused. It took me weeks to finally get my company to relent and put an X on my Microsoft profile photo. And that's just one photo! Someone who's privacy conscious and still tries to go against the gigantic tide of corporate surveillance has a very difficult life in 2023, and has to fight over really silly things inch by inch. This sort of nonsense would have been unthinkable only 25 years ago.
And I'm not deluding myself: Microsoft didn't get my photo, but they probably got all the details about my disability when HR's files on me got transferred from our servers to Microsoft's. And my address, and my CV and whatever else I would never give Microsoft willingly in a million years. I mean if I really, truly didn't want that to happen, I guess the only way would have been to resign, and I like my job. Not to mention, if I found employment somewhere else, you can bet your shiny dollar they too use Microsoft's services, or Amazon's. Or Google's... So the only way to escape Big Data these days is to stop working, and I can't afford that.
But at least Microsoft didn't get my photo...
(Score: 3, Touché) by isostatic on Tuesday September 26 2023, @06:38AM
Wouldn’t it be more user friendly to have the “photo” be a QR code of your public key?
(Score: 3, Interesting) by jman on Monday September 25 2023, @02:18PM
Have a Truenas Scale box these days with a few WD 16TB drives on a RaidZ1. Have used about 1/3 of the nearly 35TB capacity, but have de-duplication turned on. Need to clean up one of these days; even with scads of DVD's and CD's can't really *need* all of whatever's strewn about on those drives. Part of all that though is shares for a couple of boxes using Time Machine.
Should also really get around to building another for off-site backup, but those big drives are still expensive! A WD Gold 22TB runs > $500. You can't build a NAS with just one. Well, you *can*, but it kinda defeats the purpose.