Arthur T Knackerbracket has processed the following story:
Do you have your VMware ESXi hypervisor joined to Active Directory? Well, the latest news from Microsoft serves as a reminder that you might not want to do that given the recently patched vulnerability that has security experts deeply concerned.
CVE-2024-37085 only carries a 6.8 CVSS rating, but has been used as a post-compromise technique by many of the world's most high-profile ransomware groups and their affiliates, including Black Basta, Akira, Medusa, and Octo Tempest/Scattered Spider.
The vulnerability allows attackers who have the necessary privileges to create AD groups – which isn't necessarily an AD admin – to gain full control of an ESXi hypervisor.
This is bad for obvious reasons. Having unfettered access to all running VMs and critical hosted servers offers attackers the ability to steal data, move laterally across the victim's network, or just cause chaos by ending processes and encrypting the file system.
The "how" of the exploit is what caused such a stir in cyber circles. There are three ways of exploiting CVE-2024-37085, but the underlying logic flaw in ESXi enabling them is what's attracted so much attention.
Essentially, if an attacker was able to add an AD group called "ESX Admins," any user added to it would by default be considered an admin.
That's it. That's the exploit.
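To make the logic flaw concrete, here is a minimal toy model of it in Python. This is a sketch under stated assumptions, not ESXi code: the `Directory` class, the `is_host_admin` function, and the `ADMIN_GROUP_NAME` constant are all hypothetical names invented for illustration. The only detail taken from the story is that membership in a group named "ESX Admins" is treated as admin by default.

```python
# Toy model of the logic flaw described above; this is NOT ESXi code.
# Directory, is_host_admin and ADMIN_GROUP_NAME are hypothetical names.

ADMIN_GROUP_NAME = "ESX Admins"  # group name the story says is trusted by default


class Directory:
    """Minimal stand-in for a directory service: group name -> set of members."""

    def __init__(self):
        self.groups = {}

    def create_group(self, name):
        # Creating a group needs no special approval in this toy model.
        self.groups.setdefault(name, set())

    def add_member(self, group, user):
        self.groups[group].add(user)


def is_host_admin(directory, user):
    # The flaw: trust is keyed on the group's *name* alone, with no check
    # of who created the group or whether an admin ever approved it.
    return user in directory.groups.get(ADMIN_GROUP_NAME, set())


ad = Directory()
print(is_host_admin(ad, "mallory"))  # False: the group does not exist yet

# An account that can merely create groups now makes the trusted one.
ad.create_group(ADMIN_GROUP_NAME)
ad.add_member(ADMIN_GROUP_NAME, "mallory")
print(is_host_admin(ad, "mallory"))  # True
```

The point of the sketch is that the check never asks where the group came from, so the right to create groups quietly becomes the right to administer the host.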
[...] Broadcom said in a security advisory that it already issued a patch for CVE-2024-37085 on June 25, but only updated Cloud Foundation as recently as July 23, which is perhaps why Microsoft's report only just went live.
Jake Williams, VP of research and development at Hunter Strategy and IANS faculty member, was critical of Broadcom's approach to security, especially with regard to the severity it assigned the vulnerability.
[...] "I can only conclude Broadcom is not serious about security. I don't know how you conclude anything else. Oh also, there are no patches planned for ESXi 7.0."
Many commentators have questioned why an organization would join their ESXi hosts to AD in the first place, despite it being a relatively common practice.
"Why are ESX servers joined with an active directory in the first place? Because it is convenient to manage admin access to servers using a centralized platform in large corporations," Dr Martin J Kraemer, security awareness advocate at KnowBe4, told The Register.
"This is very common but also creates challenges. In many environments, the AD itself might run on a VM. Cold boot can be a nightmare. A chicken and egg problem. How can you start ESX without AD while AD runs on ESX? Admins must think about this. A well-known challenge.
[...] "Over the last year, we have seen ransomware actors targeting ESXi hypervisors to facilitate mass encryption impact in few clicks, demonstrating that ransomware operators are constantly innovating their attack techniques to increase impact on the organizations they target," it said.
Microsoft also said that ESXi hypervisors often fly further under the radar in security operations centers (SOCs) because security solutions often don't have the necessary visibility into ESXi, potentially allowing attackers to go undetected for longer periods of time.
Because of the destruction a successful ESXi attack could cause, attacks have risen sharply. In the past three years, the targeting of ESXi hypervisors has doubled.
[...] Microsoft recommends that all ESXi users install the available patches and scrub up their credential hygiene to prevent future attacks, as well as use a robust vulnerability scanner, if they don't already.
(Score: 2, Insightful) by Anonymous Coward on Sunday August 04 2024, @02:46AM (1 child)
As for chicken and egg you might have more than one ESXi host. Go figure.
Is this another of those "OMG if attacker have root they can get root" exploits? Seems common nowadays among "security researchers". Sure it's a weakness in some scenarios but they tend to write the stuff to overplay the real world impact.
(Score: 4, Interesting) by VLM on Sunday August 04 2024, @04:26PM
It's a delegation problem. If, out of the box, ESXi defaults to giving "ESX Admins" root, which from memory I think it did (it's been a year or two), then the sysadmins have to make two mistakes at the same time:
1) Fail to configure an ESX Admins group in AD. Admittedly VMware could work around this by "blowing up" at install if the group doesn't already exist. I have to admit, when I set this up, I set up an ESX Admins group, because why would I be connecting my ESXi hosts to AD unless I wanted to auth using AD? And they helpfully preconfigure (IIRC) the ESX Admins group, so I used that. Worked great IIRC.
2) Delegate to some minor functionary, perhaps as minor as any employee or contractor on the help desk, the ability to create groups "you know for mailing lists and stuff like that" which seems quite harmless until they create a new group named "ESX Admins" and thus obtain root/admin on all the ESXi hosts.
I wonder how dumb/simple the query is on ESXi hosts. Could I create an AD group named "EsX aDmInS" or maybe use weird unicode and 50% of the time it would auth against my group instead of the real one? That would be funny.
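For what it's worth, the difference between a strict and a sloppy name comparison is easy to sketch. Nothing in the story says how ESXi actually compares group names, so both functions below are hypothetical; `loose_match` simply assumes a lookup that folds case and applies Unicode NFKC normalization before comparing.

```python
import unicodedata

TRUSTED = "ESX Admins"  # the group name from the story


def exact_match(name):
    # Strict comparison: only the byte-for-byte name is accepted.
    return name == TRUSTED


def loose_match(name):
    # Hypothetical sloppy comparison: fold compatibility characters
    # (NFKC) and case before comparing, so look-alike names collide.
    def norm(s):
        return unicodedata.normalize("NFKC", s).casefold()

    return norm(name) == norm(TRUSTED)


mixed_case = "EsX aDmInS"
# Build a fullwidth look-alike by shifting each non-space ASCII letter
# into the Unicode fullwidth block (U+FF01..U+FF5E).
fullwidth = "".join(c if c == " " else chr(ord(c) + 0xFEE0) for c in TRUSTED)

for candidate in (TRUSTED, mixed_case, fullwidth):
    print(repr(candidate), exact_match(candidate), loose_match(candidate))
```

With a strict comparison only the real name matches; with the loose one all three candidates do. Whether a second, look-alike group could even exist alongside the real one depends on the directory's own uniqueness rules, which the sketch does not model.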
(Score: 2, Informative) by Anonymous Coward on Sunday August 04 2024, @02:54AM (7 children)
You don't. AD is already running.
Unless you're asking: "How do you start an Active Directory domain when there are no domain controllers?"
You don't. You create one.
There *must* be at least one domain controller running at all times. To not have that is a flaw. It's not supported. You're in the realm of "Pay MS $500 for a support ticket to get me out of the shit-fest I've found myself in."
But yeah yeah, this is totally unrelated to the article -- which is about giving unexpected admin privileges to any domain users who are able to create AD groups and add members to them. (Oh, I dunno, maybe any other existing AD group controlling ESX admin access. Oh there isn't one? Why is the ESX server joined to the domain, again? How is its admin password controlled? By a password-manager that uses domain authentication. Gotcha.)
So... what?
What's the problem here?
Oh. Another click-bait article by a wants-to-be-famous security-someone. Yeap. Ok.
(Score: 5, Informative) by AlwaysNever on Sunday August 04 2024, @08:09AM
The problem is that AD is fundamentally insecure because of how it is run in the real world. And if you have your vCenter joined to the AD, then not only do the members of the Domain Admins group have root on your ESXi infrastructure, but so does any regular user who lands in the unexpected "ESX Admins" group.
I've seen the Domain Admin credentials stolen and all Windows Servers encrypted, with only the Linux servers standing. With this new approach, if the Linux servers are VMs then they won't be spared any longer in those kinds of attacks.
(Score: 3, Interesting) by Anonymous Coward on Sunday August 04 2024, @02:21PM (2 children)
I had the misfortune of using ESX once for about a week while I installed open source tools and got the hell away from that steaming pile of garbage....on one hand I hope that's what you're talking about when you say it's a "critical" flaw to have zero DCs running...but then you say you have to pay Microsoft for support if you find yourself in that situation...
We regularly test that exact situation. Last weekend in fact. We shut down ~70 Windows VMs across several states--all of them domain controllers. Then we shut down the underlying (non-vmware, open source) infrastructure. Then we powered off the hypervisors.
We all had a cup of coffee and BS'd for a few minutes, then we rolled one domain controller back by 15 minutes to test our disaster recovery plan and brought everything back online.
Zero problems.
You can totally "black start" AD. If you couldn't, AD would be worthless.
If you're specifically talking about ESX being dependent on AD to start...I have my doubts...but I don't really have any experience with it, so I can't say for certain....but it sounds like you would pay VMware/Broadcom for support, not Microsoft.
(Score: 2) by VLM on Sunday August 04 2024, @04:13PM (1 child)
It would be hard to install initially if you couldn't install one machine first.
The main problem I can see is what happens if you partition, and one half of the cluster thinks it has the FSMO roles and the other half ALSO thinks it has the FSMO roles, and then connecting them back together could be exciting. Like a partial upgrade where not all servers are done at the same time, or a very exciting network outage.
(Score: 0) by Anonymous Coward on Sunday August 04 2024, @09:33PM
I don't think FSMO roles work like that. They don't bounce around the network from DC to DC based on seeing > n+1 DCs. You have one machine that handles a particular FSMO role, and if it's offline for a short time it's not really a big deal. For example, the RID master simply hands out a bunch of IDs to every other DC on the network for use. If it goes down for an hour or a week it's not a problem until your DCs start running out of RIDs.
That's kind of the whole point of the multi-master replication setup. There's no one single point of failure.
And even if your FSMO holder has something really bad happen that takes it offline permanently, you can just "seize" the FSMO roles and transfer them to another working DC.
If I recall correctly, the old FSMO holder shouldn't be brought back online ever again or bad stuff might happen.
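The RID point above can be shown with a small toy model. This is a simplified sketch, not how AD is implemented: the `RidMaster` and `DomainController` classes, the starting RID of 1000, and the tiny block size are all assumptions made up for the example. The only behavior it tries to capture is the one described in the comment: each DC works from a locally held block of IDs, so the master being offline only hurts once that block runs dry.

```python
# Toy model of RID block allocation; class names and numbers are hypothetical.

class RidMaster:
    """Hands out non-overlapping blocks of relative IDs to DCs."""

    def __init__(self, block_size=3):
        self.block_size = block_size
        self.next_rid = 1000
        self.online = True

    def allocate_block(self):
        if not self.online:
            raise ConnectionError("RID master unreachable")
        start = self.next_rid
        self.next_rid += self.block_size
        return list(range(start, start + self.block_size))


class DomainController:
    """Creates objects using RIDs from a locally cached block."""

    def __init__(self, master):
        self.master = master
        self.pool = master.allocate_block()

    def new_rid(self):
        if not self.pool:
            # Only now does the DC need to reach the RID master again.
            self.pool = self.master.allocate_block()
        return self.pool.pop(0)


master = RidMaster(block_size=3)
dc = DomainController(master)

master.online = False  # RID master goes down
print([dc.new_rid() for _ in range(3)])  # still works from the cached block

try:
    dc.new_rid()  # cached block is exhausted, master is still offline
except ConnectionError as exc:
    print("cannot create more objects:", exc)
```

In this model a second DC would get its own block and keep working independently, which is the "no single point of failure" property the comment describes, at least until the blocks run out.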
(Score: 3, Interesting) by VLM on Sunday August 04 2024, @04:09PM (2 children)
Having used both technologies, I can't make much sense out of the article's claim.
Maybe, if enough ESXi hosts can't reboot after a power failure, because I donno why, then it could be fixed by logging in as an individual host admin to an individual ESXi host. But I'm struggling to figure out how to do that as a guy who's recovered from situations like that... Maybe if you really F-ed up the VLANs and network connections it might be simpler to fix the host to match than to fix the attached networking gear to work; that's all I can think of at this time. Like if lightning or similar zapped your ethernet switches it might be easier to configure the ESXi hosts to talk over a replacement switch than to configure a replacement switch to match the burned out switch if you did something bizarre. This kind of thing is why vmware and proxmox and probably others suggest many interfaces as simple as possible. Of course if you had vSAN, management, vMotion, and production traffic, and four switches got vaporized and you only got two ready to go on the shelf (or worse, one), you'd need to do something creative to get it even minimally working.
It's WAY more exciting to set up your NFS access at the vCenter/ESXi level using domain names to access the NAS and then host the DNS servers on that NAS then power cycle the cluster. That would be funny. That's why they say to always configure cluster NFS (and cluster everything, pretty much) to not use DNS, use IP addresses. Even stuff like NTP needs IP addresses; some crypto stuff gets all out of whack if times are wrong, and times will be wrong if you configure NTP using DNS and there's a major DNS outage.
Or imagine, if the admin were completely insane, I think ESXi can be configured to use DHCP assigned IP addresses (or, maybe your SAN/NAS, that would also be funny) and then host the DHCP servers as VMs on the cluster, so if the cluster ever goes down it cannot reboot again. Reality is you'd grab a linux laptop and stick a DHCP server on it to bootstrap stuff.
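The DHCP scenario above is a circular boot dependency, and that kind of loop can be caught on paper before a power failure finds it. Below is a minimal sketch using Python's standard `graphlib` module (Python 3.9+). The component names and the `boot_order` helper are hypothetical; the dependency maps are just one way to write down the setup described above, where each key lists the things that must be up before it can start.

```python
from graphlib import CycleError, TopologicalSorter


def boot_order(deps):
    """Return a valid start order, or None if the dependencies are circular.

    deps maps each component to the set of components it needs first.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# ESXi hosts get their address from DHCP, but the DHCP server is a VM
# that can only run once an ESXi host is up: a loop.
circular = {
    "esxi-host": {"dhcp-server"},
    "dhcp-server": {"esxi-host"},
}

# Same setup with a static IP on the host: the loop is gone.
static_ip = {
    "esxi-host": set(),
    "dhcp-server": {"esxi-host"},
}

print(boot_order(circular))   # None
print(boot_order(static_ip))  # ['esxi-host', 'dhcp-server']
```

The same map works for the DNS, NTP, and NFS cases: write down what each piece needs at boot, and any loop shows up as `None` instead of as a cluster that won't come back.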
I don't think there is a direct way to prevent an ESXi host from rebooting if its AD connection is down; it MIGHT be possible to come up with a scenario where vCenter refuses to restart so you'd have to log into an individual ESXi host to kick it somehow, and if the AD were down it would be harder to log into the ESXi server.
The solution to this is ANCIENT in the telecom world: you have a root password on each device stored in a sealed envelope in a safe or similar that's only opened in extreme emergencies. So if your centralized auth is down you can still log into everything as root or equivalent, and afterwards the envelope being ripped open indicates passwords have to be reset on literally everything. This can be emulated on ESXi pretty easily (and also on proxmox, although proxmox has fewer single points of failure so it doesn't matter as much). I don't think you can avoid configuring ESXi like that, but poor admins could certainly toss out the non-AD admin password or otherwise change it and not record it.
(Score: 3, Interesting) by VLM on Sunday August 04 2024, @04:16PM (1 child)
Oh, I just thought up a way to crash an overall system using AD. Set up your NAS/SAN to require AD auth only to maintain it, and mess it up such that it can't operate until someone logs into it and "fixes" it or, even more insane, you're doing some kind of "store your NFSv4 credentials in AD" somehow. Then host your AD on a cluster that uses the NAS/SAN for storage, then reboot the entire thing at once (power failure simulation).
(Score: 0) by Anonymous Coward on Sunday August 04 2024, @09:37PM
That shouldn't be a problem unless you are strictly using Microsoft's brain-damaged technologies.
I run a crap-ton of Linux boxes that all run Samba and do virtualization. Samba on the Linux box is joined into AD that is running in a VM.
When the boxes boot, Samba can't talk to AD so you have no fileshares....then you boot the VM and Samba can start authenticating you.
Even if you were brain damaged enough to set the Linux box to use domain authentication for things like SSH or web GUI access...you can still easily get in to Linux to disable the stupidity to get things booted back up.