Cloudflare says Intel is not inside its next-gen servers – Ice Lake melted its energy budget:
Cloudflare has revealed that it was unable to put Intel inside its new home-brew servers, because they just used too much energy.
A Tuesday post by platform operations engineer Chris Howells reveals that Cloudflare has been working on designs for an eleventh-generation server since mid-2020.
"We evaluated Intel's latest generation of 'Ice Lake' Xeon processors," Howells wrote. "Although Intel's chips were able to compete with AMD in terms of raw performance, the power consumption was several hundred watts higher per server – that's enormous."
Fatally enormous – Cloudflare's evaluation saw it adopt AMD's 64-core Epyc 7713 for the servers it deploys to over 200 edge locations around the world.
Power savings also influenced a decision to go from three disks to two in the new design. A pair of 1.92TB Samsung drives replaced the three 960GB units from the Korean giant found in previous designs. The net gain was roughly a terabyte of capacity and six fewer watts of power consumption.
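The capacity figure above can be checked with a few lines of arithmetic. This is only a sketch: the drive counts and sizes come from the summary, while the variable names are illustrative and not from Cloudflare's post.

```python
# Drive counts and sizes are taken from the summary above;
# the variable names are illustrative only.
old_drives_tb = [0.96] * 3   # three 960GB units in the previous design
new_drives_tb = [1.92] * 2   # two 1.92TB units in the new design

old_capacity_tb = sum(old_drives_tb)
new_capacity_tb = sum(new_drives_tb)
net_gain_tb = new_capacity_tb - old_capacity_tb

print(f"old: {old_capacity_tb:.2f} TB, new: {new_capacity_tb:.2f} TB, "
      f"gain: {net_gain_tb:.2f} TB")
```

The gain works out to just under a terabyte, which matches the "terabyte of capacity" wording.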
[...] "We investigated higher-speed Ethernet, but we do not currently see this as beneficial," Howells wrote. That's not a brickbat for fast Ethernet, but a decision made possible by Cloudflare's highly distributed architecture that removes the need for higher speeds and the higher cost of faster kit.
See also:
- https://hothardware.com/news/cloudflare-dumps-intel-amd-epyc-gen-x-servers
- https://www.techzine.eu/news/cloud/64718/cloudflare-chooses-amd-to-power-its-new-generation-of-servers/
(Score: 3, Insightful) by Rosco P. Coltrane on Thursday September 02 2021, @07:45PM (2 children)
Cloudflare has put a sizeable portion of the internet under surveillance, prevents people from browsing through Tor, hampers your browsing with captchas right and left, but at least they do it ecologically. That makes them a-okay I guess.
(Score: 1, Interesting) by Anonymous Coward on Friday September 03 2021, @04:50AM (1 child)
Fuck cloudflare.
Every time I see them it's because they've hijacked my URL and are there to tell me they can't service my request.
Who gave these unreliable fuckers permission to intercept my traffic in the first place?
(Score: 2) by maxwell demon on Friday September 03 2021, @03:57PM
Be glad that is the only thing you get. I sometimes get a message that there's suspicious activity (I use a non-standard browser and block a lot of stuff), and therefore I need to confirm that I'm not a robot, which of course doesn't work with my blocks; I never was desperate enough to get to the page to figure out what exactly I would have to allow. Well, those sites got a visit less each time that happened.
Those operating the web site you tried to contact.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 5, Interesting) by AlienInterview on Thursday September 02 2021, @07:54PM (11 children)
Intel used to be the absolute master of microprocessor design and manufacturing when energy consumption was a consideration (I loved my SPARC and POWER CPUs but damn were they power hungry and hot). Intel started getting slow and weak in terms of process size at least 5 or maybe even 10 years ago and, if things don't change, will be overtaken by TSMC in that domain in a few years. Apple just beat Intel on some considerable metrics with their homebrew CPU, with paths for lots of performance increases in the future according to some analysis I read. Those techniques can't be applied as well to the x86 instruction set, so Intel cannot benefit from them.
Now Intel can't even beat AMD in terms of practical computational output over power consumption? I've always known AMD to be good enough and cheap but only because it was going to be hot. Hot was fine with me.
Intel is doomed unless they can get their act together.
(Score: 3, Interesting) by DannyB on Thursday September 02 2021, @08:57PM (5 children)
I wonder what Cloudflare thinks about "management engines" that run Minix?
The Centauri traded Earth jump gate technology in exchange for our superior hair mousse formulas.
(Score: 2, Interesting) by Anonymous Coward on Thursday September 02 2021, @09:03PM (2 children)
I think they are highly questionable! Intel took it to the next level, though. Not only is there a management engine in all of their CPUs, there is also a lights-out access part to the management engine. It's hooked into the Ethernet port on the motherboard, it runs its own PHY and network stack, it constantly sits and waits for a provisioning packet to establish control of the system, and you can't even turn it off!
(Score: 4, Informative) by hendrikboom on Friday September 03 2021, @02:02AM (1 child)
You can somewhat turn the management engine off. It needs to be on while booting, but once the processor is properly initialized, it is disabled on the Intel processor in my Purism laptop.
The disabling technology is rumoured to have been developed at the request of the NSA, who I imagine doesn't like backdoors in their hardware.
And it is possible to design a motherboard that doesn't connect the ethernet directly to the CPU.
-- hendrik
(Score: 0) by Anonymous Coward on Friday September 03 2021, @03:04AM
That they can't control. I'm sure it's different if they are one of the few approved parties who can control it.
(Score: 0) by Anonymous Coward on Thursday September 02 2021, @09:25PM
in soviet amerika cloudflare surveils you. you don't surveil them.
(Score: 0) by Anonymous Coward on Thursday September 02 2021, @09:48PM
They probably appreciate the TPM, because it's actually useful for their purposes. In your laptop, not so much.
(Score: 2) by Username on Thursday September 02 2021, @09:39PM (1 child)
They still are the approved supplier of processors for all the defense and aerospace applications that I'm familiar with. They will at least have that.
(Score: 1) by AlienInterview on Thursday September 02 2021, @10:34PM
I don't think Intel is ever going to die for reasons like this. They are not only slow and weak now but also really fat and lazy. And they aren't the kind of fat where you have energy that's stored up so you can not eat food for a while. They are fat from parasites internally because of the amount of dead weight employees and inefficient processes. I have a friend that works at Intel and tells me stories. It is ugly.
(Score: 0, Insightful) by Anonymous Coward on Thursday September 02 2021, @11:56PM (2 children)
Intel chips have had lower performance per watt than AMD for as long as I can remember, and I've been using AMD since Cyrix went under in the '90s. The advantage of Intel has always been that their high end processors were faster than AMD's top of the line could achieve, so the lucrative high end servers, workstations, and gaming rigs all wanted Intel. Cloudflare's edge servers are solidly in bang-for-buck territory where AMD has long been the leader, so going with AMD shouldn't be a surprise.
(Score: 0) by Anonymous Coward on Friday September 03 2021, @08:10PM (1 child)
(Score: 0) by Anonymous Coward on Friday September 03 2021, @08:48PM
n/t
(Score: 2) by Freeman on Thursday September 02 2021, @08:09PM
AMD has been making inroads in the server market. This is just another big deal going to AMD instead of Intel. It's nice to see AMD doing well, but we definitely don't want a monopoly like the one Intel used to have, and to some extent still has. I've been using AMD since I built my first computer, because of the bang for the buck. Now it's less about bang for the buck and more about when I can get parts at normal prices, thanks. AMD is knocking it out of the park, but their success has been hampered somewhat by the pandemic. Then again, all the chip manufacturers seem to be behind the 8 ball on this one.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 2) by krishnoid on Thursday September 02 2021, @08:35PM (9 children)
From TFA:
I'm surprised that more RAM wouldn't help, but ~10% (?) faster RAM did -- I suspect that the improved SSD throughput made most of the difference in improving disk->network performance. 50-gigabit Ethernet per caching server is certainly respectable enough.
(Score: 4, Interesting) by Runaway1956 on Thursday September 02 2021, @11:16PM (4 children)
If your workload doesn't fill up the RAM, then more RAM won't help speed things up. I've never seen any guideline, but offhand, I would say that you want to keep your RAM at ~80% utilization. You want some extra for peak demand, but any more than that is wasted money. However, you gotta key into part of your quoted text.
It seems that more RAM maybe helped things a little bit, but the metrics didn't justify spending that much money. Or, to oversimplify, a 0.05% gain in performance won't justify 6% more expenditure.
That said, it really shouldn't surprise anyone that faster memory will improve performance. Faster is almost always better, right?
Anecdotally, I've recently upgraded my own memory to ~10% faster memory. I ran benchmarks before and after upgrade. All benchmarks improved incrementally after upgrade, some more than others. Overall, performance seems to have improved slightly. If I were the IT guy in charge of fifty machines, I don't think I would repeat the upgrade for all of them. However, I would think it justified to spec 50 new machines with faster memory. Thousands of machines? Maybe I would just recommend it for more critical machines, and not all of them.
“I have become friends with many school shooters” - Tampon Tim Walz
(Score: 3, Interesting) by krishnoid on Friday September 03 2021, @12:00AM (1 child)
As of Windows 7, and on Linux since time immemorial, I believe any RAM not explicitly used by the processes and kernel is available for disk caching (especially read-caching). If you use up 100% of your RAM and need more for an application, the kernel can just mark those pages as no longer part of the filesystem cache and re-map them into the virtual memory of a process that needs them, probably without even clearing them.
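On Linux, how much RAM is currently serving as page cache can be read from /proc/meminfo. The sketch below parses that format and reports the cached share. The `MemTotal`, `MemAvailable` and `Cached` fields are real /proc/meminfo fields; the function names and the sample numbers are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(parts[0])  # most values are in kB
    return fields


def cache_share(fields):
    """Fraction of total RAM currently used as page cache."""
    return fields["Cached"] / fields["MemTotal"]


# Made-up sample in the /proc/meminfo format (not from a real machine).
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:          512000 kB
MemAvailable:    9000000 kB
Cached:          8192000 kB
"""

sample_fields = parse_meminfo(SAMPLE)
print(f"page cache share: {cache_share(sample_fields):.0%}")

# On a real Linux box, something like:
#   with open("/proc/meminfo") as f:
#       print(cache_share(parse_meminfo(f.read())))
```

A low `MemFree` next to a high `MemAvailable` is the usual sign that the "used" RAM is mostly reclaimable cache.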
(Score: 1, Informative) by Anonymous Coward on Friday September 03 2021, @02:09AM
Linux keeps a small pool of blank pages to speed up anonymous memory allocation, and unless you've disabled the security features at compile time it will never pass dirty pages to a process. Otherwise, you are correct.
(Score: 2) by krishnoid on Friday September 03 2021, @12:05AM (1 child)
True, I think we're at a point where performance-primary systems benefit from faster RAM being able to keep up with modern SSD performance and their corresponding higher-speed connections -- the worm has turned.
(Score: 0) by Anonymous Coward on Friday September 03 2021, @02:15AM
RAM is still orders of magnitude faster than SSDs, even using NVMe. The performance boost is more likely either from the ethernet cards maxing out memory bandwidth or whatever processing their server software needs to do to fill requests.
(Score: 0) by Anonymous Coward on Friday September 03 2021, @12:08AM (2 children)
These are largely dumb file servers so increasing RAM doesn't give nearly the boost that it would for a task server. Frankly I'm surprised that increasing RAM speed made enough of a difference to be worth it, since file servers are normally IO bound. That is also why they can get away with using cheaper CPUs, since file servers shouldn't ever be CPU bound.
(Score: 2) by hendrikboom on Friday September 03 2021, @02:07AM (1 child)
They may be serving the same content to many clients. No point fetching it anew from disk for each client.
(Score: 0) by Anonymous Coward on Friday September 03 2021, @08:44PM
That is why caching has any benefit at all, but cache hits tend to follow a Gaussian distribution so the first quarter is always more beneficial than the last half.
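The diminishing-returns point can be illustrated with a toy model. Note the assumptions: object popularity here follows a Zipf-like 1/rank curve (a common modelling choice for web content, not the Gaussian mentioned above and not anything measured by Cloudflare), and the cache is idealised as always holding the most popular objects.

```python
def hit_rate(weights, cached_count):
    """Hit rate of an idealised cache holding the `cached_count` most popular objects."""
    ranked = sorted(weights, reverse=True)
    return sum(ranked[:cached_count]) / sum(ranked)


N = 1000  # hypothetical number of distinct objects
# Assumed popularity: request weight proportional to 1/rank (Zipf-like).
zipf_weights = [1.0 / rank for rank in range(1, N + 1)]

# Hits gained by caching the most popular quarter of the objects...
first_quarter = hit_rate(zipf_weights, N // 4)
# ...versus the extra hits gained by growing the cache from half to everything.
last_half = hit_rate(zipf_weights, N) - hit_rate(zipf_weights, N // 2)

print(f"first quarter of cache: {first_quarter:.1%} of requests")
print(f"last half of cache:     {last_half:.1%} of requests")
```

Under these assumptions the first quarter of the cache accounts for far more hits than the last half, which is the shape of the argument for not maxing out RAM.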
(Score: 2, Interesting) by Anonymous Coward on Friday September 03 2021, @04:21AM
There were some caveats with memory speed with Epyc (at least when we looked into it at work in 2020, so maybe specifics are different for the Epycs that were released this year).
In the 2020 Epycs, 3200 MT/s RAM was supported, but would cause latency issues due to the Infinity Fabric clock being unable to sync with that frequency. Optimal frequencies for clock sync / low latency memory access were: 2933, 2666 and 2400.
(we were going to buy 3200 MT/s RAM, and de-rate it to 2933)
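A rough way to see why 3200 MT/s falls out of sync: DDR memory transfers twice per clock, so the memory clock is half the transfer rate, and it can only run 1:1 with the Infinity Fabric clock if the fabric can reach that frequency. The sketch below assumes a fabric clock ceiling of 1467 MHz for that Epyc generation; that figure is an assumption for illustration and is not stated in this thread, and the function names are hypothetical.

```python
# Assumption (not from the thread): the fabric clock tops out at about 1467 MHz.
ASSUMED_MAX_FCLK_MHZ = 1467


def memclk_mhz(transfer_rate_mts):
    """DDR transfers twice per clock, so the memory clock is half the MT/s rating."""
    return transfer_rate_mts / 2


def can_run_synchronously(transfer_rate_mts, max_fclk_mhz=ASSUMED_MAX_FCLK_MHZ):
    """True if the fabric clock can match the memory clock 1:1."""
    return memclk_mhz(transfer_rate_mts) <= max_fclk_mhz


for rate in (3200, 2933, 2400):
    mode = "in sync" if can_run_synchronously(rate) else "out of sync (extra latency)"
    print(f"DDR4-{rate}: MEMCLK {memclk_mhz(rate):.1f} MHz -> {mode}")
```

Under that assumption, de-rating 3200 MT/s modules to 2933 keeps the two clocks matched.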
(Score: 0) by Anonymous Coward on Saturday September 04 2021, @03:38AM
Cloudflare are flat-Earthers!