Intel's Clear Linux Helping AMD EPYC Genoa Hit New Performance Heights
https://openbenchmarking.org/embed.php?i=2212265-NE-2212268NE78&sha=5db1a51c4a7b&p=2
Dual-socket 96-core CPUs for 384 threads (320 usable until Clear Linux gets updated).
Update: Clear Linux Will Now Handle Up To 512 CPU Cores / vCPUs
(Score: 2) by janrinok on Monday January 02, @04:59PM
(Score: 3, Interesting) by DannyB on Thursday January 05, @04:00PM (6 children)
From the Phoronix article . . .
It would seem that Ubuntu, and perhaps upstream Debian, could apply the same OpenJDK tuning found in Clear Linux to obtain the same performance wins.
Now remembering back to things I've posted here on SN before about Java . . . at one point Java had a very unfortunate heap size limit of 4 TB. Back before or about the time of covid, that limit was increased to 16 TB, and later the limit was removed entirely, so Java now has no upper limit on heap size. Also, two new GCs (garbage collectors), which finally became production ready, make it possible to have many terabytes of memory with max GC pauses of only 1 millisecond.
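As a minimal sketch of the above: the two low-pause collectors are selected with JVM flags rather than in code, and the heap ceiling the JVM actually started with can be queried at runtime. The flag names below are real HotSpot flags; the class name and the 16t heap size are just illustrative.

```java
// Selecting a low-pause collector is done at JVM startup, e.g.:
//   java -XX:+UseZGC          -Xmx16t MyApp   (Oracle's ZGC)
//   java -XX:+UseShenandoahGC -Xmx16t MyApp   (Red Hat's Shenandoah)
public class HeapInfo {
    // Reports the heap ceiling the JVM was started with (-Xmx, or a default).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapBytes() + " bytes");
    }
}
```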
But . . . Linux is now the limiting factor. Last time I read about this, and information seems scarce, the limiting factor is (or was) that user space was limited to 128 TB. Corrections or additional info about this are appreciated.
Also, last I read, Java workloads are still limited to 768 cpu cores. However, hardware does not seem to be approaching this yet.
The two new GCs mentioned above are Red Hat's Shenandoah GC, and Oracle's ZGC.
The commercial contributors to Java development are always interesting to me. Especially because they don't invest resources into this out of the goodness of their hearts. They are doing it for pure profit. Somehow they seem to think that expending resources to improve the open source Java benefits them. The contributors I find amusing are:
I know Java is extremely unpopular in these here parts because of bias, strong beliefs in misinformation, and just plain being stuck in the past. Here are a few points I would make to address those.
If Java were really so terrible, it would not have been the #1 language for over 15 years and in the top 3 languages for over two decades. Obviously Java must be doing something right.
There is no one language that is perfect for every problem. If you think your favorite pet language is the only solution to every problem, then I doubt you really are a professional developer. If there were one perfect language for all problems, then we would all be using it already.
Once upon a time, every single cpu cycle and byte of memory was very expensive and developers were cheap. So it was very important to always optimize for cpu cycles and bytes. Things have flipped. Developers are very expensive and hardware is dirt cheap. You can throw an extra 64 GB of memory into a server for less than the cost of a developer plus benefits for one month. You should be optimizing for dollars, not cpu cycles and bytes. If you have it stuck in your head that you must optimize for cpu cycles and bytes over optimizing for software development and maintenance dollars, then you are officially too old and stuck in the past. If I beat my C++ using competitor to market by six months to a year, then we will take the whole market and my manager and I will laugh all the way to the bank while the C++ developer rants about how efficient his solution is.
GC is here to stay. Most new modern languages in the last two decades, with Rust being the exception that proves the rule, all have GC. There is a fork in languages. There are the languages that are great for working close to the hardware. And there are the languages better for writing applications and solving higher abstract problems unrelated to hardware organization.
GC lowers latency of an application. A server services a transaction. The transaction must earn the cost of servicing that transaction in order to make a profit. That includes the cost of "inefficient" GC for servicing this transaction. In a properly set up system the transaction does earn enough to pay the freight for both the workload and the GC together. However with GC, not one single cpu cycle of the transaction was spent on memory management. (no stupid reference counting) The primary workload that serviced the transaction could just malloc away quickly and cheaply. GC threads on other cpu cores will clean up the memory allocation mess left behind -- and do it later. The cost of that GC is paid for by the earnings from the primary workload -- but GC lowered the overall latency of the system in servicing transaction requests. As I said before, if you are optimizing for cpu cycles and bytes instead of optimizing for dollars --
you're doing it wrong. You're stuck in the long gone days of yore.

Java code is compiled, twice. First the Java compiler produces JVM bytecode. At runtime, the JVM begins by interpreting this bytecode. The code is continuously and dynamically profiled. As soon as a hot spot is discovered that is using a disproportionate amount of CPU, that code is quickly compiled by the C1 compiler into lightly optimized machine code for a big performance gain. If the system can prove that the C2 compiler will provide a meaningful gain in performance worth the cost of the C2 compiler's time, then that function will later be compiled to native code again by the C2 compiler. The C2 compiler will spend significant time and effort optimizing -- including aggressive inlining of code to remove function call overhead. The C2 compiler has access to the ENTIRE program and can make optimizations that an ahead-of-time compiler (e.g., for C, Pascal, etc.) cannot make. If C2 can prove that a given call site can only ever call this one method in this one class, then there is no need to go through the vtable. Just make a direct call.
I could go on. But I am biased based on things I've read here in the past about Java development. Yet I find it amusing to see the amount of money poured into Java development by big name companies, some of them quite surprising, because it makes them money.
How often should I have my memory checked? I used to know but...
(Score: 1, Interesting) by Anonymous Coward on Thursday January 05, @07:20PM (1 child)
Maybe I'm in the minority but I do like Java the language, in particular I like its lack of macros so I can say with increased certainty what a piece of code is actually doing. I don't like Oracle's meddling with Java (their recent attempt to change licensing terms for the product called "Java" for instance); OpenJDK provides some isolation but Oracle does hold a lot of relevant patents on useful parts of Java and could cause significant trouble if they wanted to (and I wouldn't put it past Larry Ellison to make a poor long term decision out of spite). I usually don't like the end-user experience of running things written in Java (nothing feels native, do I have the right JRE version, do I have to monkey around with a classpath, do I need to provide OS-specific launchers for end-users to understand how to run the application, for instance) but for Java on the server this doesn't matter nearly as much.
IIRC semi-recent analyses of GC performance show that it is at least as good as non-GC, if and only if you can afford to throw much more memory at the problem than you would otherwise need, and it can be very difficult to predict how much memory you will need in order to avoid unpredictable latency due to GC pauses. I don't care if the GC normally runs on a different thread these days, if you run out of memory you have to wait for the GC to free up some memory. You say memory is cheap, and it is in the short term, but with modern cloud providers you do pay more for larger instances with more memory, on an ongoing basis, forever; it does add up and solutions requiring less memory can be cheaper over time. Though ongoing developer maintenance also adds up so I don't want to say I disagree with you, just that I don't think it's always as clear cut as what I interpret your argument to be.
Also, while I think Hotspot and C2 and whatnot can be goddamn magical in terms of what can be optimized, other languages can also optimize on a whole program basis as well - by preferring to static link, at least it seems like that's the preference for a lot of newer (non-C non-C++) languages these days. Static linking avoids the Java slow start performance warm-up problem but of course has its own trade-offs.
(Score: 1, Interesting) by Anonymous Coward on Friday January 06, @01:57AM
I think your data is a bit out of date. Modern GCs do not require pausing at all. Your program can scream along at full speed on the execution threads completely unaffected performance-wise by the GC threads. There are other GC techniques that can mitigate the need for conventional GC completely. There are also tools and techniques, along with a deep understanding of how the GC system works, that reduce your memory overhead. In fact, I know of multiple production deployments that run without any GC pressure at all because they know how to manage and use their JVM to its full potential. And such expertise makes memory, even at ridiculous cloud pricing, downright cheap. This is doubly so when you remember memory is charged by the service tier and many large Java users don't use the public cloud for that kind of workload anyway.
I do agree with your "non-native feel" though. However, for the people who make big money running Java, that is a feature and not a bug. In some respect I think you inadvertently nailed it as "Big Java" trading the long tail of small client programs for the big money on the server and large-scale users.
(Score: 2) by takyon on Friday January 06, @12:06AM (1 child)
Logical or physical?
If it's logical, that limit could be hit soon with dual-socket of AMD's Zen 5 EPYC, or Intel's Sierra Forest E-core Xeon extravaganza. If not those two, then the following generation of each.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 1, Informative) by Anonymous Coward on Friday January 06, @02:34AM
There was a limit for physical "processors," which isn't exactly analogous to either logical or physical CPU cores. It was a limitation put in place to keep certain things in the JVM simple in terms of managing a large number of architectures for multi-processing. They had raised it over the years while improving that part of the VM and still trying to hide a lot of complexity involving processors sharing silicon, heterogeneous designs, NUMA, etc. However, IIRC there are later versions that removed that limitation completely and some VM implementations don't have a limit at all, but there is still the requirement that you need to have the memory for that many cores. But that doesn't hold everywhere, as there are some other VMs and vendors that do implement a limit on "cores" as they define (and charge for) them.
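One concrete way to see what the JVM itself counts, for what it's worth: the `Runtime.availableProcessors()` call reports how many processors the VM will use, and on recent JDKs it is also container/cgroup aware, which ties into the "cores as the vendor defines them" point above. The class name is just for illustration.

```java
// The JVM's own view of the processor count -- on recent JDKs this
// respects container CPU limits (cgroups), not just raw hardware.
public class CpuCount {
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("JVM sees " + cores() + " processors");
    }
}
```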
(Score: 1, Interesting) by Anonymous Coward on Friday January 06, @04:40AM (1 child)
For what it is worth, I still show some of your journal articles and comments to some as a basic introduction to all things Java. Some find your detractors funny and others find it sad. However, I show them less and less these days, as I am becoming less inclined to be personally connected to this site as the "not even wrong" and worse take over.
(Score: 2) by DannyB on Friday January 06, @03:29PM
Thank you.
How often should I have my memory checked? I used to know but...