AMD released Threadripper CPUs in 2017, built on the same 14nm Zen architecture as Ryzen, but with up to 16 cores and 32 threads. Threadripper was widely believed to have pushed Intel to respond with the release of enthusiast-class Skylake-X chips with up to 18 cores. AMD also released Epyc-branded server chips with up to 32 cores.
This week at Computex 2018, Intel showed off a 28-core CPU intended for enthusiasts and high-end desktop users. The part ran at 5 GHz, but only when overclocked and cooled by a one-horsepower water chiller. The demonstration seemed to be timed to steal the thunder from AMD's own news.
Now, AMD has announced two Threadripper 2 CPUs: one with 24 cores, and another with 32 cores. They use the "12nm LP" GlobalFoundries process instead of "14nm", which could improve performance, but are currently clocked lower than previous Threadripper parts. The TDP has been pushed up to 250 W from the 180 W TDP of Threadripper 1950X. Although these new chips match the core counts of top Epyc CPUs, there are some differences:
At the AMD press event at Computex, it was revealed that these new processors would have up to 32 cores in total, mirroring the 32-core versions of EPYC. On EPYC, those processors have four active dies, with eight active cores on each die (four for each CCX). However, EPYC has eight memory channels, and AMD's X399 platform only has support for four. For the first generation this meant that each of the two active dies would have two memory channels attached. In the second-generation Threadripper this is still the case: the two newly 'active' dies do not have direct memory access.
This also means that the number of PCIe lanes remains at 64 for Threadripper 2, rather than the 128 of Epyc.
Threadripper 1 had a "game mode" that disabled one of the two active dies, so it will be interesting to see if users of the new chips will be forced to disable even more cores in some scenarios.
Serious question, is this really useful for gamers today in 2018?
I only use a couple cores for Minecraft and my Minecraft server, even with extensive mods, so I don't know about twitchy FPS sequels; maybe those use all the cores, I dunno. 28 cores seems a little far-fetched. I think my server has 4 allocated, but it's usually not using all 4.
Ironically for a product marketed to gamers, I have a real-world use for multicore in my VMware cluster in the basement. In terms of system design, where only the weakest link matters, I would think my memory bandwidth or SSD would saturate long before all 28 cores are in use, but who knows, maybe not.
You mistyped "why not?"
What can I say, you get addicted to the virtualization lifestyle at work, next thing you know you got a cluster in the basement. All legal if for non-commercial use and the ESX-experience VMUG deal. I spend a lot more on electricity and hardware than I do on VMUG membership, that's for sure, LOL.
I have somewhat seriously discussed placing a server in the attic and running wiring around the house to allow thin clients, or even just monitor/keyboard/mouse stations, to be placed in any room so that everyone in the house has access to their own computing environment from anywhere in the house. Seems like the perfect use for a personal VMware cluster. A basement is fine, too.
Game authors _could_ use massively multi-threaded processing for many things (think: processing NPC logic and decisions), but game authors are driven by the mass market: what does their buying audience have access to, so, no, I doubt many game authors are going to go massively multi-threaded anytime soon. Lots of "gaming systems" are still just dual core.
If they develop their games on workstations, and test them on both the workstations and dual or quad-core PCs, they should be able to design engines/games that use even as many as 64 threads. Games like Skyrim ran pretty well on low-end systems as well as high-end systems. That's the way it should work: set your minimum requirements, as low as possible if you want to ensure people can at least run it, but scale to use everything available.
While Steam stats show many 2-4 core users, we now have 6-8 core "mainstream" chips from both AMD and Intel. The latest consoles also allow game developers to access about 6-7 of the 8 cores. The writing is on the wall, and people should design their stuff to work well with 64 or more threads even if customers don't have that yet. Even if the utilization is just running some basic parallel RadiantAI type stuff on every thread, but doing most of the work on 4 cores, so be it. The bleeding edge users should be able to enjoy a game mode that allows for thousands of NPCs.
people should design their stuff to work well with 64 or more threads even if customers don't have that yet
Who do you think runs EA, Activision and UbiSoft? Where's their motivation?
I agree with you, but I don't think that they do.
The current question tends to be: will it run on Xbox/PS4 at reasonable settings, and how much does the extra PC eye candy cost?
massively multi-threaded processing for many things (think: processing NPC logic and decisions),
Many games have a multiplayer option. So a popular way to reduce bandwidth is that the NPC logic and decisions are actually deterministic based on committed player input - which doesn't take that much bandwidth. The same actions by the same players at the same time will have the NPCs doing the same exact things. If the NPCs were not so deterministic the various client PCs will have to send each other zillions of detailed updates of the various independent NPCs, instead of mainly only sending the player actions to each other.
Making it massively multithreaded AND deterministic is possible but not so simple if you actually want it to be faster... And even less simple if you want it to seem even more intelligent...
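To make the determinism point concrete, here is a minimal Python sketch, not taken from any real game engine. It assumes each NPC gets its own random generator derived from a shared match seed, so an NPC's decisions depend only on that seed and the committed player inputs, never on the order (or thread) in which NPCs happen to be updated. The names `make_npcs`, `step`, and `simulate` are made up for illustration.

```python
import random


def make_npcs(seed, count):
    # One private generator per NPC, derived from the shared match seed.
    # (Hypothetical scheme: any fixed, collision-free derivation would do.)
    return [{"pos": 0, "rng": random.Random(seed * 1_000_003 + i)}
            for i in range(count)]


def step(npc, player_move):
    # The NPC "decision" uses only its own generator and the committed input.
    npc["pos"] += npc["rng"].choice((-1, 0, 1)) + player_move


def simulate(seed, player_inputs, npc_count=3, reverse=False):
    """Replay the committed player inputs and return the final NPC positions."""
    npcs = make_npcs(seed, npc_count)
    for move in player_inputs:
        # Update order is irrelevant because no state is shared between NPCs.
        order = reversed(npcs) if reverse else npcs
        for npc in order:
            step(npc, move)
    return [npc["pos"] for npc in npcs]
```

Two clients that call `simulate` with the same seed and the same inputs end up with the same NPC positions, so only the inputs need to cross the network. The hard part the comment alludes to starts when NPCs read each other's state, because then update order matters again.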
You could have the game work differently depending on whether it's multiplayer or single player but that adds to the complexity.
Intelligent NPC, now there's an oxymoron if I ever saw one.
The AMD one is certainly not marketed to gamers. The "game mode" on Threadripper cripples the processor which is somehow supposed to improve game performance. If all you are doing is playing games, you don't want Threadripper at all.
I am considering Threadripper for my next video / photography processing workstation. Though I don't know if any software can really make use of this many cores yet.
Only certain games, like DiRT Rally, require the Threadripper "game mode". In my quick search I didn't see any lists that show which games require game mode, don't require game mode, or can use all 16 (soon 32) cores. But maybe those lists are floating around somewhere.
Game developers know that Intel and AMD are putting out non-Xeon/Epyc CPUs with a lot more cores. The 16-core TR 1950X, 18-core Core i9-7980XE. Now shooting up to 28 or 32 cores, and maybe continuing to 48 or 64 in the near future. So newer game engines should be able to handle running on these, or even use all of the cores in some cases.
Well, they are going to have to make better use of multi-core and GPU-based systems and faster memory buses. The GHz wars have ended, and Moore's law has expanded into multi-core processors. The coming generations of chips are going to add performance with more cores, not more GHz.
I suspect Kdenlive could.
It's good for playing a video game while live streaming and having Google Chrome up with a dozen tabs. Most video games don't utilize more than four cores, but over the past few years more games have come out which utilize six or even eight, such as Overwatch. I don't know of any video game that utilizes more than eight cores, but I'm sure that's only a matter of time.
I wouldn't get it for gaming, I would guess that a $300-$500 CPU and, say, $1500 invested in the right GPU would outdo a $1000+ 32 core machine and $500 less spent on GPU.
I rip all of my Blu Rays and DVDs to disk and then reencode them to H.265 (since that's the most bandwidth/disk-space-efficient codec my streaming media devices can handle). So I could find a use for one of these machines for a few months. But as it is, right now I have an AMD FX-8320. It's a joke against the cutting edge, really, 4 cores, 8 threads, and the dedicated GPU is even older, from 2010. But I have an SSD and 32GB of RAM. I have my H.265 encoding running continuously in the background on 4 threads. The other 4 threads run Firefox, Chrome, Minecraft, a web server, and 3 VMs. I lock the screen and walk away, and my kids sit down to use Chrome and Minecraft. It never slows down.
Now granted, if I was into more modern games it wouldn't work.
What software are you using to re-encode/convert to a single video file? I haven't done it in a very long time and the landscape has completely changed.
Not the GP, but MakeMKV followed by HandBrake seems to be the choice lately.
I am on Linux, but everything I'm doing can be done on Windows. I use MakeMKV to rip films as-is and then ffmpeg to convert to H.265 video and AAC audio. AAC is the audio codec on DVDs and it's lower quality than the audio codecs on Blu Rays, but I can't hear any differences with my mediocre sound system. I've had problems with streaming video to PCs and Android television boxes using the Blu Ray audio codecs, that's why I do the conversion. I use .mkv files instead of .mp4 or similar because Blu Ray subtitles can't be stored in .mp4 files without some kind of conversion process.
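For anyone wanting a starting point, the ffmpeg step described above might look roughly like this. It is only a sketch: the poster does not give the actual command, so the CRF value and the file names are assumptions, and `ffmpeg_args` is a made-up helper that just builds the argument list (it does not run anything).

```python
def ffmpeg_args(source, target, crf=22):
    """Build an ffmpeg command line: H.265 video, AAC audio, subtitles copied.

    crf=22 is an assumed quality setting, not one taken from the post.
    """
    return [
        "ffmpeg",
        "-i", source,        # the .mkv produced by MakeMKV
        "-map", "0",         # keep every stream from the input
        "-c:v", "libx265",   # re-encode video to H.265
        "-crf", str(crf),
        "-c:a", "aac",       # re-encode audio to AAC
        "-c:s", "copy",      # pass subtitles through untouched
        target,
    ]
```

The list can be passed to `subprocess.run(ffmpeg_args("film.mkv", "film-h265.mkv"), check=True)` on a machine where ffmpeg is installed with libx265 support.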
This processor? Doubtful. I'm currently using the 8-core Ryzen and it's excessive. I play a co-op game called Vermintide 2 that starts a local server that other people connect to. When I host, we get wrecked so bad. There is enough spare CPU for the AI director to give you a really bad (or fun?) day.
The nice part is being able to leave all your normal programs open when you play games. You could permanently give 4 threads to your sooper seekret VM and you wouldn't notice any game slowdown at all. Most people seem to use these high core-count CPUs for streaming their play sessions to the world.
I'm not sure what the real bottleneck for performance on my current machine is, but I'm thinking it's got a whole lot more to do with developers who can't code a stable piece of software than with them legitimately maxing out my CPU, RAM or GPU. Could be that my 8GB RX480 is having issues, but I doubt it. I also got way more RAM than I will need for the foreseeable future. VR does demand a bit more CPU, RAM, and GPU power due to the nature of the beast. No one's really taking advantage of my 8-core CPU, though.
SSDs make loading anything feel more snappy. Comparing an HDD to an SSD is like comparing a VHS to a DVD. There's a vast improvement in performance when upgrading to an SSD. That would be the single best thing to upgrade if you want to make a serious impact on your PC performance.
Faster RAM is notable, but probably not very noticeable when you go from 2400 to 3200 MHz. You just won't see that much of a performance gain. Going from 4GB to 8GB of RAM, on the other hand, could make a much bigger impact if you don't have enough to keep the 5,000 tabs up in your browser of choice.
GPU upgrade from 128-bit to 256-bit, or 256-bit to 512-bit will typically be a major improvement over the previous GPU. There are other factors like how much RAM is on the card and speed of the card, but the higher bandwidth cards usually follow an upward trend regarding those things.
Also, regarding Multiplayer performance. Sometimes you just need a faster internet connection, though latency is a key issue. Other times it's down to the software developers that can't code.
Minor jump in CPU performance? Bottom of the list of "Maybe this will help my computer run faster".
We tried a Threadripper Gen1 for our compiles, but it turned out not to be faster than our Xeon, despite much more memory bandwidth. Had to go to an Intel in the end to actually go faster. Now that Zen+ claims double-digit percentage gains in cache latencies compared to Zen, hopefully it can offset the deficit against Intel on single-thread performance (sadly still a significant part of the compile).
Should be interesting to see what they do with the pricing. Maybe 50% more than the Threadripper 1950X ($1000) for the 32 core variant, and then when a 7nm Threadripper 3 comes out, drop it back down to $1000 for 32 cores.
We can't fully utilize 32 right now, so I'm all for them making the flagship go up 50% (for twice as many cores) if the 12- or 16-core chips, now "mid-range" drop by 25% in the process.
Intel was teasing 80+ core CPUs in 2006... I guess they're finally starting to deliver something approaching that old tease.
Indeed. You only need a dedicated circuit in your house to power it, and an air conditioner to cool it. This is far beyond practical. Intel won't be releasing this...ever. It's simply a desperate attempt to take attention away from AMD's recent success.
The legacy of that would be Xeon Phi [wikipedia.org] with 57-72 cores.
Just a test post
what a nerd...
It's taken longer than I had hoped, but there are now plenty of languages with the right features, and various frameworks that make it much easier to take advantage of using any number of cores to handle "embarrassingly parallel problems".
And there are plenty of embarrassingly parallel problems. Some problems can be transformed into parallel problems. Just look for long iterations over items where the processing of each item is independent from other items.
You can also re-think algorithms.
I was plotting millions, then tens of millions of data points. It was slow. I was doing the obvious but naive operation of drawing a dot for each data point. This meant a long loop, and invoking a graphics subsystem operation for every point. Even though the plot is being drawn off screen.
Then I observed a phenomenon. This is like having a square tiled wall (eg, the pixels) and throwing paint-filled balloons at the wall (eg, each plot point). After many plot points, older plot points are obscured by newer ones.
So let's re-think. Imagine each plot point is now a dart with an infinitely small tip, and the square tiles on the wall are now "pixel buckets". Each pixel bucket accumulates the average (of the original data, not the color) of the data points (the darts) that land in that tile. Now we're throwing darts at the wall instead of paint-filled balloons. (Each pixel bucket has a counter and an accumulated sum, thus an average.)
At the end, compute the color (along a gradient) for the accumulated average in each pixel. Now the number of graphical operations is to set one pixel for every pixel bucket. The number of graphical operations is tied to the number of pixels, and unrelated to the number of input data points. The entire result is:
1. faster
2. draws a much more finely detailed view (and don't say it is because the original dot size plotted was too big)
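A minimal single-threaded sketch of the "pixel bucket" idea, written in Python for brevity (the project described here is in Java, and this is not its code). It assumes each data point is an `(x, y, value)` tuple and that the gradient is a plain gray ramp; `bucket_plot`, `bucket_averages`, and `gray_level` are made-up names.

```python
def bucket_plot(points, width, height, x_range, y_range):
    """Accumulate (x, y, value) points into per-pixel sums and counters."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    sums = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for x, y, value in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # this dart missed the wall
        # Map the coordinate to a tile; clamp so x == x_max lands in the last one.
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        sums[row][col] += value
        counts[row][col] += 1
    return sums, counts


def bucket_averages(sums, counts):
    """One average per pixel bucket; None where no point landed."""
    return [[s / c if c else None for s, c in zip(sum_row, count_row)]
            for sum_row, count_row in zip(sums, counts)]


def gray_level(average, lo, hi):
    """Map an average onto a 0..255 gray ramp (an assumed, very simple gradient)."""
    if hi == lo:
        return 0
    return int(255 * (average - lo) / (hi - lo))
```

The loop over points does only arithmetic and array writes. The graphics subsystem is touched once per bucket afterwards, when each average is turned into a color and the pixel is set.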
Now I can (and did) take this further and make it parallel. Divide up the original data points into groups of "work units". When each CPU core is free, it consumes the next "work unit" in the queue. It creates a 2D array of "pixel bucket" averages, iterates over the subset of data points in that work unit, and averages each point into whichever pixel it would land in.
At the end, pairs of these arrays of pixel values are smashed together. (Simply add the counters and sums together in corresponding pixel buckets.) Then on the final array, once again, determine colors and plot.
The result is identical, but now much faster. Not n-cores * original faster, but close. There is overhead. But using 8 cores is way worth it.
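The work-unit version can be sketched the same way, again in Python rather than the Java of the actual project. Assumptions: points are `(x, y, value)` tuples already scaled to the range [0, 1), buckets are stored as flat row-major lists, and the executor is passed in. Under CPython a thread pool will not speed up pure-Python arithmetic, so a `ProcessPoolExecutor` would be the one to pass if the goal is to keep several cores busy. The function names are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def accumulate(unit, width, height):
    """Build one work unit's private bucket arrays (flat, row-major)."""
    sums = [0.0] * (width * height)
    counts = [0] * (width * height)
    for x, y, value in unit:
        idx = int(y * height) * width + int(x * width)
        sums[idx] += value
        counts[idx] += 1
    return sums, counts


def merge(a, b):
    """Smash two partial results together: add sums and counters per bucket."""
    return ([s1 + s2 for s1, s2 in zip(a[0], b[0])],
            [c1 + c2 for c1, c2 in zip(a[1], b[1])])


def render(points, width, height, unit_size, executor=None):
    """Split points into work units, accumulate each one, then merge."""
    if executor is None:
        with ThreadPoolExecutor() as pool:
            return render(points, width, height, unit_size, pool)
    units = [points[i:i + unit_size] for i in range(0, len(points), unit_size)]
    worker = partial(accumulate, width=width, height=height)
    total = ([0.0] * (width * height), [0] * (width * height))
    for part in executor.map(worker, units):
        total = merge(total, part)
    return total
```

Because every work unit writes only to its own arrays, no locking is needed until the merge. The merge costs one pass over the buckets per unit, which is part of the overhead mentioned above and the reason each unit should hold many points. With floating-point sums the merged result can differ from the serial one in the last bits, since the additions happen in a different order.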
My point, if you think about it, you can find opportunities to use multiple cores. Just put your mind to it. Remember there is overhead. So each "work unit" must be far more than worth the overhead to organize and process it under this model.
Why not use OpenCL to run it on a GPU?
So many things to do, so little time. I'm sure you know the story.
The project is written in Java. There are two projects (that I know of) that support OpenCL in Java. I have looked into it. It is a higher bar to jump over. I might try it with a small project first. It's a matter of time and energy to do it. I'm interested in trying it.
I have to write a C kernel (there are examples) and have that code available as a "string". (Eg, baked into the code, retrieved from a configuration file, database, etc.) I have to think about the problem very differently to organize it for OpenCL. It is a very different programming model from conventional CPUs. Basically, OpenCL is parallelism at a far finer grain than the "work units" I described. The work done by my "work units", and thus the code, could be arbitrarily complex, as long as work units are all independent of one another. The very same code works on a single CPU core, or on multiple cores if you have them. With OpenCL, I need to have two sets of code to maintain: the OpenCL version, and at least a single-core version for when OpenCL is not available on a given runtime. (Remember, my Java program, the binary, runs on any machine, even ones not invented yet.)
Thus, there is a philosophical issue. What I would rather see is More Cores Please. Conventional cores. Conventional architecture programming. It seems that if you had several hundred cores that were more general purpose rather than specialized for graphics, this would STILL benefit graphics. But in a much more general way.
Let me give an example of a problem that would require serious thinking for OpenCL. A Mandelbrot set explorer. My current Mandelbrot set explorer (in Java) uses arbitrary precision. Thus it does not "peter out" once you dive deep enough to exhaust the precision of a double (eg, 64 bit float). By allowing arbitrary precision math, you can dive deeper and deeper. A Mandelbrot explorer is another embarrassingly parallel problem. "Work units" could even be distributed out to other computers on a network. You just need to launch a JAR file on each node. (And those nodes don't even have to be the same CPU architecture or OS.) In a single kernel in OpenCL, I would need to be able to iterate X number of times on a pixel, using arbitrary precision math, within the bounds of how kernels work. Multiple parameters, each parameter being a buffer (an array) where different concurrent kernels operate on different elements in the parameter buffers.
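To show what "arbitrary precision" means for the per-pixel work, here is a minimal Python sketch using the standard `decimal` module as a stand-in for whatever arbitrary-precision type the Java explorer uses. That substitution is an assumption, and `escape_time` and `render_row` are made-up names. Coordinates are passed as strings so that no precision is lost on the way in.

```python
from decimal import Decimal, localcontext


def escape_time(cx, cy, max_iter, digits):
    """Iterations before |z| exceeds 2 for c = cx + i*cy (max_iter if never)."""
    with localcontext() as ctx:
        ctx.prec = digits  # significant digits; raise it to zoom deeper
        cx, cy = Decimal(cx), Decimal(cy)
        zx = zy = Decimal(0)
        for n in range(max_iter):
            zx2, zy2 = zx * zx, zy * zy
            if zx2 + zy2 > 4:  # |z|^2 > 4 means the point has escaped
                return n
            # z = z^2 + c, written out in real and imaginary parts
            zx, zy = zx2 - zy2 + cx, 2 * zx * zy + cy
        return max_iter


def render_row(cy, xs, max_iter, digits):
    """One independent work unit: a single row of pixels."""
    return [escape_time(cx, cy, max_iter, digits) for cx in xs]
```

Each call to `render_row` depends on nothing but its arguments, which is what makes rows (or tiles) usable as work units on other cores or other machines. The variable-length arithmetic inside the loop is also the part that does not map easily onto fixed-size OpenCL buffers.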
It seems that with all the silicon we have now, maybe it's time to start building larger numbers of general purpose cores. This would much more rapidly produce benefits in far more everyday applications than OpenCL. IMO.
How many dies fit into a 30cm platter? What's yield rate for 15-core dies?
I'd rather have a single core that was 28 times faster.
IBM Demonstrates 100GHz Graphene-Based Transistors [popsci.com]
Diamond on silicon chips are running at 100 Gigahertz and can also make power chips for directing 10,000 volts [nextbigfuture.com]
Intel TeraHertz [wikipedia.org]
Smaller and faster: The terahertz computer chip is now within reach [sciencedaily.com]
Not if it makes my keyboard auto repeat 28x faster, the way it seems to be programmed if I'm in a low runlevel.