A $1,499 supercomputer on a card? That's what I thought when reading El Reg's report on AMD's Radeon R9 295X2 graphics card, which is rated at 11.5 TFlop/s(*). It is water-cooled, contains 5,632 stream processors, has 8 GB of GDDR5 RAM, and runs at 1018 MHz.
AMD's announcement claims it's "the world's fastest, period".
The $1,499 MSRP compares favorably to the $2,999 Nvidia GTX Titan Z, which is rated at 8 TFlop/s.
From a quick skim of the reviews (at HardOCP, Hot Hardware, and Tom's Hardware), it appears AMD has some work to do on its drivers to get the most out of this hardware. The twice-as-expensive Nvidia Titan outperformed it in many cases (especially at lower resolutions). At higher resolutions (3840x2160 and 5760x1200), the R9 295X2 really started to shine.
For comparison, consider that this 500-watt, $1,499 card is rated faster than the world's fastest supercomputer on the TOP500 list of June 2001.
(*) Trillion FLoating-point OPerations per Second.
(Score: 2, Insightful) by No.Limit on Thursday April 10 2014, @10:38PM
I think for GPU computing you have the choice between OpenCL, Nvidia's CUDA, and OpenACC (there may be more that I don't know of). OpenMP is still CPU-only as far as I know, though I believe I've read that OpenMP intends to support GPUs at some point too.
I don't know much about OpenCL, but it's an open standard and supported on many platforms. I believe it's quite similar to CUDA.
Nvidia's CUDA works only on Nvidia GPUs (so certainly not on this AMD one). It has lots of good tools (profilers, debuggers, etc.), documentation, examples, and video tutorials. It works very well and gives you a lot of control over the GPU.
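To give a taste of what that looks like, here's a minimal CUDA vector-add sketch (my own illustration, not from any official example; the kernel name and sizes are arbitrary). It's mostly plain C plus a __global__ qualifier, built-in thread coordinates, and the <<<...>>> launch syntax:

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    /* Each thread adds one element; its index comes from the built-in
       block/thread coordinates. */
    __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main(void) {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);

        float *ha = (float *)malloc(bytes);
        float *hb = (float *)malloc(bytes);
        float *hc = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

        float *da, *db, *dc;
        cudaMalloc((void **)&da, bytes);
        cudaMalloc((void **)&db, bytes);
        cudaMalloc((void **)&dc, bytes);
        cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

        /* Launch enough 256-thread blocks to cover all n elements. */
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        vecAdd<<<blocks, threads>>>(da, db, dc, n);

        cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
        printf("c[0] = %f\n", hc[0]); /* expect 3.0 */

        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(ha); free(hb); free(hc);
        return 0;
    }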
OpenACC is a younger standard for GPU coding. It sits at a much higher level than both OpenCL and CUDA, but you still get a good amount of control by specifying additional information for the compiler. Some proprietary compilers support it (e.g. from Cray or PGI). GCC wants to support OpenACC as well, but I don't think they're very far along at the moment.
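To show how much higher-level OpenACC is, here's roughly the same vector add as a directive-based sketch (assuming an OpenACC-capable compiler such as PGI's; the loop stays plain C and the compiler generates the GPU code). The copyin/copyout clauses are the "additional information" mentioned above:

    #include <stdio.h>

    int main(void) {
        enum { N = 1 << 20 };
        static float a[N], b[N], c[N];
        for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        /* One directive: the compiler handles the data movement and the
           kernel launch. copyin/copyout tell it exactly what to transfer. */
        #pragma acc parallel loop copyin(a[0:N], b[0:N]) copyout(c[0:N])
        for (int i = 0; i < N; ++i)
            c[i] = a[i] + b[i];

        printf("c[0] = %f\n", c[0]); /* expect 3.0 */
        return 0;
    }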
Now for SIMD instructions, pipelining, and cache structures: GPUs are fundamentally different from CPUs.
A GPU core is much, much simpler than a CPU core. To improve sequential execution, CPUs have added a lot of complexity (branch prediction, caching, out-of-order execution, pipelining, etc.).
GPUs, however, have focused mostly on parallel performance for a long time. Instead of that complexity, they kept the cores simple, added more of them, and made sure that adding more cores scales well.
So because GPUs are already so well optimized for parallel computing, you don't have to do a lot yourself when it comes to the details. You may not even be able to code in assembly, only in C.
You mainly want to make sure that the overall structure is optimized well.
So that means using caches efficiently (the usual struct-of-arrays instead of array-of-structs, cache-friendly access patterns, etc.). In CUDA, the threads are organized into blocks that share a faster memory (like a cache) over which you have explicit control, meaning you can load data into it manually.
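As a concrete sketch of that manual control (my own hypothetical kernel, not from the post), here's a per-block sum reduction in CUDA: each block first stages its slice of slow global memory into the fast __shared__ scratchpad, then reduces entirely within it (host setup as in the earlier example):

    /* Sum-reduce 'in' into one partial sum per block. Assumes blockDim.x
       is a power of two (e.g. 256). */
    __global__ void blockSum(const float *in, float *partial, int n) {
        extern __shared__ float tile[];   /* per-block scratchpad */
        int tid = threadIdx.x;
        int i = blockIdx.x * blockDim.x + tid;

        /* Manual load from slow global memory into the fast shared tile. */
        tile[tid] = (i < n) ? in[i] : 0.0f;
        __syncthreads();                  /* wait until the tile is full */

        /* Tree reduction entirely inside shared memory. */
        for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
            if (tid < stride)
                tile[tid] += tile[tid + stride];
            __syncthreads();
        }
        if (tid == 0) partial[blockIdx.x] = tile[0];
    }

    /* Launch, sizing the dynamic shared memory to one float per thread:
       blockSum<<<blocks, 256, 256 * sizeof(float)>>>(d_in, d_partial, n); */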
You also want to make sure that you divide the work well over the blocks and cores. And if you have to transfer a lot of data to or from GPU memory (over the slow PCIe bus), you want to make sure the transfers don't block computation (you can transfer data and compute at the same time).
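Here's a hedged sketch of that overlap using two CUDA streams (the kernel and chunk sizes are placeholders of my own): while one stream copies a chunk across PCIe, the other can still be computing on the previous chunk. Note that async copies require pinned host memory, hence cudaHostAlloc:

    #include <stdio.h>
    #include <cuda_runtime.h>

    /* Stand-in for real work: double every element in place. */
    __global__ void scale(float *d, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) d[i] *= 2.0f;
    }

    int main(void) {
        const int chunk = 1 << 20, chunks = 8;
        float *h;
        /* Pinned (page-locked) host memory, required for async copies. */
        cudaHostAlloc((void **)&h, (size_t)chunks * chunk * sizeof(float),
                      cudaHostAllocDefault);
        for (int i = 0; i < chunks * chunk; ++i) h[i] = 1.0f;

        float *d[2];
        cudaMalloc((void **)&d[0], chunk * sizeof(float));
        cudaMalloc((void **)&d[1], chunk * sizeof(float));

        cudaStream_t s[2];
        cudaStreamCreate(&s[0]);
        cudaStreamCreate(&s[1]);

        int threads = 256, blocks = (chunk + threads - 1) / threads;
        for (int c = 0; c < chunks; ++c) {
            int b = c % 2; /* ping-pong between streams and buffers */
            /* While this stream copies chunk c in, the other stream can
               still be computing on chunk c-1. */
            cudaMemcpyAsync(d[b], h + (size_t)c * chunk,
                            chunk * sizeof(float),
                            cudaMemcpyHostToDevice, s[b]);
            scale<<<blocks, threads, 0, s[b]>>>(d[b], chunk);
            cudaMemcpyAsync(h + (size_t)c * chunk, d[b],
                            chunk * sizeof(float),
                            cudaMemcpyDeviceToHost, s[b]);
        }
        cudaDeviceSynchronize(); /* wait for both streams to drain */

        printf("h[0] = %f\n", h[0]); /* expect 2.0 */
        cudaFreeHost(h);
        cudaFree(d[0]); cudaFree(d[1]);
        cudaStreamDestroy(s[0]); cudaStreamDestroy(s[1]);
        return 0;
    }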