SoylentNews is people

posted by cmn32480 on Thursday May 18, @06:22PM   Printer-friendly
from the where-no-GPU-has-gone-before dept.

AMD has announced the Radeon Vega Frontier Edition, a high-end GPU based on a new architecture (Vega 1) which will launch in June.

Unlike some other recent AMD GPUs such as the Radeon Fury X, the Radeon Vega card has half precision compute capability (FP16 operations) that is twice as fast as single precision compute. AMD is advertising 13 TFLOPS single precision and 26 TFLOPS half precision for the Radeon Vega Frontier Edition.

The GPU includes 16 GB of High Bandwidth Memory 2.0 VRAM. The per-pin memory clock is up to around 1.88 Gbps, but total memory bandwidth is slightly lower than the Radeon Fury X, due to the memory bus being cut to 2048-bit from 4096-bit. However, the Fury X included only 4 GB of HBM1. The new card could include four stacks with 4 GB each, or it could be the first product to include 8 GB stacks of High Bandwidth Memory, a capacity which has not been sold by Samsung or SK Hynix to date.
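
As a rough check of the bandwidth comparison, peak memory bandwidth is the per-pin data rate times the bus width, divided by 8 bits per byte. The sketch below uses the figures from the paragraph above for the Vega card; the 1.0 Gbps per-pin rate for the Fury X's HBM1 is an assumption not stated in this summary, and the function name is purely illustrative.

```python
def peak_bandwidth_gb_per_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) * number of pins / 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8


# Figures quoted in the summary: ~1.88 Gbps per pin on a 2048-bit bus.
vega_fe = peak_bandwidth_gb_per_s(1.88, 2048)

# Assumed figure for the Fury X: 1.0 Gbps per pin on a 4096-bit bus.
fury_x = peak_bandwidth_gb_per_s(1.0, 4096)

print(f"Vega Frontier Edition: {vega_fe:.0f} GB/s")
print(f"Fury X (assumed pin rate): {fury_x:.0f} GB/s")
```

Under those assumptions the faster pins nearly, but not quite, make up for the halved bus width.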

The new GPU is aimed at professional/workstation users rather than gamers:

As important as the Vega hardware itself is, for AMD the target market for the hardware is equally important if not more. Vega's the first new high-end GPU from the company in two years, and it comes at a time when GPU sales are booming. Advances in machine learning have made GPUs the hottest computational peripheral since the x87 floating point co-processor, and unfortunately for AMD, they've largely missed the boat on this. Competitor NVIDIA has vastly grown their datacenter business over just the last year on the back of machine learning, thanks in large part to the task-optimized capabilities of the Pascal architecture. And most importantly of all, these machine learning accelerators have been highly profitable, fetching high margins even when the cards are readily available.

For AMD then, Vega is their chance to finally break into the machine learning market in a big way. The GPU isn't just a high-end competitor, but it offers high performance FP16 and INT8 modes that earlier AMD GPU architectures lacked, and those modes are in turn immensely beneficial to machine learning performance. As a result, for the Vega Frontier Edition launch, AMD is taking a page from the NVIDIA playbook: rather than starting off the Vega generation with consumer cards, they're going to launch with professional cards for the workstation market.

To be sure, the Radeon Vega Frontier Edition is not officially branded as a Pro or WX series card. But in terms of AMD's target market, it's unambiguously a professional card. The product page is hosted on the pro graphics section of AMD's website, the marketing material is all about professional uses, and AMD even goes so far as to tell gamers to hold off for cheaper gaming cards later on in their official blog post. Consequently the Vega FE is about the closest analogue AMD has to NVIDIA's Titan series cards, which although are gaming capable, in the last generation they have become almost exclusively professional focused.



The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by DannyB (5839) on Thursday May 18, @09:06PM (#511833)

    It seems that every year the silicon gets more and more horsepower*. Clock rates don't seem to be climbing. But we get more done per clock cycle, and with more functional units, cores and "pseudo cores". For ten years I have believed that core count would rise more rapidly than it has. Especially since languages and programming techniques have begun to adapt to the concept of using more processing elements. Isolate which work units can be done on a single processing element, independent of other processing elements. Does it work on 8 cores? What about on 1000 cores?

    Then there are GPUs. Impressive beasts. But I do not use GPUs for graphics, and have never been a gamer**. It takes effort to exploit the hardware in a GPU. I've recently begun dabbling with OpenCL, and writing kernels in C. I've long wondered about the possibility of having more but simpler cores. What if I could have, say, 64 cores that were about the complexity of a simple variation of the ARM instruction set, for example, but programmed in a much more conventional way than a GPU? This would seem to be vastly more useful than a GPU for non-graphics parallel execution. They wouldn't necessarily even need to have shared memory as long as there was reasonably fast communication. I would still have to organize my problem into independent units. But it wouldn't be single instruction multiple data. I could hand off work units to processing elements. Work units may turn out to complete in different amounts of time. But with a queue, work units (e.g., the input data to the same code) could be fed to the next available processing element. Suppose each work item were processing pixels: if you pick a suitably sized block of pixels for each work unit, you could keep each processing element busy for a couple of seconds.
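
The queue idea described above can be sketched roughly as follows. This is only an illustration of the scheduling pattern, not anything from the original post: the names `make_tiles`, `process_tile` and `run_workers` are hypothetical, the per-tile work is a placeholder, and Python threads stand in for the independent processing elements (they will not give real parallel speedup for CPU-bound work).

```python
import queue
import threading


def make_tiles(width, height, tile):
    """Split a width x height image into (x0, y0, x1, y1) blocks of at most tile x tile pixels."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


def process_tile(t):
    """Placeholder work unit: just count the pixels in the block."""
    x0, y0, x1, y1 = t
    return (x1 - x0) * (y1 - y0)


def run_workers(tiles, n_workers=4):
    """Feed tiles from a shared queue to whichever worker becomes free next."""
    work = queue.Queue()
    for t in tiles:
        work.put(t)  # the queue is fully loaded before any worker starts

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                t = work.get_nowait()
            except queue.Empty:
                return  # no work left, so this worker exits
            r = process_tile(t)
            with lock:
                results[t] = r

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results


tiles = make_tiles(100, 60, 32)
results = run_workers(tiles, n_workers=3)
print(len(results), "tiles,", sum(results.values()), "pixels")
```

Because each worker pulls its next tile only when it finishes the previous one, tiles that take longer do not hold up the others, which is the load-balancing property the comment is after.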

    Just to dream: an interactive Mandelbrot program that can deep dive using multi-precision arithmetic, because a double is just not enough. And I know how to do it with only integers. Doing multi-precision (e.g., multi-word) arithmetic in an OpenCL kernel will be a trick, at least at my present naive level of OpenCL experience.
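
One way to do the integer-only iteration is fixed-point arithmetic, sketched below. This is an assumption about the approach, since the comment does not say which method is meant, and the names `to_fixed` and `mandel_iterations` are hypothetical. Python's built-in integers are arbitrary precision, so this hides exactly the multi-word carry handling that would have to be written by hand in an OpenCL kernel.

```python
def to_fixed(numer: int, denom: int, frac_bits: int) -> int:
    """Represent numer/denom as an integer scaled by 2**frac_bits (rounded down)."""
    return (numer << frac_bits) // denom


def mandel_iterations(cx: int, cy: int, frac_bits: int, max_iter: int) -> int:
    """Escape-time count for c = (cx + i*cy) / 2**frac_bits, using only integer operations."""
    four = 4 << frac_bits  # escape threshold |z|^2 > 4, in fixed point
    zx = zy = 0
    for i in range(max_iter):
        # A product of two fixed-point values carries 2*frac_bits fractional
        # bits, so shift right to get back to frac_bits.
        zx2 = (zx * zx) >> frac_bits
        zy2 = (zy * zy) >> frac_bits
        if zx2 + zy2 > four:
            return i
        # z = z*z + c; update zy first because it needs the old zx.
        zy = ((2 * zx * zy) >> frac_bits) + cy
        zx = zx2 - zy2 + cx
    return max_iter


FRAC_BITS = 128  # far more fractional bits than a double's 52-bit mantissa
inside = mandel_iterations(to_fixed(-1, 1, FRAC_BITS), 0, FRAC_BITS, 100)
outside = mandel_iterations(to_fixed(1, 1, FRAC_BITS), 0, FRAC_BITS, 100)
print(inside, outside)
```

Raising `FRAC_BITS` buys deeper zoom at the cost of larger multiplications, which is the trade-off a multi-word implementation would face as well.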

    If there were enough conventional general purpose cores, then would the need for GPUs, as they are today, eventually go away? In other words, would hardware dedicated to a special purpose be needed if there were enough parallel general purpose silicon with many cores? How much easier would graphics programming be? How much easier would it be to program things other than graphics? How many more real world problems might be solved using such hardware if it were ever widely available? (Of course, once upon a time it was a dream to ask what if microprocessors could run at 100 MHz.)

    * cpu horsepower = the amount of thinking that can be done by one horse in one day

    ** A long time ago in a slashdot far, far away I once had a sig: "gamers are the root of all evil". Because gamers used games as an excuse to run Windows.

    • (Score: 2) by Snotnose (1623) on Thursday May 18, @11:19PM (#511889)

      For ten years I have believed that core count would rise more rapidly than it has.

      Parallel processing is *hard*. In the mid-80s my company slightly modified our existing product so you could slide more CPUs into it (think S-100 bus tech). If memory serves we could get 8 CPU cards in the unit, or fewer than 8 plus dedicated cards (the system had 8 slots and a bunch of cards you could plug into it).

      We actually had some version of Unix running the whole shebang, but the whole balancing act of keeping all CPUs fully occupied without swamping the bus was a bitch. Ian, the guy in charge, was a PhD candidate at UCSD at the time, hella smart guy. Ended up moving to the bay area and getting richer than rich.

      The product did fairly well, not spectacular.

      / MIMD - multiple instruction multiple data - a real bitch. All CPUs running different code on different data
      // SIMD - Single instruction multiple data - think graphics card
      /// MISD - Multiple instruction Single data - What we did, missile telemetry stream interpreted multiple ways
      //// MIMD - Multiple instruction, multiple data. We did not do this

  • (Score: 2) by opinionated_science (4031) on Friday May 19, @11:52AM (#512122)

    I think it may be 13TF single, 26 half-precision and 7 or so double...
