SoylentNews is people

posted by hubie on Friday July 05, @02:02PM   Printer-friendly

Arthur T Knackerbracket has processed the following story:

The Austin-based semiconductor company InspireSemi announced that it has taped out its first Thunderbird "supercomputer-on-a-chip," comprising 1,536 64-bit superscalar RISC-V CPU cores. Four chips can be installed on a single accelerator card, in a form factor similar to a GPU. This configuration brings the total number of cores per card to 6,144, with the potential to scale to many processors in a single cluster connected by a high-speed serial interconnect.

[...] Thunderbird uses standard CPU programming models and compilers, without tying workloads to proprietary platforms like Nvidia's CUDA or AMD's ROCm. This means existing HPC workloads that run on CPUs should need little to no custom code to run on Thunderbird. The product also fits into existing server infrastructure as a PCIe add-in card, allowing InspireSemi to reach customers who lack the funds to build out new infrastructure and facilities.

According to InspireSemi, the processor's open-source design and vendor-agnostic software let it target many industries: "Thunderbird accelerates many critical applications in important industries that other approaches do not, including life sciences, genomics, medical devices, climate change research, and applications that require deep simulation and modeling," said the company's founder and CTO, Andy Gray.

[...] The speed at which companies are utilizing open-source solutions is remarkable. The Unified Acceleration Foundation's (UXL) mission is to develop universal standards for vendor-agnostic hardware and software, with Intel being one of the main contributors through its oneAPI framework.

If open-source initiatives for building a more open platform continue to gain momentum, then companies like InspireSemi may have a bright future.


  • (Score: 5, Informative) by PiMuNu on Friday July 05, @02:15PM

    by PiMuNu (3823) on Friday July 05, @02:15PM (#1363178)

    One of my collaborations uses Kokkos, a platform-agnostic parallelisation layer that sits on top of MPI, CUDA, or plain serial mode.

  • (Score: 1, Interesting) by Runaway1956 on Friday July 05, @02:33PM (4 children)

    by Runaway1956 (2926) Subscriber Badge on Friday July 05, @02:33PM (#1363179) Journal

    I'm not going to go overboard trying to detract from this achievement. But I have a GPU doing Folding at Home with 2,560 cores. That was a high mid-range GPU when I bought it; today's high-end cards will put it to shame. What's more, you can connect multiple GPUs together, just as you can connect multiple CPUs on a single motherboard, or card in this case.

    I'll repeat, this chip is impressive, but I don't see them displacing GPUs real soon. I'm sure that some use cases will favor the RISC-V, and other use cases will favor Nvidia's GPUs.

    If RISC-V can actually eat Nvidia's lunch, fine. Demonstrate it. What looks like an overhyped marketing statement isn't going to cut it, all on its own. Although, I would be happy if InspireSemi would send me a chip or four to play around with. ;^)

    We've finally beat Medicare! - Houseplant in Chief
    • (Score: 4, Informative) by DrkShadow on Friday July 05, @03:15PM (2 children)

      by DrkShadow (1404) on Friday July 05, @03:15PM (#1363182)

      These cores aren't doing vector processing.

      • (Score: 2) by RS3 on Friday July 05, @11:59PM

        by RS3 (6367) on Friday July 05, @11:59PM (#1363231)

        Not through an inherent API, but you could fairly easily write an API that would use the many cores to do just that: parallel simple math, or whatever functions you wish. I could even argue that you're not constrained by the vector processor's API- again, write your own code to be copied to 1024 cores, or whatever.

        One of the reasons Intel has done well is they've given away really good (arguable I'm sure) tools, example code, etc. InspireSemi would do well to provide libraries and example code to use the cores to do vector operations. Might even be faster than a vector processor- all depends on RAM / cache access speeds.
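        RS3's idea — spreading element-wise "vector" math by hand across many ordinary scalar cores — can be sketched with stock Python multiprocessing. This is purely illustrative: nothing here reflects InspireSemi's actual toolchain, and all names below are made up.

```python
# Illustrative sketch: emulate a vector add by chunking the operands
# across a pool of ordinary scalar workers, the way one might spread
# work across many independent RISC-V cores.
from multiprocessing import Pool

def add_chunk(pair):
    """Scalar work done on one 'core': element-wise add of one chunk."""
    a, b = pair
    return [x + y for x, y in zip(a, b)]

def chunked(seq, n):
    """Split seq into n nearly equal contiguous chunks."""
    k, m = divmod(len(seq), n)
    out, i = [], 0
    for j in range(n):
        step = k + (1 if j < m else 0)
        out.append(seq[i:i + step])
        i += step
    return out

def parallel_vector_add(a, b, workers=4):
    """Fan the chunks out to workers, then splice the results back."""
    with Pool(workers) as pool:
        parts = pool.map(add_chunk,
                         zip(chunked(a, workers), chunked(b, workers)))
    return [x for part in parts for x in part]
```

        As RS3 notes, whether this beats a real vector unit comes down to memory: contiguous chunks keep each worker's accesses cache-friendly, but the fan-out/fan-in overhead is pure loss on small arrays.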

      • (Score: 3, Informative) by pe1rxq on Saturday July 06, @10:12AM

        by pe1rxq (844) on Saturday July 06, @10:12AM (#1363275) Homepage

        Actually it has vector, SIMD, tensor, mixed-precision FP, AI, and crypto extensions, according to this IEEE presentation.
        Although I could not find any source giving exact specs of the cores yet...

    • (Score: 5, Informative) by DannyB on Friday July 05, @03:54PM

      by DannyB (5839) Subscriber Badge on Friday July 05, @03:54PM (#1363184) Journal

      The HUGE advance this new chip represents is for us software developers. Standard programming languages and techniques. No need to try to adapt our problems to run within the constraints of how GPUs work. These are just normal everyday RISC V cores, just lots of them in parallel. The only problem, which is ever present, is how to coordinate all of the work across many processors. This works best for the classes of problems called "embarrassingly parallel" problems. As an example, if you are generating an image of the Mandelbrot set, the calculations of each pixel in the image are completely independent of the neighboring pixels. THAT is embarrassingly parallel.
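      DannyB's Mandelbrot example is easy to make concrete: each scanline below is computed with no knowledge of its neighbours, so a pool of workers can take rows in any order. A minimal sketch in plain Python; the resolution and iteration cap are arbitrary choices, not anything from the article.

```python
# Embarrassingly parallel Mandelbrot: every pixel (and so every row)
# is independent, so rows can be farmed out to workers with no
# coordination beyond collecting the results.
from multiprocessing import Pool

WIDTH, HEIGHT, MAX_ITER = 64, 48, 50

def escape_time(c):
    """Iterations before |z| exceeds 2, capped at MAX_ITER."""
    z = 0j
    for n in range(MAX_ITER):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return MAX_ITER

def render_row(y):
    """One unit of parallel work: a single scanline of the image."""
    im = 1.2 - 2.4 * y / (HEIGHT - 1)
    return [escape_time(complex(-2.0 + 3.0 * x / (WIDTH - 1), im))
            for x in range(WIDTH)]

if __name__ == "__main__":
    with Pool() as pool:          # one worker per available CPU core
        image = pool.map(render_row, range(HEIGHT))
    print(len(image), len(image[0]))
```

      The coordination problem DannyB mentions disappears here because no row ever reads another row's result; that is exactly what "embarrassingly parallel" buys you.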

      I am reminded of a Steve Jobs quote from the daze of my youth many decades ago: the software tail that wags the hardware dog.

      Trump is a poor man's idea of a rich man, a weak man's idea of a strong man, and a stupid man's idea of a smart man.
  • (Score: 3, Insightful) by martyb on Friday July 05, @04:01PM

    by martyb (76) Subscriber Badge on Friday July 05, @04:01PM (#1363185) Journal

    My first computer had 4,096 bytes of 8-bit memory. (I think; maybe it had 8 KB.)

    This has: "1,536 64-bit superscalar RISC-V CPU cores".

    Let's see... 1,536 x 64 bits. That is: 1,536*64/8 = 12,288 compute bytes.

    This has more compute bytes than my computer had bytes of memory!

    Wit is intellect, dancing.
  • (Score: 3, Interesting) by hendrikboom on Friday July 05, @06:01PM

    by hendrikboom (1125) Subscriber Badge on Friday July 05, @06:01PM (#1363197) Homepage Journal

    1,536 cores.
    I wonder how they access memory.
    Does each core have some local memory of its own?
    Or do they access memory elsewhere?
    How do all those cores access memory? Does the memory have enough bandwidth? How is mutual exclusion handled?

  • (Score: 2) by VLM on Friday July 05, @08:27PM (1 child)

    by VLM (445) on Friday July 05, @08:27PM (#1363207)

    total number of cores per card to 6,144

    Just for the LOLs: if you could get VMware for RISC-V, which you thankfully cannot AFAIK, VMware would currently charge "about" $430,080/yr for that card; we can just round that to half a mil/yr.

    Just an example of how "the system" is operating far behind technology.
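    For what it's worth, VLM's total is consistent with a per-core subscription rate of about $70/core/year — that rate is inferred from the quoted number, not an official VMware price:

```python
# Back-of-the-envelope check of the quoted licensing figure.
# PRICE_PER_CORE_YEAR is an assumption inferred from the total.
CORES_PER_CHIP = 1536
CHIPS_PER_CARD = 4
PRICE_PER_CORE_YEAR = 70  # USD, assumed

cores_per_card = CORES_PER_CHIP * CHIPS_PER_CARD      # 6,144 cores
annual_cost = cores_per_card * PRICE_PER_CORE_YEAR    # $430,080/yr
print(cores_per_card, annual_cost)
```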

    • (Score: 4, Interesting) by bzipitidoo on Friday July 05, @09:30PM

      by bzipitidoo (4388) on Friday July 05, @09:30PM (#1363211) Journal

      The only reason VMware even exists is that the x86 architecture stinks at virtualization and needs some software assistance to pull it off. x86 has improved on that front, but it still isn't fully virtualizable. In particular, it's the GPU, of all things, that is now the main barrier. I am sure the designers of RISC-V took virtualization into account, and that the architecture doesn't need any software assistance.