SoylentNews is people

posted by janrinok on Saturday November 07 2015, @03:12AM
from the 8=4 dept.

In 2011 AMD released the Bulldozer architecture, with a somewhat unconventional implementation of "multicore" technology. Now, four years later, the company is being sued for false advertising, fraud and other "criminal activities". From TFA:

In claiming that its new Bulldozer CPU had "8-cores," which means it can perform eight calculations simultaneously, AMD allegedly tricked consumers into buying its Bulldozer processors by overstating the number of cores contained in the chips. Dickey alleges the Bulldozer chips functionally have only four cores—not eight, as advertised.


  • (Score: 3, Interesting) by edIII on Saturday November 07 2015, @03:59AM

    by edIII (791) on Saturday November 07 2015, @03:59AM (#259810)

    Wow. Is that an interesting idea or what? Plenty of stuff is licensed that way. If he can prove it only has 4 cores in a court of law, then anybody with licenses just got double what they needed, or a slam dunk lawsuit against AMD for the difference.

    It seems logical to me that you would only pay for each literal core, not the extra virtual core from hyperthreading (or similar). If AMD promised a literal 8 cores, but then did some funny 'unusual implementation' where there weren't literally 8 processing cores then this gentleman has them by the short and curlies so to speak.

    The suit alleges AMD built the Bulldozer processors by stripping away components from two cores and combining what was left to make a single “module.” In doing so, however, the cores no longer work independently

    Ironic that in an argument that is sure to be about technicalities, we have nothing technical to argue over yet. I'd love to know what they mean specifically. From Wikipedia [wikipedia.org] it doesn't look like 8 cores to me.

    AMD has re-introduced the "Clustered Integer Core" micro-architecture, an architecture developed by DEC in 1996 with the RISC microprocessor Alpha 21264. This technology is informally called CMT (Clustered Multi-Thread) and formally called "module" by AMD. In terms of hardware complexity and functionality, this module is equal to a dual-core processor in its integer power, and to a single-core processor in its floating-point power: for each two integer cores, there is one floating-point core. The floating-point cores are similar to a single core processor that has the SMT ability, which can create a dual-thread processor but with the power of one (each thread shares the resources of the module with the other thread) in terms of floating point performance.
    A module consists of a coupling of two "conventional" x86 out-of-order processing cores. The processing core shares the early pipeline stages (e.g. L1i, fetch, decode), the FPUs, and the L2 cache with the rest of the module.
    Each module has the following independent hardware resources:[10][11]
    2 MB of L2 cache per module (shared between the two integer clusters in the core)
    16 KB 4-way of L1d (way-predicted) per cluster and 2-way 64 KB of L1i per core, one way for each of the two clusters[12][13][14]
    Two dedicated integer clusters
    - each one consists of two ALU and two AGU which are capable of a total of four independent arithmetic and memory operations per clock and per cluster
    - duplicating integer schedulers and execution pipelines offers dedicated hardware to each of two threads which increases performance in some multi-threaded integer cases
    - the second integer cluster increases the Bulldozer core die by around 12%, which at chip level adds about 5% of total die space[15]
    Two symmetrical 128-bit FMAC (fused multiply–add capability) floating-point pipelines per module that can be unified into one large 256-bit-wide unit if one of the integer cores dispatches AVX instruction and two symmetrical x87/MMX/SSE capable FPPs for backward compatibility with SSE2 non-optimized software
    All modules present share the L3 cache as well as an Advanced Dual-Channel Memory Sub-System (IMC - Integrated Memory Controller).
    A module has 213 million transistors in an area of 30.9 mm² (including the 2 MB shared L2 cache) on an Orochi die.[16]

    If I'm reading the part in bold correctly, it does indeed sound like there aren't really 8 cores.
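    To put rough numbers on the quoted description, here is a small Python tally. The per-module figures are taken from the Wikipedia excerpt above; the four-module count for an "8-core" part is an assumption based on the lawsuit's eight-versus-four framing, and all the names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Per-module figures as quoted from Wikipedia above."""
    integer_clusters: int = 2        # "Two dedicated integer clusters"
    fmac_pipes_128bit: int = 2       # "Two symmetrical 128-bit FMAC" pipelines
    l2_cache_mb: int = 2             # "2 MB of L2 cache per module"
    transistors_millions: int = 213  # includes the shared L2
    area_mm2: float = 30.9


def tally(modules: int, module: Module = Module()) -> dict:
    """Multiply the per-module resources out to a whole chip."""
    return {
        "integer_clusters": modules * module.integer_clusters,
        "shared_fp_units": modules,  # one floating-point unit per module
        "fmac_pipes_128bit": modules * module.fmac_pipes_128bit,
        "l2_cache_mb": modules * module.l2_cache_mb,
        "module_transistors_millions": modules * module.transistors_millions,
        "module_area_mm2": round(modules * module.area_mm2, 1),
    }


# Assumption: a chip sold as "8-core" is built from four modules.
ASSUMED_MODULES = 4
chip = tally(ASSUMED_MODULES)
for name, value in chip.items():
    print(f"{name}: {value}")
```

    Under that assumption you get 8 integer clusters but only 4 shared floating-point units, which is exactly the gap the argument is about.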

    --
    Technically, lunchtime is at any moment. It's just a wave function.
  • (Score: 2) by frojack on Saturday November 07 2015, @04:36AM

    by frojack (1554) on Saturday November 07 2015, @04:36AM (#259819) Journal

    In terms of hardware complexity and functionality, this module is equal to a dual-core processor in its integer power, and to a single-core processor in its floating-point power: for each two integer cores, there is one floating-point core. The floating-point cores are similar to a single core processor that has the SMT ability, which can create a dual-thread processor but with the power of one (each thread shares the resources of the module with the other thread) in terms of floating point performance.

    So the upshot of that is if these processors were not used for gaming and complex numerical calculation, and reserved for the server market, there's a good chance no one would ever notice this floating point limitation.

    Most of the work done in server situations is integer math, (well, most of it is just byte slinging hither and yon). Encryption may be some of the most taxing work in the server market.

    But I have no idea how those processors were marketed.

    --
    No, you are mistaken. I've always had this sig.
    • (Score: 2) by Pino P on Saturday November 07 2015, @03:06PM

      by Pino P (4721) on Saturday November 07 2015, @03:06PM (#259973) Journal

      Most of the work done in server situations is integer math, (well, most of it is just byte slinging hither and yon).

      Unless the server is, say, transcoding uploaded video to fifteen different formats for streaming to viewers. But perhaps a lot of that can be written in OpenCL and run on the integrated GPGPU. Does a Xeon even have a GPGPU?

      • (Score: 2) by frojack on Sunday November 08 2015, @04:49AM

        by frojack (1554) on Sunday November 08 2015, @04:49AM (#260244) Journal

        Yes, but most streaming stuff isn't transcoded from one format to the other every time someone requests a stream.
        You do it once, and save the file, then chuck whatever format they ask for down the socket as fast as the requester can consume it.

        Admittedly, you still have a transcoding task just to arrive at a copy for each format. And maybe these processors do that just fine, and maybe they don't, I donno.
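        A minimal sketch of the "do it once, and save the file" idea, in Python. Everything here is hypothetical: `TranscodeCache` and `fake_transcode` are illustrative names, the store is an in-memory dict standing in for files on disk, and a real site would call an actual encoder instead of the stand-in.

```python
from typing import Callable, Dict, Tuple


class TranscodeCache:
    """Transcode each (video, format) pair once, then serve the saved copy."""

    def __init__(self, transcode: Callable[[str, str], bytes]) -> None:
        self._transcode = transcode
        self._store: Dict[Tuple[str, str], bytes] = {}
        self.transcode_calls = 0  # counts how often the expensive step runs

    def get(self, video_id: str, fmt: str) -> bytes:
        key = (video_id, fmt)
        if key not in self._store:
            # First request for this format: pay the CPU cost once.
            self.transcode_calls += 1
            self._store[key] = self._transcode(video_id, fmt)
        # Every later request is just byte slinging.
        return self._store[key]


def fake_transcode(video_id: str, fmt: str) -> bytes:
    """Hypothetical stand-in for a real encoder."""
    return f"{video_id}.{fmt}".encode()


cache = TranscodeCache(fake_transcode)
for _ in range(1000):          # a thousand viewers ask for the same format
    cache.get("clip42", "webm")
cache.get("clip42", "mp4")     # one viewer asks for a second format
print(cache.transcode_calls)   # 2
```

        The transcoding cost scales with the number of formats per upload, not with the number of viewers.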

        --
        No, you are mistaken. I've always had this sig.
        • (Score: 2) by Pino P on Sunday November 08 2015, @06:48PM

          by Pino P (4721) on Sunday November 08 2015, @06:48PM (#260435) Journal

          most streaming stuff isn't transcoded from one format to the other every time someone requests a stream.

          If someone is sending a live stream that has few simultaneous viewers, the server might end up serving the transcoded stream at each detail level to one viewer or at most a handful. Even apart from live streaming, I'm told some adaptive streaming platforms do a real-time transcode for a few seconds rather than waiting for the next keyframe to switch detail levels when the Internet connection's throughput changes or when the user fast-forwards or rewinds.

          Admittedly, you still have a transcoding task just to arrive at a copy for each format.

          Even apart from live streaming, uploaders on big video sharing sites such as Dailymotion and YouTube initiate so many transcoding tasks that I shudder to think of how many must be running at once.

  • (Score: 5, Informative) by Hairyfeet on Sunday November 08 2015, @01:21AM

    by Hairyfeet (75) <{bassbeast1968} {at} {gmail.com}> on Sunday November 08 2015, @01:21AM (#260184) Journal

    You are reading it wrong because you are ignoring this part, bold for highlight: "Two symmetrical 128-bit FMAC (fused multiply–add capability) floating-point pipelines per module that can be unified into one large 256-bit-wide unit if one of the integer cores dispatches AVX instruction and two symmetrical x87/MMX/SSE capable FPPs for backward compatibility with SSE2 non-optimized software."

    So each core still has an FPU; it simply has a weaker 128-bit FPU that can be combined into a single 256-bit FPU if AVX instructions are required. Their engineers have spoken at length about why they did this: they believed multicore processing was the future (which it is) and would be upon us as quickly as 64-bit computing was (which it wasn't), and so they bet on having more cores versus having higher performance per core. If you are like me and run plenty of multicore-aware tasks like transcoding or effects layering, this kicks ass, because many cores working on the task beat high single-core performance. For someone who uses nothing but single-process programs, going for higher per-core performance over more cores would be the better choice.

    So it isn't a "half core", it is simply a different approach to the same task.
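    As a back-of-the-envelope illustration in Python, here is a toy model of one module's floating-point unit. It uses only the quoted figures (two 128-bit FMAC pipelines that can be unified into one 256-bit unit); it ignores scheduling, latency and everything else a real chip does, and the function names are made up.

```python
PIPES_PER_MODULE = 2    # from the quote: two FMAC pipelines per module
PIPE_WIDTH_BITS = 128   # from the quote: each pipeline is 128 bits wide


def ops_per_cycle(op_width_bits: int) -> int:
    """How many vector ops of this width the module can start per cycle."""
    if op_width_bits not in (128, 256):
        raise ValueError("toy model only covers 128-bit and 256-bit ops")
    total_bits = PIPES_PER_MODULE * PIPE_WIDTH_BITS
    return total_bits // op_width_bits


def lanes_per_cycle(op_width_bits: int, element_bits: int = 32) -> int:
    """Peak elements handled per cycle per module in this toy model."""
    return ops_per_cycle(op_width_bits) * (op_width_bits // element_bits)


print(ops_per_cycle(128), lanes_per_cycle(128))  # 2 8  (one pipe per thread)
print(ops_per_cycle(256), lanes_per_cycle(256))  # 1 8  (pipes unified for AVX)
```

    In this model the peak width per module is the same either way; what changes is whether it is split between two threads or handed to one 256-bit instruction.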

    --
    ACs are never seen so don't bother. Always ready to show SJWs for the racists they are.
    • (Score: 3, Informative) by edIII on Sunday November 08 2015, @02:05AM

      by edIII (791) on Sunday November 08 2015, @02:05AM (#260205)

      Thanks for the explanation

      --
      Technically, lunchtime is at any moment. It's just a wave function.