posted by martyb on Wednesday October 30 2019, @06:14AM   Printer-friendly

https://www.zdnet.com/article/top-linux-developer-on-intel-chip-security-problems-theyre-not-going-away/

Greg Kroah-Hartman, the stable Linux kernel maintainer, could have prefaced his Open Source Summit Europe keynote speech, "MDS, Fallout, Zombieload, and Linux," by paraphrasing Winston Churchill: I have nothing to offer but blood, sweat, and tears for dealing with Intel CPUs' security problems.

Or as a Chinese developer told him recently about these problems: "This is a sad talk." The sadness is that the same Intel CPU speculative execution problems, which led to Meltdown and Spectre security issues, are alive and well and causing more trouble.

The problem with how Intel designed speculative execution is that, while anticipating the next action for the CPU to take does indeed speed things up, it also exposes data along the way. That's bad enough on your own server, but when it breaks down the barriers between virtual machines (VMs) in cloud computing environments, it's a security nightmare.
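The failure mode being described can be sketched with a toy simulation of a Spectre-style bounds check bypass: the core speculates past a length check, the out-of-bounds value leaves a cache footprint, and an attacker recovers it by timing probes. This is an illustrative model only, not a real exploit; all names (`victim`, `probe`, the cache-as-a-set) are invented for the sketch.

```python
# Toy model of speculative bounds-check bypass (Spectre v1 style).
SECRET = b"kernel_key"
array1 = bytes(range(16))        # 16 in-bounds bytes
memory = array1 + SECRET         # the secret sits just past the array

CACHE_LINE = 64
cache = set()                    # models which probe-array lines are cached

def victim(i):
    # Architecturally the bounds check rejects i >= len(array1), but a
    # speculative core may run the body anyway before the check resolves.
    speculative = True
    if i < len(array1) or speculative:
        value = memory[i]                 # out-of-bounds speculative read
        cache.add(value * CACHE_LINE)     # secret-dependent cache footprint

def probe():
    # The attacker "times" accesses: the one cached line reveals the byte.
    for v in range(256):
        if v * CACHE_LINE in cache:
            return v

leaked = []
for off in range(len(SECRET)):
    cache.clear()
    victim(len(array1) + off)
    leaked.append(probe())

assert bytes(leaked) == SECRET   # the secret crossed the privilege boundary
```

The same structure is what makes the cloud case so bad: the "victim" and "probe" can live in different VMs sharing one physical core.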


Original Submission

 
  • (Score: 2) by RamiK on Saturday November 23 2019, @12:33AM (2 children)

    by RamiK (1813) on Saturday November 23 2019, @12:33AM (#923577)

    I'm talking about a tagged architecture with an ownership hierarchy so the capabilities in question are implicit through the parent pid. Again, this is going to require its own kernel.

    As for the instruction width thing, it's basically a way to parallelize decoding to some extent. If the instructions are fixed length and always store the address at the same place, you can fetch while still decoding. I probably got it from one of Arvind's papers or something.
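    The decode-parallelism point can be sketched as follows: with a fixed width, every instruction boundary in a fetch group is known before any decoding happens, whereas a variable-length encoding forces a serial walk because each instruction's length depends on decoding the one before it. The one-byte length field below is a hypothetical toy, not any real ISA.

```python
# Sketch: instruction-boundary discovery, fixed vs. variable width.

def boundaries_fixed(block, width=4):
    # Fixed width: all start offsets are known immediately and in
    # parallel -- 0, width, 2*width, ... with no decoding at all.
    return list(range(0, len(block), width))

def boundaries_variable(block):
    # Variable width (toy: first byte of each instruction encodes its
    # length, assumed nonzero): the walk is inherently serial, since
    # offset N+1 is unknown until instruction N is partially decoded.
    offs, pc = [], 0
    while pc < len(block):
        offs.append(pc)
        pc += block[pc]
    return offs

assert boundaries_fixed(bytes(16)) == [0, 4, 8, 12]
assert boundaries_variable(bytes([2, 0, 3, 0, 0, 1])) == [0, 2, 5]
```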

    You're looking at hundreds of man-years of work. And at the end, you'd get something roughly equivalent to Linux, with more security. Good luck persuading someone to pay for that (trust me - I've tried!).

    NT for the desktop. iOS for the smartphone. Fuchsia now for, eh, not sure. Like I said, a commercial team. The trick is finding a target application that wouldn't need legacy support while justifying a whole new everything. AR/VR seems like a good fit, with those massive graphics compute requirements and incompatibility with regular phone and desktop UIs. It's also 5-10 years down the line, since the screens aren't good enough right now, so that's plenty of time for a big company to start working on it.

    --
    compiling...
  • (Score: 2) by TheRaven on Saturday November 23 2019, @08:52AM (1 child)

    by TheRaven (270) on Saturday November 23 2019, @08:52AM (#923754) Journal

    I'm talking about a tagged architecture with an ownership hierarchy so the capabilities in question are implicit through the parent pid. Again, this is going to require its own kernel.

    Ah, okay. We explicitly avoided this kind of design (which was used in the CAP computer and similar systems of that era) because it's impossible to implement efficiently in modern systems. One goal for a capability system is to enforce a provenance chain for capabilities. Systems like CAP explicitly provide this via a chain in memory, CHERI provides this via guarded manipulation (this means that CAP could provide the provenance chain at any point, whereas CHERI simply provides you with a proof that one existed). This means that CHERI avoids any associative lookups (other than the TLB, which is already present and is already a scalability bottleneck) on the fast path. The down side of this is that revocation becomes harder. Various people have explored building CAP-like designs with modern hardware, but the performance is always terrible. A TLB has to respond in a single cycle and the size of a TCAM that you can build that can respond that fast is quite limited (and consumes a lot of power!). Requiring a TLB-like mechanism for capability lookup is a performance killer.

    You probably want to be careful talking about implicit capabilities. One of the strengths of capability systems over ACL-based systems (such as the page table) is the principle of intentionality: it's not enough to own a right to be able to do something, you must actively choose to use that right. This provides a lot of protection against confused deputy attacks. In the context of memory safety, it's a huge difference: explicit use means that the pointer that you're using must refer to a specific object, implicit use means that a pointer to an object must exist. The latter does not protect you against overruns between objects.
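    Both points can be sketched in a toy model (illustrative only; real CHERI capabilities are hardware-tagged values, and every class and method name here is invented): a capability can only be derived by narrowing an existing one, so holding a valid one is proof that a provenance chain existed, and its per-object bounds are what stop overruns between adjacent objects.

```python
# Toy model of guarded manipulation: narrow-only capability derivation.
class Capability:
    def __init__(self, base, length, _root=False):
        if not _root:
            raise PermissionError("capabilities come only from derivation")
        self.base, self.length = base, length

    def narrow(self, offset, length):
        # Guarded manipulation: new bounds must lie inside the old ones,
        # so any reachable capability has a valid provenance chain.
        if offset < 0 or length < 0 or offset + length > self.length:
            raise PermissionError("cannot widen a capability")
        child = Capability.__new__(Capability)
        child.base, child.length = self.base + offset, length
        return child

    def load(self, memory, offset):
        # Explicit use: the access is checked against THIS object's
        # bounds, so overrunning into a neighbouring object traps.
        if not 0 <= offset < self.length:
            raise PermissionError("out of bounds")
        return memory[self.base + offset]

mem = bytearray(b"AAAABBBBCCCC")
root = Capability(0, len(mem), _root=True)   # e.g. a kernel-held root
obj_b = root.narrow(4, 4)                    # capability for the Bs only
assert obj_b.load(mem, 0) == ord("B")
try:
    obj_b.load(mem, 4)                       # would overrun into the Cs
    assert False, "overrun should have trapped"
except PermissionError:
    pass
```

    Note that every check here is local arithmetic on the capability itself; nothing needs an associative lookup, which is the fast-path property described above.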

    As for the instruction width thing, it's basically a way to parallelize decoding to some extent. If the instructions are fixed length and always store the address at the same place, you can fetch while still decoding. I probably got it from one of Arvind's papers or something.

    RISC-V does this and, as with much of the rest of RISC-V, it is a case of premature optimisation. It helps with low-end microarchitectures, but in high-end ones the i-cache tends to store decoded micro-ops or some hybrid (or even something completely re-encoded for better storage), so this kind of instruction encoding trick doesn't buy you anything.

    The trick is finding a target application that wouldn't need legacy support while justifying a whole new everything. AR/VR seems like a good fit, with those massive graphics compute requirements and incompatibility with regular phone and desktop UIs. It's also 5-10 years down the line, since the screens aren't good enough right now, so that's plenty of time for a big company to start working on it.

    The HoloLens ships with an NT kernel. The requirements on a kernel for VR / AR are not that different from any other use case: they need process isolation, access to accelerators, scheduling, memory management, and so on. The differences come at a much higher level.

    --
    sudo mod me up
    • (Score: 2) by RamiK on Saturday November 23 2019, @06:05PM

      by RamiK (1813) on Saturday November 23 2019, @06:05PM (#923895)

      Various people have explored building CAP-like designs with modern hardware, but the performance is always terrible...

      I'm only familiar with some of the attempts, but from what I've read they conclude it's possible; the work would just be at the gate level rather than the logic-block level, so it's not worth it while nodes are still shrinking. That is, a 5-10 year design cycle just to get out a product that's a few generations behind isn't economically viable, so it will have to wait.

      You probably want to be careful talking about implicit capabilities

      Only mentioned it within the context of that specific hierarchical tag design.

      it is a case of premature optimisation

      The entire design needs to be evaluated before drawing that conclusion. Specifically for RISC-V, they don't want to encourage micro-ops to begin with, since they want people to contribute design improvements back instead of doing what Intel did with x86 and moving everything into the decoder. In my case, as a FOSS guy, I don't care for binary compatibility and am fine compiling between families.

      The requirements on a kernel for VR / AR are not that different from any other use case

      The GUI latency and VR studies I've read suggest that hard real time is needed for VR/AR to grow out of being a novelty. But I'm not into soothsaying, so I'll reevaluate this when the HoloLens starts selling outside the public sector at a reasonable price.
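      The latency argument reduces to simple frame-budget arithmetic (the refresh rates below are common headset values, and the ~20 ms motion-to-photon target is a figure often cited in VR comfort literature, not from this thread). The hard real-time point is that each budget is a deadline: missing even one is a visible, nausea-inducing glitch, not a graceful degradation.

```python
# Frame budgets at common VR refresh rates: the whole sensor-to-photon
# pipeline must finish inside one budget, every single frame.
budgets = {hz: 1000.0 / hz for hz in (60, 90, 120)}
for hz, ms in budgets.items():
    print(f"{hz} Hz -> {ms:.2f} ms per frame")

# A ~20 ms motion-to-photon target leaves under two 90 Hz frames
# end to end, shared between tracking, app logic, render, and scanout.
assert 1000.0 / 90 < 20 / 1.5
```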

      --
      compiling...