
SoylentNews is people

posted by martyb on Saturday February 16 2019, @02:08PM
from the so-that-means...-we-are-screwed dept.
 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 0) by Anonymous Coward on Saturday February 16 2019, @02:56PM (20 children)

    by Anonymous Coward on Saturday February 16 2019, @02:56PM (#802046)

    Can you make a CPU that runs fast and doesn't have this issue?

    One description of the problem is that the program can get the speculative parts of the CPU to gather protected information and then use it to adjust the CPU state.
    For example, a user program causing the CPU to read a bit in kernel memory and changing the cache state depending on the value.
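
    A toy sketch of that leak, assuming a cache that is just a Python set and "timing" that is just a membership test. Nothing here is real exploit code, and every name (KERNEL_MEMORY, PROBE_LINES, speculative_read) is made up for illustration:

```python
# Toy model only: a set stands in for the cache, and "timing" is just a
# membership test. All names here are hypothetical.

KERNEL_MEMORY = [1, 0, 1, 1, 0, 0, 1, 0]    # secret bits the user may not read
PROBE_LINES = ("line_for_0", "line_for_1")  # two user-owned cache lines


def speculative_read(cache, kernel_index):
    """Model the transient window: the protected bit is read and used to
    pick which user line gets cached. The architectural result is thrown
    away (nothing is returned), but the cache change is left behind."""
    secret_bit = KERNEL_MEMORY[kernel_index]
    cache.add(PROBE_LINES[secret_bit])


def is_fast(cache, line):
    """Stand-in for timing a load: a cached line counts as 'fast'."""
    return line in cache


def leak_bit(kernel_index):
    cache = set()                            # start from a flushed cache
    speculative_read(cache, kernel_index)
    return 1 if is_fast(cache, PROBE_LINES[1]) else 0


def leak_all():
    return [leak_bit(i) for i in range(len(KERNEL_MEMORY))]


if __name__ == "__main__":
    print(leak_all())
```

    The user code never receives the secret directly; it only learns which of its own two lines became fast, and that is enough to recover each bit.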

    It is one thing to read beyond your privilege, but quite another to use the result.
    Perhaps results should include a tag of privilege and their use should require a matching tag from the instruction stream?

    That would require more logic, but hopefully not knowing something before you know it.

  • (Score: 3, Insightful) by Arik on Saturday February 16 2019, @03:12PM (11 children)

    by Arik (4543) on Saturday February 16 2019, @03:12PM (#802051) Journal
    "Can you make a CPU that runs fast and doesn't have this issue?"

    As I recall in the late 90s the fastest processors were more simplified, things like the DEC Alpha took a more direct path to speed and it did work. The market didn't reject them because they weren't fast. It was more about the industry addiction to blob compatibility.

    Someone more involved in modern RISC might chime in here.
    --
    - Sig not found. Self destruct initiated. Please clear the area.
    • (Score: 0) by Anonymous Coward on Saturday February 16 2019, @03:56PM

      by Anonymous Coward on Saturday February 16 2019, @03:56PM (#802062)

      I would assume that Alpha would have been exploitable in this manner.

      But as you point out, it was simpler, so perhaps easier to fix?

    • (Score: 3, Insightful) by RS3 on Saturday February 16 2019, @05:02PM (5 children)

      by RS3 (6367) on Saturday February 16 2019, @05:02PM (#802082)

      Absolutely agree.

      > It was more about the industry addiction to blob compatibility.

      My angle: driven by short-term mass profit.

      RISC CPUs are also vulnerable, although slightly less so. We're currently seeing a rise in RISC, much of it ARM and ARM Cortex processors: Chromebooks, phones, and more RISC-based laptops being announced. Even if RISC, like Alpha, had taken off 20 years ago, I suspect we'd be no better off, because the vulnerabilities affect RISC too, and profit-driven CPU development would have ignored the pitfalls.

      To me it's the same old story. I often cite the space shuttle Challenger disaster where the engineers pleaded to cancel the launch, but greedy managers overruled them. Not sure how that political structure evolved where the people who truly _know_ what's going on do not have final decision power. I suspect many engineers knew about the Spectre and Meltdown problems but were hushed. I'd love to see the results of a future investigation. My cynical side perceives that the general public is becoming sick of and numb to all of the vulnerabilities, data leaks, etc., and it's not "viral" anymore and they just want to hear about the next hot topic.

      Still trying to understand the details. Articles are too long, too deep, or too vague. My hunch at this point is that the cache controllers do not honor the CPU's memory protection boundaries. If that's the case, I doubt that even CPU microcode can fix it, but a future hardware design is needed that incorporates the cache controller fully into the memory control system.

      • (Score: 2) by Arik on Saturday February 16 2019, @08:15PM (4 children)

        by Arik (4543) on Saturday February 16 2019, @08:15PM (#802163) Journal
        "RISC CPUs are also vulnerable, although slightly less so."

        Which RISC CPUs use speculative execution?

        I don't remember either the Alpha or the PPC using it. Rather thought it was introduced specifically to make the superscalar x86 architecture work.
        --
        - Sig not found. Self destruct initiated. Please clear the area.
        • (Score: 2) by RS3 on Saturday February 16 2019, @08:51PM (2 children)

          by RS3 (6367) on Saturday February 16 2019, @08:51PM (#802175)

          Oh gosh, Arik, thanks for asking, but I'm not sure why this happens so much online: I never said RISC CPUs use speculative execution. I was only parroting what I read in many online articles about vulnerabilities, and they all say that RISC is also vulnerable.

          That said, after a quick search on terms like "RISC" "ARM" "vulnerable" you can find many articles. Many will state that ARM is vulnerable to Spectre but not Meltdown. Many refer to ARM's "speculative execution". ARM is generally considered RISC. I'm not sure how to define RISC vs. CISC, and it may be that speculative execution is okay to be included in a pedantically defined RISC processor. Here's some good reading on the subject- especially the paragraphs containing "RISC" and the AMD 29000 : https://en.wikipedia.org/wiki/Superscalar_processor [wikipedia.org]

          • (Score: 2) by Arik on Saturday February 16 2019, @09:04PM (1 child)

            by Arik (4543) on Saturday February 16 2019, @09:04PM (#802179) Journal
            Thanks for the reply. AC already provided an interesting link taking it back further. ARM is generally considered RISC and I knew some ARM architectures did it, but few if any implementations are "pure" so I thought it was a reasonable question.
            --
            - Sig not found. Self destruct initiated. Please clear the area.
            • (Score: 3, Interesting) by RS3 on Tuesday February 19 2019, @07:40AM

              by RS3 (6367) on Tuesday February 19 2019, @07:40AM (#803400)

              Sorry- verbal skills are my weakest suit. I try to be as clear as possible and people always find a way to misunderstand. Your question was absolutely okay- I was just trying to clarify what I wrote. I keep having a problem here (mostly here, and it just happened 2 more times) where people extrapolate from something I write, but then pin that extrapolation back on me, in a kind of accusatory way, and demand I defend something I never wrote, and is false and I disagree with. You weren't being accusatory at all; I'm just frustrated that I can't seem to write clearly the first time around.

              What I meant to write was: there are many vulnerabilities, not just speculative execution, so a CPU which does not do speculative execution can still be vulnerable.

              And repeating myself from earlier, it seems the problem is that the cache controller does not know memory protection boundaries, and if that's true, that's a horrible error. I'm still searching for a clarification on that possibility.

        • (Score: 2, Interesting) by Curlsman on Monday February 18 2019, @08:55PM

          by Curlsman (7337) on Monday February 18 2019, @08:55PM (#803173)

          Alpha EV6 (21264) used out-of-order execution:
          https://people.cs.clemson.edu/~mark/464/21264.verification.pdf [clemson.edu]
          "The Alpha 21264 microprocessor is a highly out-of-order, superscalar implementation of the Alpha architecture."

          And https://en.wikipedia.org/wiki/DEC_Alpha [wikipedia.org]

          And the OpenVMS OS designers believe they are resistant:
          https://www.vmssoftware.com/updates.html [vmssoftware.com]
          "VSI OpenVMS is NOT vulnerable to this issue, primarily due to its different, four-mode architecture. Specifically, VSI OpenVMS is protected against CVE-2018-8897 because it does two things differently than other operating systems:

          1) OpenVMS doesn’t rely on the CS pushed in the interrupt stack frame to determine the previous mode. This means OpenVMS cannot be tricked into believing it was already in kernel mode when it was not, which is central to this vulnerability.

          2) OpenVMS uses a different method to switch GSBASE; OpenVMS always performs the switch and makes sure the user-mode GSBASE is always updated to match the kernel-mode GSBASE."

    • (Score: 1, Informative) by Anonymous Coward on Saturday February 16 2019, @08:38PM (3 children)

      by Anonymous Coward on Saturday February 16 2019, @08:38PM (#802170)

      Alpha had simple branch prediction. They wanted to go all in [ualberta.ca] with it for the EV8.

      • (Score: 2) by Arik on Saturday February 16 2019, @08:44PM (2 children)

        by Arik (4543) on Saturday February 16 2019, @08:44PM (#802174) Journal
        Interesting.

        Well the Amiga proved you don't actually need a fast CPU if you design everything else around it I suppose.
        --
        - Sig not found. Self destruct initiated. Please clear the area.
        • (Score: 2) by RS3 on Saturday February 16 2019, @08:58PM (1 child)

          by RS3 (6367) on Saturday February 16 2019, @08:58PM (#802176)

          That's a great point. I never had my hands on an Amiga but always admired them. I think they made better use of sort of distributed processing with more intelligent peripherals, but I may be wrong. Probably much cleaner tighter code too. I've always been surprised (annoyed) by how much work most CPUs do that could be done by auxiliary processors. I can't remember specifics, but I clearly remember machines where the main (and only) CPU did RAM refresh, CRT character scanning, etc.

          • (Score: 2) by Arik on Saturday February 16 2019, @10:09PM

            by Arik (4543) on Saturday February 16 2019, @10:09PM (#802200) Journal
            No, you're right.

            It had dedicated chipsets to offload much of the work onto, and tight code? Haven't examined the code though IIRC it was leaked a few years ago, but that was definitely my impression. This was the end of the classic microcomputer days, OS code wasn't something written in a high level language then trusted to the compiler, it was typically hand massaged by people that read 8-bit. Even application code normally got that treatment, after some profiling to see which loops got executed most often (we're all lazy and we'd often not get around to optimizing the bits that didn't get called often. Unless we were running out of storage space.)

            It had sound and video systems that pretty much did their job all on their lonesomes - the cpu pointed them in the right direction and they took it from there. The CPU doesn't need to be all that fast in that position - it just needs to do what a CPU traditionally does, what a Z80 did well enough and fast enough for most things. It executes the main logic of the program and runs the shows behind the scenes. You want a video? Point the vidcard at the file and tell it to go. Want to read a bunch of data from the HDD? Tell the controller what you need and where you want it put, check back every few cycles to see if it's done yet.
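
            Roughly what that "tell the controller, check back later" pattern looks like, as a toy sketch. ToyDiskController and its methods are invented for this example and do not describe any real Amiga chip or driver API; the controller's parallel progress is modelled by calling tick():

```python
# Toy model of offloading a read to a controller and polling for completion.
# All names are hypothetical.

class ToyDiskController:
    """Stand-in for a peripheral that copies data on its own."""

    def __init__(self, disk, cycles_per_read=3):
        self.disk = disk
        self.cycles_per_read = cycles_per_read
        self.job = None
        self.cycles_left = 0

    def start_read(self, block, buffer):
        """The CPU says what it wants and where to put it, then walks away."""
        self.job = (block, buffer)
        self.cycles_left = self.cycles_per_read

    def tick(self):
        """One cycle of the controller working independently of the CPU."""
        if self.job is None:
            return
        self.cycles_left -= 1
        if self.cycles_left == 0:
            block, buffer = self.job
            buffer.extend(self.disk[block])
            self.job = None

    def done(self):
        return self.job is None


def run():
    disk = {5: b"hello"}
    controller = ToyDiskController(disk)
    buffer = bytearray()
    controller.start_read(5, buffer)
    other_work = 0
    while not controller.done():   # check back every cycle
        other_work += 1            # the CPU gets on with the main logic
        controller.tick()          # meanwhile the controller makes progress
    return bytes(buffer), other_work
```

            The CPU never touches the data itself; it only starts the job and polls, which is why it does not have to be fast.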

            --
            - Sig not found. Self destruct initiated. Please clear the area.
  • (Score: 3, Insightful) by Dr Spin on Saturday February 16 2019, @03:17PM (1 child)

    by Dr Spin (5239) on Saturday February 16 2019, @03:17PM (#802054)

    Can you make a CPU that runs fast and doesn't have this issue?

    Can you win the race if you cheat?

    Essentially, the risk is due to speculation or otherwise in one thread impacting performance in another. This does not need to be possible. However, if you allow a thread to use data that is in the cache because another thread put it there, then you are on the slippery slope to hell - even if you are destined to get there quicker, this might not be a good plan! Threads need to be wholly and completely isolated.

    "But it is not a multi-user environment" has been shown not to be a valid excuse - it's not YOUR code running in the browser - the code in the browser belongs to a whole bunch of different malware promoters.
    While not using browsers at all might help, there are, in fact, other scenarios (cloud serving) that are even higher risk.

    (Asking strangers to hold your wallet doesn't necessarily work out well either).

    --
    Guns don't kill thousands, presidents kill thousands.
    • (Score: 2) by RS3 on Saturday February 16 2019, @05:16PM

      by RS3 (6367) on Saturday February 16 2019, @05:16PM (#802089)

      The OS is supposed to "sandbox" user processes. That's been a big gripe of mine since 1990ish. Even generic Linux kernels don't do it properly, so we have "hypervisors" which are modified Linux kernels. Some hypervisors are forked Linux kernels, or written from scratch. The point is: IMHO ALL OSes should have hypervisor incorporated and hypervisors and OS "virtualization" (VMware, Xen, etc.) shouldn't be needed.

      That said, for a hypervisor, or any software-based memory protection to work, the CPU _HAS_ to honor memory boundaries, regardless of cache or speculative execution.

  • (Score: 2, Interesting) by Anonymous Coward on Saturday February 16 2019, @04:05PM (2 children)

    by Anonymous Coward on Saturday February 16 2019, @04:05PM (#802064)

    Maybe we can make software fast enough?
    Seriously, a lot of power is wasted because some programmer had a deadline too close and used another library on a framework on a library on a non-standard extension to the framework.
    From my experience in IT studies, second-year students get a small assembler course. Most of them have no idea how a low-level program operates or how to program standard devices. Maybe we should go back to teaching programmers, not users of libraries?
    I know the temptation is big. Rich bosses buy better and better hardware for developers, for open source too, but it has a price, and speculative execution errors are just the tip of the iceberg.

    • (Score: 0) by Anonymous Coward on Sunday February 17 2019, @01:49PM (1 child)

      by Anonymous Coward on Sunday February 17 2019, @01:49PM (#802490)

      Not sure where to drop this so I'm putting it here just because.

      Big family, lots of computing. Mac, Windows, Linux. Yes.

      Worked at IBM back in the mid-late '80s on big mainframes, RS6K, AS400...
      We discovered that clients could save millions by putting an AS400 emulator on an RS6000 and also benefit from monstrous performance improvements. Management killed that project and slapped a gag order on us real quick. (Transaction Protocol Council, TPC/A, TPC/C -- I was on the committee that created those tests and also the reports.)

      I'm the type who generally laughs at conspiracy theorists, but I know enough about the nuts and bolts of software, OS's and also upper management types (corporate and government) to recognize some peculiar patterns. When both my windows and Mac os's, at different locations, on different networks start glitching in the same way at the same times, something nefarious is definitely going on. (I don't use the Linux box enough to see the patterns there, so can't say about that one, but some of what systemd does seems awfully suspect to me)

      I solved the glitching problem by getting a 2007 Mac Pro and a 2008 Macbook Pro and using OSX 10.6 on both of them. It was like a breath of fresh air. These are the fastest computers in my house (and also I have up-to-date windows and a modern Macbook Pro and a 2013 Mac Pro as well.) BUT, The older machines are only faster if they are NOT connected to the Internet. The second I plug them into the net and launch a web browser -- even to just the google home page -- machine speed goes noticeably slower for all the software on it.

      So instead, the computers I use the most are at least ten years old and connected by wire to an internal network that is NOT connected to the internet. They run great. Added benefit -- I don't have to constantly re-learn how to use my software after every other update. When I need data from on-line, I get it with the sacrificial laptop and transfer the data via SD card. It's a little inconvenient, but now there's no more glitching. I can work in peace on a good, snappy system.

      If you use ANY modern computing system, you are being eavesdropped, monitored, manipulated, who knows what. The processor is only the tip of the iceberg. We live in dangerous times.

      (You remember that scene in the Snowden movie where they put their phones in the microwave? Amateurs! The phone can tell when it's in a faraday cage and can still hear soundwaves, and record them, and store them until it's not in a faraday cage anymore, then transmit them. I honestly don't care if they want to monitor me (they may even have a good reason for doing it) but I draw the line when they start impacting my ability to do good work by glitching my system up. That's when I cut them off, or at least, raise the bar so they have to work a little harder.)

      • (Score: 2) by takyon on Sunday February 17 2019, @09:37PM

        by takyon (881) <reversethis-{gro ... s} {ta} {noykat}> on Sunday February 17 2019, @09:37PM (#802622) Journal

        (You remember that scene in the Snowden movie where they put their phones in the microwave? Amateurs! The phone can tell when it's in a faraday cage and can still hear soundwaves, and record them, and store them until it's not in a faraday cage anymore, then transmit them. I honestly don't care if they want to monitor me (they may even have a good reason for doing it) but I draw the line when they start impacting my ability to do good work by glitching my system up. That's when I cut them off, or at least, raise the bar so they have to work a little harder.)

        Just wrap the phone in foil and put in the fridge or something. Then move into another room. Signal will be dead and it's unlikely to pick up your conversation unless it has year-2050-grade microphone arrays.

        --
        [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
  • (Score: 3, Interesting) by Anonymous Coward on Saturday February 16 2019, @05:45PM

    by Anonymous Coward on Saturday February 16 2019, @05:45PM (#802106)

    You would need a "reset cache".

    The problem is that the speculative branch does things that shouldn't have been done -- like fetch data from RAM that turns out being unnecessary. Then, check to see how fast it is to access something similar from RAM -- no cache miss, no delay? It must already be in cache!... simplified, but for example. The problem is: data from RAM was cached. It's now in the cache. The fixes have been to flush the cache when switching program contexts, like from kernel code to user code.

    To fix it, you would need a reset-cache -- anything that was changed in the speculative branch would have to be reset to how it was before the speculative execution took place. So, double your CPU cache. Right? You could shrink it somewhat by changed-cache-block tracking, and keeping track of which execution path changed which blocks, for all the execution paths (usually two?). Resetting anything that changed means you would trigger the same cache misses and have the same latency in accessing data as if the speculative execution never happened.
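
    A toy sketch of the changed-block tracking idea, assuming the cache is a plain dict and that only fills need undoing (evictions and replacement state are not modelled). ResettableCache and its methods are names invented for this example, not any real CPU mechanism:

```python
# Toy model of a "reset cache" with changed-block tracking.
# The cache is a dict of line -> data; all names are hypothetical.

_ABSENT = object()  # marks "this line was not cached before speculation"


class ResettableCache:
    def __init__(self):
        self.lines = {}
        self.undo_log = None           # None means "not speculating"

    def begin_speculation(self):
        self.undo_log = {}

    def fill(self, line, data):
        """Bring a line into the cache. Inside a speculative window, save
        the line's previous state the first time it is touched."""
        if self.undo_log is not None and line not in self.undo_log:
            self.undo_log[line] = self.lines.get(line, _ABSENT)
        self.lines[line] = data

    def commit(self):
        """Speculation was correct: keep the changes."""
        self.undo_log = None

    def squash(self):
        """Speculation was wrong: put every touched line back."""
        if self.undo_log is None:
            return
        for line, old in self.undo_log.items():
            if old is _ABSENT:
                self.lines.pop(line, None)
            else:
                self.lines[line] = old
        self.undo_log = None

    def is_cached(self, line):
        return line in self.lines


def demo():
    cache = ResettableCache()
    cache.fill("A", "old A")
    before = dict(cache.lines)
    cache.begin_speculation()
    cache.fill("A", "new A")
    cache.fill("secret-dependent line", "x")
    cache.squash()
    return before == cache.lines
```

    Only the touched lines are saved, so the extra storage scales with how much a speculative path changes rather than with the whole cache.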

    You could also have a different copy of the cache for each speculative execution branch. Or maybe L1 cache or L2 cache only. Perhaps copy-on-write, completely copying the entire working cache from the execution branch every time there's a speculative execution event.

    It's expensive. Computationally, silicon, duplication of data, it's expensive.

    Bonus: a previously undiscussed speculative execution data-disclosure issue, cache clearing. If you know a way to cause a cache conflict, then you can cause the CPU cache to be cleared of certain data. Suppose you execute a branch that would cause data to be fetched from RAM and placed in a known location in CPU cache shared with other known data, and the other branch doesn't place the data there. Then try to access the original data again -- if it misses (has to re-fetch because it was cleared from cache), then speculative data disclosure. Much slower than the other versions, but regardless.
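
    The eviction variant can be sketched the same toy way, assuming a tiny direct-mapped cache where an address lands in slot (address % NUM_SLOTS). DirectMappedCache and leak_bit_by_eviction are hypothetical names, and a hit or miss return value stands in for measured latency:

```python
# Toy direct-mapped cache: two addresses that share a slot evict each other.
# All names are hypothetical.

NUM_SLOTS = 4


class DirectMappedCache:
    def __init__(self):
        self.slots = [None] * NUM_SLOTS

    def access(self, address):
        """Return True on a hit. On a miss, load the address into its slot,
        evicting whatever was there."""
        slot = address % NUM_SLOTS
        hit = self.slots[slot] == address
        self.slots[slot] = address
        return hit


def leak_bit_by_eviction(secret_bit):
    cache = DirectMappedCache()
    attacker_address = 1
    conflicting_address = attacker_address + NUM_SLOTS  # maps to the same slot

    cache.access(attacker_address)         # prime: attacker's data is cached

    if secret_bit:                         # the speculative path taken when
        cache.access(conflicting_address)  # the secret is 1 evicts that data

    still_cached = cache.access(attacker_address)  # probe: hit or miss?
    return 0 if still_cached else 1
```

    Here the signal is a line going missing rather than a line appearing, so even a cache that never exposes the speculatively loaded data can still give the bit away.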

  • (Score: 0) by Anonymous Coward on Sunday February 17 2019, @05:04AM (1 child)

    by Anonymous Coward on Sunday February 17 2019, @05:04AM (#802369)

    The problem is that the speculative execution unit is allowed to access RAM that the program is not authorized to access. Perhaps it is difficult or expensive to apply memory management to the speculative execution unit, but that seems to me to be the obvious solution.

    • (Score: 0) by Anonymous Coward on Sunday February 17 2019, @01:35PM

      by Anonymous Coward on Sunday February 17 2019, @01:35PM (#802486)

      "Perhaps it is difficult..."

      A system call is a coordinated dance between the cpu and os.
      The goal is to make it quick to move between the user and kernel space, but only thru carefully defined call gates.
      The dance takes many clocks and so is in many stages of the pipeline at once.
      Unless the S/W is able to greatly reduce the rate of system calls, speed requires that this involve speculation.

      Speculation means that memory cycles get started before a memory protection check.
      That is, nothing has yet verified that, when the address is in protected memory, the machine state and the code requesting the cycle are on the right side of the call gate.
      This is necessary because these things may not be available early enough in the pipeline to prevent the memory cycle start.
      This was thought to be ok because eventually the fruits from the cycle result would be ignored.
      These bugs demonstrate that they are not completely ignored because they can affect the cache state.

      This seems to me like the 3 bears.
      Checking before the cycle start is too early and very slow.
      Checking the state as we do is too late, but very fast.
      Checking after the cycle, but before using the result may be just right.

      In other words, make it ok to start the kernel memory read in user state, but tag the fruits of the cycle as to who and where they came from so that later stages in the pipeline can be more aware of what is happening.
      Later stages need to include the cache update, but the difficult part is what else?
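
      A toy sketch of that "check after the cycle, before using the result" idea. It assumes two privilege levels and a cache modelled as a set of addresses; TaggedValue, start_load and use_result are names invented here, and a real pipeline would have many more places to guard than this one:

```python
# Toy model of tagging a load result with the privilege it requires and
# checking that tag before the result may change any state.
# All names are hypothetical.

from dataclasses import dataclass

USER, KERNEL = 0, 1

MEMORY = {0x10: 7, 0x20: 42}
REQUIRED = {0x10: USER, 0x20: KERNEL}   # privilege needed for each address


@dataclass(frozen=True)
class TaggedValue:
    address: int
    value: int
    required_privilege: int


def start_load(address):
    """Early stage: start the memory cycle without any protection check,
    but carry the required privilege along with the data."""
    return TaggedValue(address, MEMORY[address], REQUIRED[address])


def use_result(tagged, current_privilege, cache):
    """Later stage: check the tag before the value can touch anything,
    the cache update included."""
    if current_privilege < tagged.required_privilege:
        return None                     # squashed: cache left untouched
    cache.add(tagged.address)
    return tagged.value
```

      The read itself still starts early, so the speed is kept; what changes is that the cache fill waits for the tag check instead of happening as a side effect.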