
posted by Fnord666 on Wednesday September 13 2017, @06:03AM
from the going-back dept.

Return-oriented programming (ROP) is now a common technique for compromising systems via a stack-smashing vulnerability. Restrictions on executing code on the stack have mostly put an end to simple stack-smashing attacks, but that does not mean they are no longer a threat. Various schemes are in use for defeating ROP attacks. A new mechanism called "RETGUARD" being implemented in OpenBSD is notable for its relative simplicity: it uses a simple return-address transformation to disrupt ROP chains and hinder their execution, and takes the form of a patch to the LLVM compiler that adds a new flag.
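
For illustration only (this is a model of the idea, not OpenBSD's actual patch), the transformation can be sketched in C: the saved return address is stored XORed with a frame-specific value (reportedly the stack pointer), and XORed again just before the return, so a raw gadget address written by an attacker decodes to garbage:

    #include <stdint.h>
    #include <stdio.h>

    /* Hedged model of the RETGUARD transformation described above. */
    static uintptr_t encode_ra(uintptr_t ra, uintptr_t sp) { return ra ^ sp; }
    static uintptr_t decode_ra(uintptr_t slot, uintptr_t sp) { return slot ^ sp; }

    int main(void) {
        uintptr_t sp = 0x7ffc1234a000;      /* hypothetical stack pointer */
        uintptr_t ra = 0x401136;            /* hypothetical return address */
        uintptr_t slot = encode_ra(ra, sp); /* what actually sits on the stack */

        /* A legitimate return decodes back to the real address... */
        printf("legit:  %#lx\n", (unsigned long)decode_ra(slot, sp));

        /* ...but a gadget address written directly into the slot by an
         * attacker is mangled by the epilogue XOR. */
        printf("gadget: %#lx\n", (unsigned long)decode_ra(0x401500, sp));
        return 0;
    }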


Original Submission

 
  • (Score: 2) by Wootery on Wednesday September 13 2017, @10:25AM (14 children)

    by Wootery (2341) on Wednesday September 13 2017, @10:25AM (#567157)

    The question is how to prevent corruption of the return address stack. How can you do this in software? We have to worry about buffer overflows, bad pointer arithmetic, and ROP-style shenanigans with the instruction pointer.

    Without banning pointer arithmetic, how can you prevent a malicious C program from messing with the return pointer? We could manage a shadow stack in software and compare return addresses before returning, but now we're back to square one, just in software: we'd also somehow have to prevent corruption of the shadow stack itself, and again it starts to look like a hardware solution is the way to go.
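
    Something like this hand-instrumented sketch, say (a real implementation would have the compiler emit it, and would still need to protect the shadow region itself):

        #include <stdio.h>
        #include <stdlib.h>

        static void *shadow[1024];          /* the software shadow stack */
        static int shadow_top;

        /* Push the real return address on entry... */
        #define SHADOW_ENTER() (shadow[shadow_top++] = __builtin_return_address(0))

        /* ...and compare before returning; abort on mismatch. */
        #define SHADOW_LEAVE() do { \
                if (shadow[--shadow_top] != __builtin_return_address(0)) { \
                    fprintf(stderr, "return address corrupted\n"); \
                    abort(); \
                } \
            } while (0)

        void vulnerable(const char *input) {
            SHADOW_ENTER();
            char buf[16];
            /* imagine an unchecked copy of input into buf here */
            (void)input; (void)buf;
            SHADOW_LEAVE();   /* catches an overwritten return address */
        }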

    LLVM SafeStack seems to be about creating two different stacks. [llvm.org] They concede limitations:

    protection against arbitrary memory write vulnerabilities is probabilistic and relies on randomization and information hiding. The randomization is currently based on system-enforced ASLR and shares its known security limitations

  • (Score: 2) by Virindi on Wednesday September 13 2017, @10:44AM (5 children)

    by Virindi (3484) on Wednesday September 13 2017, @10:44AM (#567162)

    I was merely suggesting moving the return pointer somewhere other than the data stack (to a return-pointer stack). There would be no return pointer on the data stack at all. The idea is that, given guard pages, a 64-bit address space, and random stack locations, it would be hard for an index into one stack to resolve to an address in the other. In this scenario there is no comparison of return pointers, because there is only one return pointer.
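
    For instance, allocation along these lines (an illustrative POSIX sketch; the names are made up): the return-pointer stack lands at a kernel-randomized mmap() address with PROT_NONE guard pages on either side, so a linear overflow from the data stack faults before reaching it:

        #include <stdint.h>
        #include <sys/mman.h>

        #define RETSTACK_SIZE (64 * 1024)
        #define PAGE_SZ       4096

        /* Illustrative: guard page | return stack | guard page, at an
         * address randomized by the kernel's mmap ASLR. */
        static uintptr_t *alloc_return_stack(void) {
            uint8_t *base = mmap(NULL, RETSTACK_SIZE + 2 * PAGE_SZ, PROT_NONE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (base == MAP_FAILED)
                return NULL;
            /* Only the middle region becomes usable; the guards stay
             * PROT_NONE, so any access walking into them faults. */
            if (mprotect(base + PAGE_SZ, RETSTACK_SIZE,
                         PROT_READ | PROT_WRITE) != 0)
                return NULL;
            return (uintptr_t *)(base + PAGE_SZ);
        }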

    In my opinion, having two return pointers is only a marginal improvement. It's like a crude form of ECC where you fully duplicate the data in the hope that the attacker can't modify both copies. Okay I guess, but it relies on exactly the same thing and adds almost nothing. Consider what the attacker must accomplish. In the separate-stack scenario, the attacker must find a separate stack somewhere else in memory. When the return addresses are DUPLICATED to a second stack, the attacker must find that separate stack somewhere else in memory (an identical problem) and also overwrite the standard return address at a known place on the same stack as the data. The former problem is identical to my scenario, and the latter is nearly trivial.

    how can you prevent a malicious C program from messing with the return pointer?

    Note that all I am trying to defend against here is an attacker who can write data into a variable on the data stack at a chosen offset. This is a common scenario when a structure is stored on the stack and proper checks are missing when accessing it. I have not seen anything in this discussion about an attacker who can already execute arbitrary code; if they can do that, why would they need ROP?
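
    Concretely, the vulnerable pattern I have in mind looks like this (contrived example):

        #include <stdint.h>

        struct record { uint32_t vals[8]; };

        /* A structure on the stack, written through an attacker-chosen
         * index with no bounds check. */
        void handle(unsigned idx, uint32_t v) {
            struct record r;
            r.vals[idx] = v;   /* BUG: no check that idx < 8; a large idx
                                  writes elsewhere in the stack frame */
        }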

    • (Score: 2) by Wootery on Wednesday September 13 2017, @04:00PM (4 children)

      by Wootery (2341) on Wednesday September 13 2017, @04:00PM (#567260)

      I was merely suggesting moving the return pointer to someplace other than the data stack (to a return pointer stack).

      We could go even further and have four: one for arguments, one for returned values, one for temporaries, and one for return-to (instruction pointer) addresses.

      Ideally, we'd want the return-to stack to be inaccessible except by the call/return instructions themselves. This would be similar to what Intel's CET does, but without the duplication. (Of course, CET's hands are tied by backward compatibility.)

      The former problem is identical to my scenario and the latter is nearly trivial.

      Ignoring the write-prevention, yes. Presumably a software solution could do something similar, especially with kernel support; CET 'just' does something with page tables after all.

      I have not seen anything in this discussion about an attacker who can already execute arbitrary code; if they can do so why use ROP?

      You're right, that would be a very different question.

      • (Score: 2) by Virindi on Wednesday September 13 2017, @08:49PM (3 children)

        by Virindi (3484) on Wednesday September 13 2017, @08:49PM (#567452)

        Ignoring the write-prevention, yes. Presumably a software solution could do something similar, especially with kernel support; CET 'just' does something with page tables after all.

        You're right, not sure why I wasn't thinking about that. Clearly it is helpful for the page containing the return address stack to have write protection against "normal" data write instructions.

        • (Score: 2) by Wootery on Thursday September 14 2017, @08:14AM (2 children)

          by Wootery (2341) on Thursday September 14 2017, @08:14AM (#567691)

          I figure you could do a similar thing in software (i.e. block 'ordinary' instructions from accessing that stack) using syscalls, unless I'm missing something.

          • (Score: 2) by Virindi on Thursday September 14 2017, @09:25AM (1 child)

            by Virindi (3484) on Thursday September 14 2017, @09:25AM (#567706)

            How? It would have to be written to every time you make a call. I am not aware of any easy mechanism to achieve this.

            I double-checked the Intel reference manual, and attempting a call instruction when the next stack location is on a non-writable page results in a processor exception. So to implement this in software, you'd have to have the OS handle that exception, check whether a call instruction was actually executed (and not a normal memory write), emulate the push, and then resume. That seems like a heck of a lot of overhead for every call!
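
            Even the variant where the process unprotects the page itself around each push, instead of taking a fault, still costs two syscalls per call (hypothetical helper names, POSIX sketch):

                #include <stdint.h>
                #include <sys/mman.h>

                extern uint8_t *ret_stack_page;   /* assumed page-aligned */
                extern uintptr_t *ret_stack_top;

                /* Unlock, push, relock: two kernel round-trips for every
                 * single function call. */
                static void push_return_address(uintptr_t ra) {
                    mprotect(ret_stack_page, 4096, PROT_READ | PROT_WRITE);
                    *ret_stack_top++ = ra;
                    mprotect(ret_stack_page, 4096, PROT_READ);
                }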

            Or am I missing something?

            • (Score: 2) by Wootery on Thursday September 14 2017, @01:49PM

              by Wootery (2341) on Thursday September 14 2017, @01:49PM (#567793)

              You're right, there would be an awful lot of overhead there.

              I don't think there's any efficient mechanism to restrict access to a page to only certain functions.

  • (Score: 3, Informative) by ledow on Wednesday September 13 2017, @11:07AM (7 children)

    by ledow (5567) on Wednesday September 13 2017, @11:07AM (#567169) Homepage

    Not everything can - or should - be done in software.

    To be honest, if I were designing a chip architecture nowadays, I would have a very clear separation of data and code. I would want fine-grained control over what is writable, what can be used as a pointer, and what instructions are allowed to operate on. And I would want a complete separation of operational details (where a process sits in memory, where its stack is stored and how) from the software that is running: sure, the OS is software, but there's no need to expose such details even to the OS if you provide the correct hardware functionality.

    That there is any way for a user or process to corrupt a stack is unacceptable if we're talking about security. Hardware needs to enforce that (i.e. locking out regions of memory, etc.). Relying on software, which changes constantly and is itself the source of the compromise, to then defend against that compromise is quite a ridiculous notion.

    There's no reason that processes even need to see actual real memory pointers either, or why they should ever be able to tweak them out of their given bounds, or certainly use them as a target to jump to for execution.

    In an era where we can compile raw, unchanged C to ECMAScript and have it work, there's no reason that we should still be using architectures that jump through all kinds of security hoops to boot but then leave everything up to the software trying to avoid touching itself.

    I'm even tempted to say (and I know someone will say that some System X does that, or that it would have overheads, or that it would make IPC difficult) that process management should be hardware-controlled and only software-directed. "Hey, processor, please spin up a new thread using the new code memory region I requested earlier, this code start-point, and these limits." From then on that code can't escape those limits, be they memory regions, hardware access, CPU usage, priority, or whatever. The controlling software cannot use or interfere with those regions from that point onwards, but it can request, say, process sleeping, prioritisation, communication, or termination (and only the parent process is able to do that, traced right back to the boot process).

    Every OS reinvents the wheel to spin up a process and control it, and when we get it wrong things like this happen where processes can escape or modify other processes and their memory.

    Why is any process able to see the contents of an address range containing even its own code, or anything other than an unexecutable data area starting at literally address zero (which is offset and bounds-checked by the hardware to translate it to the real hardware address)?

    I get the "legacy code" bit, but when we went 64-bit we should have just changed the way the entire thing works, or at least added it as a processor mode that can be switched into.

    • (Score: 2) by Virindi on Wednesday September 13 2017, @11:26AM (2 children)

      by Virindi (3484) on Wednesday September 13 2017, @11:26AM (#567175)

      process management should be hardware-controlled and only software-directed

      Hardware is not a magic bullet. The behavior you are asking for would be quite complex, and it would have vulnerabilities identical to those of an OS-controlled scheme. Except now that behavior is much less transparent and much harder to update.

      You are essentially asking for the OS to be implemented in CPU microcode. I, for one, have a problem with that on many levels! I'd rather the complex tasks of security on my system be visible to me, with hardware based operations which are as simple as possible. (Yes, the security model in x86 has gotten out of hand. But still it is not as bad as the scenario you suggest.)

      • (Score: 2) by ledow on Wednesday September 13 2017, @12:02PM (1 child)

        by ledow (5567) on Wednesday September 13 2017, @12:02PM (#567184) Homepage

        I'd rather the complex tasks of security be extremely visible.

        As in "process killed hard because it tried to do something it shouldn't be allowed to do, and hardware gave it zero choice but to immediately terminate." Not just "oh, just set the service to auto-restart silently in the background if it dies and make sure it runs as admin".

        Software has proven woefully inadequate at this. It's still possible to BSOD the latest OS. OS control basically consists of a user "asking" the OS to kill a process, which then goes through a complicated rigmarole of processing before deciding that the process should die, and then spends half its life cleaning up after it. I'd rather it just died. Literally... boom... not a single instruction more executed, its memory removed from its possession, every pending action or callback gone forever, and possibly the controlling process notified.

        I don't see how the hardware would have vulnerabilities such as this (others, sure). If the address space of the stack is LITERALLY NOT AVAILABLE to the process, no matter what it asks for, then that's the end-game.

        If you'd argued for it making debugging harder, yes, we'd need to put in debug modes into the processors for that.
        If you'd argued that it would affect performance - of course it would. But so does all the stack-tricks and ASLR and so on.
        If you'd argued that you want control of the microcode because of the processor snooping on things it shouldn't, I could agree, but that's a game we've already lost in the conventional OS world too.

        If the argument is "I can't see into my processes, or what stack was used, or how the processor set up that thread", then that's precisely the point, I feel.

        Too much is reliant on a bit of software that can be changed and overwritten detecting whether other bits of software have been changed or overwritten.

        • (Score: 2) by maxwell demon on Wednesday September 13 2017, @07:19PM

          by maxwell demon (1608) on Wednesday September 13 2017, @07:19PM (#567402) Journal

          As in "process killed hard because it tried to do something it shouldn't be allowed to do, and hardware gave it zero choice but to immediately terminate." Not just "oh, just set the service to auto-restart silently in the background if it dies and make sure it runs as admin".

          You obviously have no idea what you are talking about. When a process is "silently restarted," it is killed hard. Some other process then notices that it is gone and starts a new process running the same program. There is zero the hardware can do against this, except possibly prevent the OS from starting new processes, and hardware that did that would be 100% useless, as you couldn't do anything with it.
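
          A supervisor is nothing exotic, something like this sketch (the daemon path is hypothetical):

              #include <sys/types.h>
              #include <sys/wait.h>
              #include <unistd.h>

              /* The hardware/OS can kill the child as hard as it likes;
               * nothing stops this loop from starting a fresh copy. */
              int main(void) {
                  for (;;) {
                      pid_t pid = fork();
                      if (pid == 0) {          /* child: run the service */
                          execl("/usr/sbin/somed", "somed", (char *)NULL);
                          _exit(1);            /* exec failed */
                      }
                      if (pid < 0)
                          return 1;
                      waitpid(pid, NULL, 0);   /* wait for it to die... */
                      sleep(1);                /* ...then respawn it */
                  }
              }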

          --
          The Tao of math: The numbers you can count are not the real numbers.
    • (Score: 2) by Pino P on Wednesday September 13 2017, @05:33PM

      by Pino P (4721) on Wednesday September 13 2017, @05:33PM (#567312) Journal

      Why is any process able to see the contents of an address range containing even its own code

      I can think of three reasons.

      Program loader: The operating system process that loads application code into RAM has to see the code it's loading. In iOS, for example, this process is privileged and unique, and it verifies digital signatures.

      JIT engine: A process using just-in-time recompilation has to see the code it's building (see the sketch below). In iOS, for example, this process is privileged and unique: the only JIT recompiler allowed to run is the WebKit JavaScript engine.

      Literal pools: The ARM instruction set has a limited range for immediate values: an 8-bit value rotated by an even number of bits from 0 to 30. The workarounds are to build a large constant out of several immediate load and add instructions, or to load it from a literal pool [wikipedia.org] placed between functions and accessed using a PC-relative addressing mode. This may be impractical to avoid on 32-bit ARM, the instruction set of older versions of iOS. AArch64 changes this by allowing 1 MB offsets [arm.com], which in principle would allow literal pools to end up in MMU pages separate from the code.
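
      For the JIT case, the usual W^X discipline looks roughly like this illustrative x86-64 sketch: the process must see and write the code while generating it, and only then flips the page to read+execute:

          #include <string.h>
          #include <sys/mman.h>

          typedef int (*fn_t)(void);

          /* Emit "mov eax, 42; ret" into a fresh page: written while the
           * page is read+write, executed only after flipping it to
           * read+execute (never writable and executable at once). */
          static fn_t emit_return_42(void) {
              static const unsigned char code[] =
                  { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
              unsigned char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
              if (page == MAP_FAILED)
                  return NULL;
              memcpy(page, code, sizeof code);
              if (mprotect(page, 4096, PROT_READ | PROT_EXEC) != 0)
                  return NULL;
              return (fn_t)page;
          }
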
    • (Score: 2) by meustrus on Wednesday September 13 2017, @05:50PM (1 child)

      by meustrus (4961) on Wednesday September 13 2017, @05:50PM (#567330)

      Why is any process able to see the contents of an address range

      Because the available abstractions have historically never performed well enough for the few performance-critical applications. Bitmap processing is one area where going one pixel at a time through most libraries is really slow, and this has typically been solved by giving developers access to the memory underlying the bitmap.
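
      A toy version of that gap (all names hypothetical):

          #include <stddef.h>
          #include <stdint.h>

          struct bitmap { uint32_t *pixels; size_t w, h; };

          /* Through the abstraction: a call and a bounds check per pixel. */
          static void set_pixel(struct bitmap *b, size_t x, size_t y, uint32_t c) {
              if (x < b->w && y < b->h)
                  b->pixels[y * b->w + x] = c;
          }

          static void fill_via_api(struct bitmap *b, uint32_t c) {
              for (size_t y = 0; y < b->h; y++)
                  for (size_t x = 0; x < b->w; x++)
                      set_pixel(b, x, y, c);
          }

          /* With raw access to the underlying memory: one tight loop. */
          static void fill_raw(struct bitmap *b, uint32_t c) {
              uint32_t *p = b->pixels, *end = p + b->w * b->h;
              while (p < end)
                  *p++ = c;
          }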

      It probably isn't a law that abstractions must always perform worse. But we used to have direct access to the underlying hardware, and a lot of high-performance code was developed with that access. When a new abstraction performs poorly, the easy solution is to go back to the way it used to be, so that the high-performance code can be reused regardless of its lack of safety.

      I get the "legacy code" bit, but when we went 64-bit we should have just changed the way the entire thing works

      Intel tried to do just that: they looked at the x86 instruction set, saw cruft, and set about designing Itanium to fix the shortcomings discovered since x86 was created and extended. But designing things right takes time, and while Intel was busy with that exercise, AMD rushed a bolted-on 64-bit extension of x86 to market. No longer first to market, and incompatible with existing x86 machine code, Itanium was at a severe disadvantage by the time it came out.

      Market forces will always prevent us from having nice things. And the market prefers performance and backwards compatibility over correctness, even when "not correct" means "not secure". It's why Linux and Windows dominate despite many more correct alternatives existing.

      --
      If there isn't at least one reference or primary source, it's not +1 Informative. Maybe the underused +1 Interesting?
      • (Score: 2) by Wootery on Thursday September 14 2017, @08:18AM

        by Wootery (2341) on Thursday September 14 2017, @08:18AM (#567695)

        Market forces will always prevent us from having nice things.

        I don't know about that - ARM is doing ok. MIPS, POWER, SPARC, SuperH, Alpha... not so much. I hope there's a future for RISC-V.

        In the GPU world they're essentially free to re-architect their hardware every generation, as everything is JIT compiled.

    • (Score: 2) by maxwell demon on Wednesday September 13 2017, @07:12PM

      by maxwell demon (1608) on Wednesday September 13 2017, @07:12PM (#567394) Journal

      (sure, the OS is software, but there's no need to propagate such details to an OS, even, if you provide the correct hardware functionality).

      So you think hardware is never buggy? If an OS has a bug, I can just install an update. If hardware has a bug, well, bad luck.

      --
      The Tao of math: The numbers you can count are not the real numbers.