
posted by takyon on Wednesday January 09 2019, @02:52PM
from the starving-programmers dept.

Bruce Schneier thinks the problem of finding software vulnerabilities is well-suited to machine-learning (ML) systems:

Going through code line by line is just the sort of tedious problem that computers excel at, if we can only teach them what a vulnerability looks like. There are challenges with that, of course, but there is already a healthy amount of academic literature on the topic -- and research is continuing. There's every reason to expect ML systems to get better at this as time goes on, and some reason to expect them to eventually become very good at it.

Finding vulnerabilities can benefit both attackers and defenders, but it's not a fair fight. When an attacker's ML system finds a vulnerability in software, the attacker can use it to compromise systems. When a defender's ML system finds the same vulnerability, he or she can try to patch the system or program network defenses to watch for and block code that tries to exploit it.

But when the same system is in the hands of a software developer who uses it to find the vulnerability before the software is ever released, the developer fixes it so it can never be used in the first place. The ML system will probably be part of his or her software design tools and will automatically find and fix vulnerabilities while the code is still in development.
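As a toy illustration of the "teach them what a vulnerability looks like" idea -- not from Schneier's article; the snippets, labels, and scikit-learn pipeline below are all invented for illustration -- training a classifier over code tokens might look like this:

    # Toy sketch: learn to flag "vulnerable-looking" code from labeled
    # snippets. Snippets and labels are invented; real systems would need
    # far richer features (data flow, taint, program graphs) and far more data.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    snippets = [
        'strcpy(buf, user_input);',                       # unbounded copy
        'strncpy(buf, user_input, sizeof(buf) - 1);',
        'query = "SELECT * FROM t WHERE id=" + user_id',  # SQL built by hand
        'cursor.execute("SELECT * FROM t WHERE id=%s", (user_id,))',
    ]
    labels = [1, 0, 1, 0]  # 1 = vulnerable, 0 = safe (hand-assigned)

    # Bag-of-tokens features: far too crude for real code, but enough
    # to show the shape of the pipeline.
    vec = CountVectorizer(token_pattern=r"[A-Za-z_]+")
    clf = LogisticRegression().fit(vec.fit_transform(snippets), labels)

    print(clf.predict(vec.transform(['memcpy(dst, src, attacker_len);'])))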


Original Submission

  • (Score: 0) by Anonymous Coward on Wednesday January 09 2019, @04:02PM (#784152)

    I've been quietly suggesting this for the past 3-4 years to a few friends who were into ML.

    It's the next big leap forward in cutting the cost of bug fixing and reverse engineering.

  • (Score: 3, Interesting) by choose another one (515) Subscriber Badge on Wednesday January 09 2019, @04:31PM (#784157) (2 children)

    Makes sense - ML is getting very good at learning games given only the rules and the "win" notification, see e.g. AlphaGo Zero.

    if we can only teach them what a vulnerability looks like

    This is where I slightly disagree (as I do with his space shuttle example*): teaching is not necessary. If (when) the system gets in, what it did tells you what a vulnerability looks like. ML will give us a more "intelligent", direct, and targeted version of fuzzing -- likely much faster, or at least more comprehensive for the same test time.

    But when the same system is in the hands of a software developer who uses it to find the vulnerability before the software is ever released, the developer fixes it so it can never be used in the first place. The ML system will probably be part of his or her software design tools and will automatically find and fix vulnerabilities while the code is still in development.

    There is another bit missing here: if the war is between the attackers and the developers, the attacker has massive financial rewards, while the developer, who finds bugs that then need fixing at the cost of time and missed deadlines, gets essentially no reward. That disparity will extend to purchasing power for ML tools -- until they become cheap as chips, it will be mostly attackers who can justify investing in them, and many, many devs will not have access. It is still going to be a very one-sided war for some time.

    [ *the shuttle wasn't an exception to prioritising fast and cheap over good; it simply failed to prioritise _any_ of those, arguably because it actually prioritised slow and expensive (maximum consumption of govt. money) over everything else. I think you could cover the entire development cost of the Falcon program, and a flight, for the cost of one "reusable" shuttle refurbishment. ]

    • (Score: 5, Insightful) by hopdevil (3356) on Wednesday January 09 2019, @05:25PM (#784189) (1 child)

      People tend to think ML will solve their problem, especially if they don't understand how ML works -- or even what their problem is. Sorry, Schneier, but this is out of your league.

      I know of several bug-finding "systems", some of which use ML algorithms... and they are all terrible. The false positive rate is through the roof, wasting developer time and forcing annoying coding practices just to avoid the bug reports. And they don't find the actual bugs.

      if we can only teach them what a vulnerability looks like

      If we can only teach humans what a vulnerability looks like. That is the first issue with applying ML to this (or any) problem: if you can't define the outcome in a statistically meaningful way, you will get garbage out. ML isn't magic; think of it as a messy statistics framework.
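      To put a number on the garbage-in point, here is a minimal sketch (synthetic data, scikit-learn assumed; nothing to do with any real bug-finding product): train the identical model twice, once on clean labels and once with 40% of the training labels flipped, and compare.

          # Toy demo: identical model and features; only label quality differs.
          import numpy as np
          from sklearn.datasets import make_classification
          from sklearn.linear_model import LogisticRegression
          from sklearn.model_selection import train_test_split

          X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
          X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

          clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

          rng = np.random.default_rng(0)
          y_noisy = y_tr.copy()
          flip = rng.random(len(y_noisy)) < 0.4  # mislabel 40% of the training set
          y_noisy[flip] = 1 - y_noisy[flip]
          noisy = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy)

          print("clean labels:", clean.score(X_te, y_te))
          print("noisy labels:", noisy.score(X_te, y_te))

      The noisy model's test score drops well below the clean one's, and that is on a trivially easy synthetic problem. When nobody can reliably say which code is vulnerable, your labels look like the flipped ones.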

      The second issue is that the "win condition" would need to be an actual win condition, and in pure static code analysis you cannot really get one -- there is no feedback loop. To get a true positive for a vulnerability, you need to have actually crossed some protected boundary. What is that boundary? Usually it is unclear or subtle at best, even to the developer who wrote the code.

      As an example, think about doing what Schneier is suggesting. You would need to compile each block of code independently (and together) and test each variable. That sounds like fuzzing, which technically works, except where it doesn't. Maybe the ML can help the fuzzer by tracing which code gets executed and which input values are needed to reach more lines of code, but then you aren't using ML to find vulnerabilities so much as throwing shit until something sticks. You will miss a lot, and ML algorithms are computationally very expensive.
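      For what it's worth, that coverage-feedback loop fits in a few lines (the target function, mutation rule, and iteration budget below are all made up; real fuzzers like AFL do this at scale, in-process and compiled):

          # Toy coverage-guided fuzzer: keep any input that reaches new lines,
          # stop when the target "crashes". Everything here is illustrative.
          import random
          import sys

          def target(data: bytes):
              # Hypothetical parser with a bug buried behind nested checks.
              if len(data) > 3 and data[0] == 0x7F:
                  if data[1] == ord("E"):
                      if data[2] == ord("L"):
                          raise RuntimeError("boom")  # stand-in for a real crash

          def run_with_coverage(data):
              lines = set()
              def tracer(frame, event, arg):
                  if event == "line":
                      lines.add(frame.f_lineno)
                  return tracer
              sys.settrace(tracer)
              try:
                  target(data)
                  crashed = False
              except RuntimeError:
                  crashed = True
              finally:
                  sys.settrace(None)
              return lines, crashed

          corpus, seen = [b"AAAA"], set()
          for i in range(50000):  # stochastic: the budget is a guess
              data = bytearray(random.choice(corpus))
              data[random.randrange(len(data))] = random.randrange(256)  # one-byte mutation
              lines, crashed = run_with_coverage(bytes(data))
              if crashed:
                  print(f"crash after {i} runs on input {bytes(data).hex()}")
                  break
              if lines - seen:  # reached new code: keep this input around
                  seen |= lines
                  corpus.append(bytes(data))

      Note that the "intelligence" here is just a greedy keep-if-new-coverage rule; replacing that rule and the random mutations with learned models is where the ML research effort actually goes.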

      Microsoft has probably made the best progress thus far, but their solution requires a crazy amount of tweaking and isn't doing just static code analysis. They have published some papers, but for the most part the technology is just not there yet.

      Also, these guys came pretty close: https://www.darpa.mil/program/cyber-grand-challenge [darpa.mil]

      Note: I am a security researcher. I have worked with many people and systems attempting to wield this sorcery.

      • (Score: 0) by Anonymous Coward on Wednesday January 09 2019, @08:34PM (#784270)

        We did pretty darn well in the Cyber Grand Challenge. Our code would find bugs, then exploit them for attack while patching them for defense. It really works.

        That said, there is so much variety in the real world that humans aren't going away any time soon. We still hire lots of people to manually go over disassembled binary executables and crash dumps. Email me at users.sf.net, account name albert, if you are a US citizen and want to do that. There is no shortage of need for people with the low-level skills to make sense of binary blobs and register state.

  • (Score: 3, Insightful) by fyngyrz (6567) on Wednesday January 09 2019, @04:52PM (#784168) Journal (7 children)

    the ML system will probably be part of his or her software design tools and will automatically find and fix vulnerabilities while the code is still in development.

    I don't want something to go in and "fix" things; I'm very happy to have something point them out, but I want to do (or at least confirm) the fixes myself so I know exactly how they integrate (or don't) with what I was trying to accomplish. Not to mention learning to anticipate them and not cause them in the first place. Having such a tool is obviously valuable. Depending on it seems like a recipe for disaster to me.

    --
    We should start referring to "age" as "levels."
    So when you're LVL 80, you're awesome.

    • (Score: 2) by Runaway1956 (2926) Subscriber Badge on Wednesday January 09 2019, @04:58PM (#784173) Journal

      I'm not even a developer, but I can see your point and agree with it. You've built it, tested it, and turned the ML loose on it. It finds a "vulnerability", which it fixes - and your software no longer does the magic it was designed to do. Yeah, you want to confirm: maybe let the ML "fix" it in a sandbox, test the "fix", and see HOW it "fixed" things. If you don't like what the ML did, you can fall back and try to fix it yourself. It's great to have help, but you can't just let the ML take over development, no matter how bad the exploit.

    • (Score: 2) by J_Darnley (5679) on Wednesday January 09 2019, @05:42PM (#784200) (1 child)

      You have a bug in a format parser, so clearly the way to fix the bug is to remove the format parser.

      I swear it was just yesterday that I was reading about an ML tool that was hiding information in plain sight by encoding it in high-frequency detail. That might have been last week, but I'm sure it was yesterday that some other "AI" was caught cheating.

      • (Score: 2) by fyngyrz (6567) on Wednesday January 09 2019, @11:09PM (#784327) Journal

        I swear it was just yesterday that I was reading about an ML tool that was hiding information in plain sight by encoding it in high frequency detail.

        Are you thinking of this? [techcrunch.com]

        --
        Surely not everybody was kung fu fighting?

    • (Score: 2) by Thexalon (636) on Wednesday January 09 2019, @07:20PM (#784235) (2 children)

      Among other reasons, you now have an easy way to intentionally introduce backdoors into all kinds of software: Compromise the ML auto-fix system.

      --
      The only thing that stops a bad guy with a compiler is a good guy with a compiler.

      • (Score: 2) by maxwell demon (1608) on Wednesday January 09 2019, @08:54PM (#784272) Journal (1 child)

        The point where it really gets interesting is when the ML program is allowed to fix its own code …

        --
        The Tao of math: The numbers you can count are not the real numbers.

        • (Score: 2) by Thexalon (636) on Wednesday January 09 2019, @09:34PM (#784285)

          Then we're starting to get into Reflections on Trusting Trust [acm.org] territory.

          --
          The only thing that stops a bad guy with a compiler is a good guy with a compiler.

    • (Score: 1) by dkman (4462) on Thursday January 10 2019, @05:57PM (#784589)

      Yeah, I'm very happy if it can check the code and identify points of risk or bugs. I'm happy if it can suggest a fix. But I'm very unhappy if it goes injecting its own fix. I'm the one who has to maintain that code, and I want to be able to read it. Two years from now, when I come across some "WTF is this?" code that isn't commented and isn't written in my style, I'm not going to be happy about it.

  • (Score: 2) by DannyB (5839) Subscriber Badge on Wednesday January 09 2019, @09:25PM (#784280) Journal

    Machine Learning that identifies patterns unique to tech-illiterate, gullible, naive, and highly exploitable USERS.

    Wouldn't those targets be just as valuable as software vulnerabilities, perhaps more so?

    You can protect against an implementation error or a design flaw. But can you really protect against an idiot? (Yes. Yes. I said yes. Oops, I didn't mean to delete that! It must be the vendor's fault! Blame Canada! Etc.)

    By looking at enough social media data, it might be possible to spot (1) suckers who can be conned out of money sent to Nigeria, and (2) walking security exploits who will send their password to "the IT guy" who called to help them fix a problem they didn't know existed.

    --
    To transfer files: right-click on file, pick Copy. Unplug mouse, plug mouse into other computer. Right-click, paste.

  • (Score: 0) by Anonymous Coward on Thursday January 10 2019, @02:41AM (#784427)

    In the end, the practical solution to bugs is to approach from the other end: write perfect code, simple as that. The problem with the traditional poke'n'hope approach is that you can prove a bug exists, but you can never prove by testing that bugs don't exist...

    Yes, doing it the formal math way will be expensive, but it will be GOOD. No more software fuckups, ever. Those tend to be expensive as well, and they occur at inconvenient times... Having said that, the hardware fuckups will be with us for all eternity... :)

    If you want to throw in machine learning, teach the box to do formal proofs. The how part is left as an exercise for the reader.
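    For a taste of what "teach the box to do formal proofs" looks like with today's tooling, here is a minimal sketch using the Z3 SMT solver's Python bindings (the guard and the property are invented examples): ask whether an overflow check is actually sound, and get back either a proof or a counterexample.

        # Toy "prove it, don't test it": does this guard really guarantee
        # that a + b cannot overflow a signed 32-bit int?
        from z3 import And, BitVec, BVAddNoOverflow, Not, Solver, sat

        a, b = BitVec("a", 32), BitVec("b", 32)

        guard = And(a >= 0, b >= 0, a <= 0x7FFFFFFF - b)  # the programmer's check
        safe = BVAddNoOverflow(a, b, signed=True)         # "a + b does not overflow"

        s = Solver()
        s.add(guard, Not(safe))  # search for: guard passes, yet overflow happens
        if s.check() == sat:
            print("bug! counterexample:", s.model())
        else:
            print("proved: no overflow whenever the guard holds")

    Unlike testing, an "unsat" answer here covers all 2^64 input pairs at once. Scaling that from one guard to a whole program is the expensive part.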
