
posted by hubie on Friday March 10 2023, @05:29AM

There's never enough time or staff to scan code repositories:

Software dependencies, or a piece of software that an application requires to function, are notoriously difficult to manage and constitute a major software supply chain risk. If you're not aware of what's in your software supply chain, an upstream vulnerability in one of your dependencies can be fatal.

A simple React-based Web application can have upward of 1,700 transitive NodeJS "npm" dependencies, and after a few months "npm audit" will reveal that a relatively large number of those dependencies have security vulnerabilities. The case is similar for Python, Rust, and every other programming language with a package manager.

I like to think of dependencies as decaying fruit in the unrefrigerated section of the code grocer, especially npm packages, which are often written by unpaid developers who have little motivation to put in more than the bare minimum of effort. They're often written for personal use and they're open sourced by chance, not by choice. They're not written to last.

[...] Not all hope is lost. For known (reported and accepted) vulnerabilities, tools exist, such as pip-audit, which scans a developer's Python working environment for vulnerabilities. npm audit does the same for NodeJS packages. Similar tools exist for every major programming language and, in fact, Google recently released OSV-Scanner, which attempts to be a Swiss Army knife for software dependency vulnerabilities. Whether developers are encouraged (or forced) to run these audits regularly is beyond the scope of this analysis, as is whether they actually take action to remediate these known vulnerabilities.
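
To make that concrete, here is a minimal sketch (not from the article) of a CI-style gate around pip-audit. It assumes pip-audit is installed and that your version supports JSON output via "-f json", so check the flags and report layout against the release you actually have:

    # Sketch of a CI gate around pip-audit. Assumes pip-audit is installed and
    # that it supports JSON output ("-f json"); verify against your version.
    import json
    import subprocess
    import sys

    def audit_requirements(requirements_file="requirements.txt"):
        """Run pip-audit and return the number of vulnerable dependencies."""
        result = subprocess.run(
            ["pip-audit", "-r", requirements_file, "-f", "json"],
            capture_output=True,
            text=True,
        )
        data = json.loads(result.stdout or "{}")
        # Recent pip-audit releases wrap results in a "dependencies" key; older
        # ones emitted a bare list, so handle both.
        deps = data.get("dependencies", []) if isinstance(data, dict) else data
        vulnerable = [d for d in deps if d.get("vulns")]
        for dep in vulnerable:
            print(f"{dep['name']} {dep['version']}: {len(dep['vulns'])} known issue(s)")
        return len(vulnerable)

    if __name__ == "__main__":
        sys.exit(1 if audit_requirements() else 0)

OSV-Scanner can slot into the same kind of gate by pointing it at a lockfile instead of a requirements file.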

However, luckily for all of us, automated CI/CD tools like Dependabot exist to make these fixes as painless as possible. These tools continually scan your code repositories for out-of-date packages and automatically submit a pull request (PR) to fix them. Searching for "dependabot[bot]" or "renovate[bot]" on GitHub and filtering to active PRs yields millions of results! Still, weighing those roughly 3 million dependency fixes against the hundreds of millions of PRs active at any given time is not a quantification that can meaningfully be made outside of an in-depth analysis.
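
If you want to reproduce that rough count yourself, a small sketch against GitHub's public search API looks like the following. It is illustrative only: the totals fluctuate constantly, unauthenticated requests are rate-limited, and the "app/" author qualifiers are an assumption about how the bots are named on GitHub.

    # Rough count of open bot-authored PRs via GitHub's search API. Treat the
    # number as an estimate: it changes constantly and search results are capped.
    import requests

    def count_open_bot_prs(bot="app/dependabot"):
        resp = requests.get(
            "https://api.github.com/search/issues",
            params={"q": f"is:pr is:open author:{bot}", "per_page": 1},
            headers={"Accept": "application/vnd.github+json"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["total_count"]

    if __name__ == "__main__":
        for bot in ("app/dependabot", "app/renovate"):
            print(bot, count_open_bot_prs(bot))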

[...] Did you install your packages from the command line? If so, did you type them in properly? Now that you've installed your dependencies "correctly," did you verify that the code for each dependency does exactly what you think it does? Did you verify that each dependency was installed from the expected package repository? Did you ....
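
One lightweight mitigation for those "did you verify" questions is to record a digest when you review an artifact and check it before every install. Here is a sketch; the filename and hash below are placeholders, and pip's own --require-hashes mode does the same job end to end if you keep hashes in your requirements file:

    # Check a downloaded package artifact against the digest recorded at review
    # time. The expected hash and filename below are placeholders.
    import hashlib
    from pathlib import Path

    EXPECTED_SHA256 = "0123abcd..."  # placeholder: record the real digest when you review

    def sha256_of(path):
        digest = hashlib.sha256()
        with Path(path).open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def verify(artifact):
        actual = sha256_of(artifact)
        if actual != EXPECTED_SHA256:
            raise SystemExit(f"hash mismatch for {artifact}: got {actual}")
        print(f"{artifact} matches the reviewed digest")

    if __name__ == "__main__":
        verify("some_dependency-1.2.3-py3-none-any.whl")  # placeholder filename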

Probably not, and that's OK! It's inhumane to expect developers to do this for every single dependency. The best bet for software developers, software companies, and even individual tinkerers is to have some form of runtime protection/detection. Luckily for us all, detection and response tools created relatively recently now form a healthy and competitive ecosystem. Many of them, like Falco, Sysdig Open Source, and Osquery, even have free and open source components. Most even come with a default set of rules/protections.
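
To give a flavor of what runtime detection looks like in practice, here is a sketch that shells out to osquery's interactive shell to list which processes are listening on the network. It assumes osqueryi is installed and on the PATH and that it accepts a query with --json output, which current releases do:

    # Ask osquery which processes hold listening sockets. The table and column
    # names used in the query are standard osquery schema.
    import json
    import subprocess

    QUERY = (
        "SELECT p.name, p.path, lp.port "
        "FROM listening_ports lp JOIN processes p ON lp.pid = p.pid;"
    )

    def listening_processes():
        result = subprocess.run(
            ["osqueryi", "--json", QUERY],
            capture_output=True, text=True, check=True,
        )
        return json.loads(result.stdout)

    if __name__ == "__main__":
        for row in listening_processes():
            print(f"{row['name']} ({row['path']}) listening on port {row['port']}")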


Original Submission

 
This discussion was created by hubie (1068) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 4, Interesting) by Rosco P. Coltrane on Friday March 10 2023, @04:02PM

    by Rosco P. Coltrane (4757) on Friday March 10 2023, @04:02PM (#1295512)

    I code in Python for a living. The stuff I code always imports the barest minimum of modules, especially ones that aren't part of the core Python distribution.

    I only import stuff that really makes life easier. For instance, the ever-useful pySerial. Obviously I'm not gonna recode cross-platform serial routines.

    I tend to import widely used modules, or ultra-specialized modules distributed by the companies themselves. For example, Python bindings for Basler cameras distributed by Basler [pythonforthelab.com], or Python modules to talk to LabJack DAQ devices distributed by LabJack [github.com].

    For obscure modules that have added value but that I don't quite trust, I download them, review them, and then only use the reviewed copy from our intranet. For example, the PyGPD3303S module [pypi.org] to talk to GPD3303S USB power supplies, or the Prologix GPIB-to-Ethernet Python wrapper [github.com] to talk to Prologix GPIB Ethernet adapters: those two modules are obviously made by two nice dudes in their spare time, but I don't want to find out the hard way one day that their repos have been taken over by hackers.

    Finally, for really simple stuff, I just recode the functionality myself. For example, all of our networked products use a very small subset of IPv4. I'm not going to import the ipaddress module just to do a few simple calculations to figure out a broadcast address or a network address. I just made my own ultra-simple, ultra-limited IPv4 class that suits our needs, doesn't require any external import and loads faster. It's a 20-liner and I know exactly what it does.
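
    A minimal sketch of that kind of helper, for flavor (illustrative only; the class name and interface here are made up, not the commenter's actual code):

        # Tiny, import-free IPv4 helper: network and broadcast addresses only.
        # Illustrative sketch of the idea described above, not the real class.
        class TinyIPv4:
            def __init__(self, cidr):
                addr, prefix = cidr.split("/")
                self.prefix = int(prefix)
                o = [int(x) for x in addr.split(".")]
                self.addr = (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3]
                self.mask = (0xFFFFFFFF << (32 - self.prefix)) & 0xFFFFFFFF

            @staticmethod
            def _dotted(value):
                return ".".join(str((value >> s) & 0xFF) for s in (24, 16, 8, 0))

            def network(self):
                return self._dotted(self.addr & self.mask)

            def broadcast(self):
                return self._dotted(self.addr | (~self.mask & 0xFFFFFFFF))

        # TinyIPv4("192.168.1.17/24").network()   -> "192.168.1.0"
        # TinyIPv4("192.168.1.17/24").broadcast() -> "192.168.1.255"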

    Do that and you won't end up with "1,700 transitive NodeJS dependencies". I suspect those who depend on other people's work to that extent are code monkeys who are too lazy to implement simple things, or too dumb to do them themselves.
