Visibility is half the battle when keeping your software projects secure, but the right tool can help you leave few vulnerability stones unturned.
This episode explores an open source software vulnerability scanner called CVE Binary Tool that searches binaries and component lists in your project and reports back known vulnerabilities based on data from NIST’s National Vulnerability Database (NVD) list of Common Vulnerabilities and Exposures (CVEs).
Our guest is Dr. Terri Oda, a security researcher at Intel and the lead maintainer of CVE Binary Tool. Co-host Chris Norman, Intel Open Source Evangelist, joins us to explore the inner workings of the project.
Katherine Druckman
What security problem does CVE Binary Tool address?
Dr. Terri Oda
In theory, it does the easiest part of security. It asks, 'Are there any known vulnerabilities in your product?' and then tells you. That's a lot easier, I hope, than figuring out whether there are unknown vulnerabilities, finding issues in code, or finding some weird way things can be combined in the actual implementation. Unfortunately, it turns out it's not that easy. Figuring out what components you're using, what vulnerabilities they have, which ones have been mitigated and which haven't, and which ones are completely spurious stuff that someone put in the database last week that's breaking all your builds is more challenging than you'd think. In theory, it just tells you if you have known vulnerabilities so you can do mitigations or update components and fix them.
Katherine Druckman
It doesn't address the vulnerabilities. It just tells you that you have them?
Dr. Terri Oda
It's similar to a virus scanner. It tells you if there is a virus there or not, and it does it in a similar way. It uses a bunch of signatures to figure out: does this look like GCC version whatever, or does this look like something else? It's an easier problem than virus scanning because virus writers try to hide what they are, and open source projects are generally very happily telling you, “I am Fubar 1.2.3.” It's right there in the code, so the signatures are much easier, but the concept is similar.
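To make the signature idea concrete, here is a minimal Python sketch of version detection by pattern matching. It is not CVE Binary Tool's actual code: the `SIGNATURES` table, the `detect_components` function, and the sample bytes are all hypothetical, and the patterns are illustrative rather than the project's real checker definitions.

```python
import re

# Hypothetical signatures: component name -> regex capturing a version string.
# Illustrative only; these are not the tool's real checker definitions.
SIGNATURES = {
    "openssl": re.compile(rb"OpenSSL (\d+\.\d+\.\d+[a-z]?)"),
    "zlib": re.compile(rb"deflate (\d+\.\d+\.\d+) Copyright"),
}


def detect_components(blob: bytes) -> dict:
    """Return {component: version} for every signature found in the bytes."""
    found = {}
    for name, pattern in SIGNATURES.items():
        match = pattern.search(blob)
        if match:
            found[name] = match.group(1).decode("ascii")
    return found


# A fake "binary": arbitrary bytes with an embedded version banner.
sample = b"\x7fELF\x00\x01junk\x00OpenSSL 1.0.1f 6 Jan 2014\x00more\xff"
print(detect_components(sample))  # {'openssl': '1.0.1f'}
```

Because the component announces itself in a readable banner, a plain regular expression over the raw bytes is enough here. A real scanner would also need to unpack archives and handle many more naming variations.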
Chris Norman
You had a good analogy about going through a cardboard box, pulling things out…
Dr. Terri Oda
I should have brought my cardboard box for this one, but yeah, [the tool approaches things] in a bunch of different ways. So, you can have no idea what's in your mystery [metaphorical] cardboard box, pull things out, read the serial numbers and figure out, ‘What is this thing?’ or ‘Did it get recalled?'. Or maybe you pull a spice mix out of the box, with a list of all the things in it and then drill down. A spice mix says it has cinnamon in it, but which cinnamon? That doesn't tell me where it came from and whether it’s been contaminated. The final way is more like a recipe book. A lot of open source projects provide a recipe for building. So, you need to install these things, then we need to do these things. We get that list and figure things out from the instructions, and, like a recipe, you're going to see eggs, but it doesn't say which specific brand of eggs or which specific brand of butter. We figure out what your computer would have gotten [from the instructions] and whether it’s safe or not.
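The "recipe book" approach can be sketched in a few lines of Python, assuming a requirements-style list with pinned versions. Everything named here is hypothetical: `KNOWN_ISSUES` stands in for real vulnerability data, `examplelib` and `EXAMPLE-0001` are made up, and the parser handles only the simplest `name==version` form.

```python
# Hypothetical advisory data: (component, version) -> made-up advisory IDs.
KNOWN_ISSUES = {
    ("examplelib", "1.2.3"): ["EXAMPLE-0001"],
}


def parse_requirements(text):
    """Parse 'name==version' lines; skip blanks, comments, and unpinned entries."""
    pinned = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue  # unpinned: the "eggs, but which brand?" case
        name, version = line.split("==", 1)
        pinned.append((name.strip().lower(), version.strip()))
    return pinned


def report(text):
    """Map each pinned requirement with known issues to its advisory IDs."""
    return {
        f"{name}=={version}": KNOWN_ISSUES[(name, version)]
        for name, version in parse_requirements(text)
        if (name, version) in KNOWN_ISSUES
    }


recipe = """
# build requirements
ExampleLib==1.2.3
otherlib==4.5.6
unpinned-lib
"""
print(report(recipe))  # {'examplelib==1.2.3': ['EXAMPLE-0001']}
```

The sketch simply skips the unpinned entry. Working out which version an installer would actually have fetched for it is the harder part that the recipe analogy describes.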
Katherine Druckman
The number of dependencies in any software application today is mind-boggling and growing. I read in a report published in the past year that, on average, the number is increasing several times over.
Dr. Terri Oda
We joke about it being the fault of JavaScript* because the average JavaScript project has like 200 components, but they’re all very small. They’re not nearly as large, whereas your average C project probably has like 10 or 20. And even with my Python* project I have to generate a new list of dependencies. I generate five of them, one for each version of Python, and it generates an extensive list of all the components that get updated every week. It’s not one and done. I didn't compile it, put it on a CD and ship it, and then have to decide whether I need to issue a recall. It’s an ever-growing, changing experience of software, unfortunately, for a lot of modern languages.
Katherine Druckman
NPM-managed projects can have a staggering number of dependencies, ballooning into the thousands. How does anybody juggle this? As a former maintainer of an open source project, I don't want to admit how unfamiliar I was with the sheer number of dependencies and their details.
Dr. Terri Oda
Yeah, we used to have a little tick box that required people to prove they had a reproducible build of their software. It was a requirement at Intel because it was so rare even there.
Someone, somewhere was required to document how things were built, but it's probably the norm that we don't have fully reproducible builds of any sort now with all those components. If you're tracking 400 components, of course you're not going to read the vulnerability list for each and every one of them, and that's why people often miss when something becomes vulnerable: it was a dependency of a dependency of a dependency. So, you trusted them to do it. Especially when reviewing JavaScript components for the ‘allow’ list at Intel, we found many contained subcomponents that were just being automated by bots.
For example, they made a little function with a little bot that upgraded the version periodically, but that was 80% of the commits in the last year. Obviously, you wonder if it was maintained. Someone wrote the bot, but the bot was the top contributor to all the products on NPM when we were investigating that to see how maintained they were and how much risk we had.
It doesn't help that the bot is named 'Gerald,’ so it looks like a human name. It's run under someone’s account.
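The "dependency of a dependency of a dependency" problem is easier to see with a small example. The Python sketch below walks a made-up dependency graph and records how each transitive dependency is reached; the package names and the `DEPENDS_ON` table are hypothetical and not taken from any real project.

```python
from collections import deque

# Hypothetical dependency graph: package -> direct dependencies.
DEPENDS_ON = {
    "my-app": ["web-framework", "logger"],
    "web-framework": ["template-engine"],
    "template-engine": ["string-padder"],
    "logger": [],
    "string-padder": [],
}


def dependency_paths(root, graph):
    """Breadth-first walk returning {package: path from root} for all transitive deps."""
    paths = {}
    queue = deque([(root, [root])])
    while queue:
        package, path = queue.popleft()
        for dep in graph.get(package, []):
            if dep not in paths and dep != root:
                paths[dep] = path + [dep]
                queue.append((dep, paths[dep]))
    return paths


paths = dependency_paths("my-app", DEPENDS_ON)
print(" -> ".join(paths["string-padder"]))
# my-app -> web-framework -> template-engine -> string-padder
```

Here `my-app` never lists `string-padder` directly, yet it ships with it. With hundreds of components, chains like this are why a vulnerability several levels down is easy to miss.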
Katherine Druckman
It's an issue for everybody at all skill levels. It's tough to keep track of this stuff, so enter the CVE Binary Tool. What’s the origin story?
Dr. Terri Oda
My boss's boss was really pissed about Heartbleed, and we had versions of OpenSSL* everywhere, so he wrote a little Python script that just told you what version of OpenSSL was used, and that remediated a bunch of errors.
Then we bought additional tools. After more remediation with those he said, 'Oh, hey, you have experience in open source project development. Even though we hired you as a security person, why don't you take this open source?’ We started the process, and the immediate response was, 'But what if it finds something? What if it finds something in us? We're not sure we fixed everything.' A few months later we were pretty sure, so it started a couple of years before the first open source release.
Since then, we've found a lot more: four or five more high-profile bash things and other stuff. That led to adding more and more stuff, and then we realized we couldn't just keep running this as a bunch of regular expressions; we needed to build a framework. By the time it went public, it was set up so you could say, 'Hey, if you want to write a new checker for something with a bunch of expressions and signatures, then you can just do it!' We hoped to put it out in the world so people could adjust it to scan for just what they wanted to scan. I admit I was terrified that no one was going to be interested, but I started with 10 checkers and, as of this morning, I think I merged commit number 257. We have a single contributor who has committed over 100.
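One way such a checker framework could be organized is sketched below in Python. This is a hypothetical design, not the project's actual API: the `Checker` base class, its `version_patterns` attribute, the two example subclasses, and the `scan` function are all assumptions made for illustration, and the patterns are simplified.

```python
import re


class Checker:
    """Base class: subclasses supply a name and version-capturing patterns."""

    name = ""
    version_patterns = []
    registry = []  # every subclass registers itself here

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Checker.registry.append(cls)

    @classmethod
    def get_version(cls, text):
        """Return the first captured version string, or None if nothing matches."""
        for pattern in cls.version_patterns:
            match = re.search(pattern, text)
            if match:
                return match.group(1)
        return None


# Writing a new checker is just declaring a name and some patterns.
class BashChecker(Checker):
    name = "bash"
    version_patterns = [r"GNU bash, version (\d+\.\d+\.\d+)"]


class OpensslChecker(Checker):
    name = "openssl"
    version_patterns = [r"OpenSSL (\d+\.\d+\.\d+[a-z]?)"]


def scan(text):
    """Run every registered checker over the text extracted from a file."""
    results = {}
    for checker in Checker.registry:
        version = checker.get_version(text)
        if version is not None:
            results[checker.name] = version
    return results


print(scan("... GNU bash, version 4.3.11(1)-release ..."))  # {'bash': '4.3.11'}
```

The point of the design is that contributors add a small subclass instead of editing a growing pile of regular expressions, which fits the growth from 10 checkers to hundreds described above.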
Katherine Druckman
Thank you, whoever you are!
Dr. Terri Oda
That's one of the problems. Because we're so careful about privacy, I have no idea how people use this, but I can tell from what we're getting that this person is probably scanning a bunch of embedded Linux* devices and routers. One of the test suites he uses is the OpenWrt* package list, so I hope we're fixing some networks in Europe somewhere.
Chris Norman
Sounds like you don’t have much visibility on how many people are actually using the tool?
Dr. Terri Oda
No, I only find out when it doesn't work for them!
Chris Norman
That could be a good sign, too. The bugs and the feature requests can give you an indication that people are actually using it.
For more of this conversation and others, subscribe to the Open at Intel podcast:
- Openatintel.podbean.com
- Google Podcasts
- Apple Podcasts
- Spotify
- Amazon Music
- Or your favorite podcast player (RSS).
About the Author
Katherine Druckman, an Intel Open Source Evangelist, is a host of podcasts Open at Intel, Reality 2.0 and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she's a long-time champion of open source and open standards.