For me, vulnerability research looks much the same as it did over a decade ago. The tools I use may have changed slightly, but the overall process is effectively the same: read through the source code, trace data flow, perform reachability analysis, use a debugger to verify behavior, and so on. Every vulnerability researcher has their own tools and methods that work for them. For some, this may be as simple as a text editor and grep; for others, an IDE or a debugger. As complexity increases, tools like IDEs and now language servers become helpful, providing basic built-in analysis features such as “go to definition” and “find uses”. These features help navigate the code more quickly, but in the end it’s still a very manual process.

Engineering and security teams have had access to static analysis (SAST) tooling for years; as a concept, it’s nothing new. However, these tools never really gained much adoption among vulnerability researchers. Historically they have been expensive and hampered by high false positive rates and a lack of flexibility in modeling complex vulnerability classes. They weren’t practical for hunting vulnerabilities and were typically only good for catching the most egregious mistakes. They were slow, fairly static in terms of what they could detect, and the sort of thing you’d set up to run as a nightly job or, at best, as part of your CI process before merging a branch.

Within the static analysis space, a new generation of tooling has become available. Unlike more traditional tooling, these are typically open source or free to use on open source software. Where you would previously deploy whatever the vendor gave you and maybe annotate the source code to suppress false positives, these tools are designed with extensibility in mind. They make it relatively easy for the end user to write their own queries or rules to detect anything from basic style mistakes to complex vulnerability conditions. They let the user model syntactic or semantic conditions, search a code base for matching code, and quickly iterate in real time to tune what is discovered.
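To give a sense of what “writing your own rules” can look like, here is a minimal sketch of a Semgrep rule. The rule id, message, and choice of target function are made up purely for illustration; it simply flags calls to `strcpy` so a reviewer can check whether the source length is bounded:

```yaml
rules:
  - id: potentially-unbounded-strcpy   # hypothetical rule id
    languages: [c]
    severity: WARNING
    message: strcpy() into a destination buffer; verify the source length is bounded
    # $DST and $SRC are metavariables that match any expression
    pattern: strcpy($DST, $SRC)
```

Rules like this start out as little more than structured grep, but the same syntax scales up to multi-pattern and taint-style rules, which is where the “model a vulnerability condition” idea comes in.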

While I don’t think these tools will ever replace manual analysis, I do think they open the door to new workflows and optimizations, just as IDEs, decompilers, and other tooling have. Going Beyond Grep is meant to provide a fairly real-time account of learning these tools and leveraging them to develop a workflow that makes it easier and faster to discover vulnerabilities at scale. I hope to share this progress, encourage others to explore this space, and learn from those already on this path.

While there is definitely some content out there covering this, a lot of it is produced by the vendors themselves. You typically see the success stories, but you don’t see the failures, the problems, and where the tools need improvement. So far, my interactions with these vendors have been fairly positive; they’ve been receptive and welcomed feedback. I hope this series can serve as an open line of communication to provide that feedback. Unfortunately, while this is a use case the vendors support, it definitely isn’t their primary focus. They primarily position these tools as SAST platforms for use in a CI workflow, and that’s how they’re sold and licensed. I hope that will start to change if and when more vulnerability researchers become interested.

There are a handful of these tools out there now, and it’s not unreasonable to expect more to be developed, targeting new niches and filling gaps as they’re discovered. For the duration of this blog we’ll stick primarily to CodeQL and Semgrep, as they are currently the most mature and appear to be the market leaders.

From a design perspective, there are two main differences worth calling out:

  1. Is a pre-analysis step required before queries/rules can be run?
  2. How are the queries/rules written?

Is one approach better than the other? I don’t think so. There are benefits and drawbacks to each, and really, I see the two approaches as complementary. We’ll definitely dive deeper into this in later posts.
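To make the first difference concrete, here is a rough sketch of how each tool is typically invoked from the command line (database names, paths, query directories, and rule files below are placeholders). CodeQL requires a pre-analysis step that builds a database from the code, which queries then run against; Semgrep matches its rules directly against the source tree:

```sh
# CodeQL: pre-analysis step builds a database (here by observing a C/C++ build),
# then queries are run against that database
codeql database create my-db --language=cpp --command="make"
codeql database analyze my-db my-queries/ --format=sarif-latest --output=results.sarif

# Semgrep: no database step; rules are matched directly against the source
semgrep --config my-rules/ src/
```

The database step costs time up front but gives CodeQL a much richer semantic model to query; skipping it is a big part of why Semgrep feels so fast to iterate with.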

In the end, over the course of exploring and documenting my progress here, I hope to answer two questions:

  1. Can I develop a workflow using tools like CodeQL and Semgrep that makes me a better, more efficient vulnerability researcher, and is that workflow worth the upfront cost?
  2. Is it practical, and is there value, in modeling a vulnerability generally and surveying a large number of targets for instances of it, rather than choosing a single target and looking for an exploitable vulnerability in it?