Introducing Semgrep and r2c

by Isaac Evans on October 29, 2020

Free, fast, open-source, offline, customizable. These are not often words that describe code scanning tools, and that's a shame.

We founded r2c to bring world-class security tools to developers based on our conviction that software will run the most exciting parts of the future: everything from medical equipment to robots to autonomous cars. The security process should not be the foe but rather the enabler of rapid software development. If developers lack tooling that is easy to set up and understand, or if a developer has to convince their manager to spend a few million dollars on advanced security tools each time they change jobs, the future is bleak.

Before founding r2c, we worked on security and developer tools for large companies and governments. It was eye-opening to see that despite massive budgets, their security programs were generally a generation or more behind the tech giants. When it came to security tools for developers, most teams were jaded about scanning code for vulnerabilities; they hated the tools they had to use and usually ignored them beyond doing the minimum necessary to satisfy a compliance checkbox.

What about code scanning at places like Facebook, Apple, Amazon, Netflix, and Google? They don't generally use traditional commercial security tools which ask "how can we find every bug?" Instead, they focus on custom tooling that can build guardrails for developers. This doesn't require million-dollar tools, PhDs in program analysis, or days of compute time. It looks much more like unit tests for security.

We believe there is a gap between traditional compliance tools and simple linters that's ripe for a new approach, and we were fortunate to find partners from Redpoint Ventures and Sequoia Capital who agreed. With them, we raised a $13M Series A round of funding to build a security tool that developers might actually love. We've been working on it quietly for a while now, and we're finally ready to announce it to the world!

Semgrep

Semgrep, our open-source product, is specifically designed for eradicating bug classes. Developers and security engineers can say "this is the safe pattern we always use" (e.g., for parsing XML), write a rule in a few minutes, and enforce it on every editor save, commit, and pull request.

Semgrep is ideal for building security guardrails: start by using frameworks designed with security in mind, then automatically flag code that strays from the secure-by-default path. This is an approach used by Google, Facebook, Amazon, Dropbox, Stripe, Netflix, and others—a topic Clint Gibler and I presented on at Global AppSec 2020. This approach increases developer productivity, reduces attack surface, minimizes the areas for human inspection and audit, and allows the security team to scalably protect code written by thousands of developers.
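To make the guardrail idea concrete, here is a minimal sketch in plain Python of a "unit test for security". It is not Semgrep and not how Semgrep is implemented; it only illustrates the shape of a check that flags code straying from a secure default. The policy itself (XML must go through the defusedxml package, so direct imports of the standard-library xml package are flagged) and the names BANNED_PACKAGE and banned_imports are assumptions made for this example.

```python
import ast

# Hypothetical policy for illustration: XML is parsed via defusedxml,
# so direct imports of the standard-library "xml" package are flagged.
BANNED_PACKAGE = "xml"


def banned_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module name) for each import of the banned package."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Match "xml" and "xml.something", but not e.g. "xmlrpc".
            if name == BANNED_PACKAGE or name.startswith(BANNED_PACKAGE + "."):
                findings.append((node.lineno, name))
    return sorted(findings)


SAMPLE = """\
import xml.etree.ElementTree as ET
from defusedxml import ElementTree
from xml.dom import minidom
import xmlrpc.client
"""
```

Calling banned_imports(SAMPLE) reports lines 1 and 3 and leaves the defusedxml and xmlrpc imports alone. A check like this can run in CI and fail the build on any finding, much like a unit test.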

The idea behind Semgrep is simple: it feels like a regular search (grep) but is syntax-aware. You can learn Semgrep in a few minutes! And Semgrep can be used for more than just security issues: performance, internationalization, or just annoyances committed by accident.

Semgrep pattern example

The command $ semgrep -e foo(1) matches all equivalent variations of the call. See a live example of matching exec calls.
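The difference between a text search and a syntax-aware search can be sketched with Python's standard ast module. This is an illustration only, under the assumption that "equivalent variations" include differences in whitespace, line breaks, and comments; it is not Semgrep's matching engine, and the function name find_calls is made up for this example.

```python
import ast


def find_calls(source: str, func_name: str, arg_value: int) -> list[int]:
    """Return the starting line numbers of calls shaped like func_name(arg_value)."""
    matches = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
            and len(node.args) == 1
            and not node.keywords
            and isinstance(node.args[0], ast.Constant)
            and type(node.args[0].value) is int  # exclude True/False
            and node.args[0].value == arg_value
        ):
            matches.append(node.lineno)
    return sorted(matches)


SAMPLE = '''\
foo(1)
foo( 1 )
foo(
    1  # a comment inside the call
)
text = "foo(1)"
foo(2)
'''

# A plain text search finds the literal string twice: once in the real call
# on line 1 and once inside the string literal on line 6.
text_hits = SAMPLE.count("foo(1)")

# The syntax-aware search finds the three real calls and skips the string.
syntax_hits = find_calls(SAMPLE, "foo", 1)
```

Here the text search misses the reformatted calls and wrongly counts the string literal, while the tree-based search returns the three call sites on lines 1, 2, and 3.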

What's Next?

Semgrep started as an open-source project at Facebook, and we're lucky to have its original author, Yoann Padioleau, on our team at r2c. Since we released the first post-Facebook version (0.4) earlier this year, we've released 25 new versions, added support for 8 new languages, reworked the parsers so we could collaborate with GitHub on tree-sitter, been joined by thousands of enthusiastic GitHub followers, and seen over 100K pulls of the Semgrep Docker image.

Our roadmap contains more program analysis features to support the sorts of secure-by-default enforcement that large technology companies are already leveraging so heavily (constant propagation, taint tracking, and more), as well as support for many more languages.

Batteries Included

Along with this release of Semgrep, we're announcing the availability of Semgrep Community, a free, hosted service for managing Semgrep CI, as well as Semgrep Teams, a paid service that adds features for managing Semgrep that are useful for enterprises. Both of these offerings provide SaaS infrastructure for operating a modern AppSec program. They enable central definition of code standards for your projects and show results where you already work: GitHub, GitLab, Slack, Jira, VS Code, and more.

We're also excited that Semgrep Registry already has 900+ rules written by r2c and the community, so you can start running them on your project right now! Or, if you like to DIY, try writing your own.