Python static analysis comparison: Bandit vs Semgrep

by Grayson Hardaway and Clint Gibler on June 22, 2021

GitLab recently announced they are transitioning a majority of GitLab SAST analyzers to Semgrep! This transition begins with the phasing out of Bandit and ESLint (see their epic) in the GitLab 14.0 release in June 2021. As the maintainers of Semgrep, we want to compare Bandit and Semgrep to provide context for the switch.

This post covers:

  • Security coverage: What does each tool detect?
  • Custom rules: What do custom rules look like? And, following good engineering principles, how are they tested?
  • Performance: How fast is each tool?
  • Usage in CI/CD: How can each be run continuously?

For the curious, here’s a quick summary. More details are below!

                                | Bandit                                                 | Semgrep
Security coverage               |                                                        |
Built-in Python security rules  | 68                                                     | 166
Supported languages             | Python                                                 | Python, Go, Java, JavaScript, and more
Performance                     |                                                        |
Baseline performance            | ~3313.99 sloc/sec                                      | ~2274.48 sloc/sec
Multithreading                  | No                                                     | Yes
Usage                           |                                                        |
CLI invocation                  | > bandit -r .                                          | > semgrep -f <configuration>
Integration points: IDE         | VS Code, vim, Emacs                                    | VS Code, IntelliJ, vim
Integration points: CI/CD       | GitHub Actions, available in GitLab SAST, CircleCI Orb | GitHub Actions, GitLab CI/CD, sample configs for more
Custom rules                    |                                                        |
Rule language                   | NodeVisitor in the ast module                          | Looks like code + extra operators
Testing rules                   | test_functional.py                                     | > semgrep --test

Security coverage

Bandit (v1.7.1) ships with 68 security checks for Python. Semgrep doesn’t ship with rules itself; it is instead an engine for scanning code. However, Semgrep has access to a community-maintained registry with over 1,000 rules for many different languages. As of this writing, the Semgrep registry has 166 security rules for Python, giving coverage similar to Bandit and a bit more. The Semgrep registry also serves groups of rules called rulesets, including two that have similar coverage to Bandit: p/bandit and p/gitlab-bandit, which is maintained by GitLab.

GitLab aims to match the original Bandit findings identically. Bandit and Semgrep using p/gitlab-bandit report mostly the same findings, with a small number of differences. For this comparison we used the Zulip repository and checked to see if the findings reported by Bandit and p/gitlab-bandit were the same. (For the curious, the definition of “the same” is when both the starting line number and file path match.)

Zulip repository                     | > bandit -r . | > semgrep -f p/gitlab-bandit
No. of findings                      | 1019          | 1042
No. of findings unique to this tool  | 81            | 32

Findings unique to Bandit: ('B101', 17), ('B105', 17), ('B106', 2), ('B108', 5), ('B603', 28), ('B607', 8), ('B610', 4)
Findings unique to Semgrep with p/gitlab-bandit: ('gitlab.bandit.B610', 1), ('gitlab.bandit.B105', 6), ('gitlab.bandit.B110', 24), ('gitlab.bandit.B112', 1)

Let’s dive into a few differences, starting with B610 - django_extra_used. Bandit finds four instances of QuerySet.extra(...) that p/gitlab-bandit does not. One of these instances is copied below.

# https://github.com/zulip/zulip/blob/c1833b74f09b6295db4013a4215f9c01f3287af9/zerver/lib/users.py#L186
where_clause = "upper(zerver_userprofile.email::text) IN (SELECT upper(email) FROM unnest(%s) AS email)"
return query.select_related("realm").extra(where=[where_clause], params=(emails,))

The Semgrep rule expects there to be an objects property accessed as part of the chain. Bandit does not have this requirement, and therefore detects more instances of QuerySet.extra(...).

[... snipped ...]
- id: bandit.B610
  patterns:
  - pattern-either:
    - pattern: $X.objects.$FUNC(...).extra(...)
    - pattern: $X.objects.$FUNC(...).$FILTER(...).extra(...)
    - pattern: $X.objects.$FUNC(...).$FILTER(...).$UPDATE(...).extra(...)
[... snipped ...]

Interestingly, p/gitlab-bandit detects one instance of B610 that Bandit does not detect.

The good news is that it’s easy to extend Semgrep rules with additional definitions. See the Custom Rules section for more details, or visit the Semgrep docs.

Another example is B105 - hardcoded_password_string. Bandit detects lines such as VIDEO_ZOOM_CLIENT_SECRET = "client_secret" and BIG_BLUE_BUTTON_SECRET = "123" which p/gitlab-bandit does not detect. The rule uses metavariable-regex to constrain the left-hand side of the expression. However, the regex doesn’t match uppercase or mixed-case variants of “secret”, “token”, and the other lowercase keywords (only “password” is matched regardless of case).

[... snipped ...]
- id: bandit.B105
  patterns:
    - pattern-either:
      - pattern: $MASK == "..."
      - pattern: $MASK = "..."
    - metavariable-regex:
        metavariable: "$MASK"
        regex: "[^\\[]*([Pp][Aa][Ss][Ss][Ww][Oo][Rr][Dd]|pass|passwd|pwd|secret|token|secrete)[^\\]]*"
[... snipped ...]

By contrast, p/gitlab-bandit detects lines such as secrets_path = "zproject/dev-secrets.conf". Depending on your objectives, this could be desired - p/gitlab-bandit can detect things that Bandit doesn’t. On the other hand, you may consider this a false positive.
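To make the B105 difference concrete, the sketch below applies the regex from the rule excerpt above to the three variable names mentioned in this section. It assumes the regex is applied from the start of the metavariable's text, as Python's re.match does; that is an assumption about how Semgrep evaluates metavariable-regex. The helper name matching_names is made up for illustration, and the (?i) variant is one hypothetical adjustment, not a change adopted by p/gitlab-bandit.

```python
import re

# Regex copied from the bandit.B105 excerpt, after YAML unescaping ("\\[" becomes "\[").
B105_REGEX = r"[^\[]*([Pp][Aa][Ss][Ss][Ww][Oo][Rr][Dd]|pass|passwd|pwd|secret|token|secrete)[^\]]*"

# Left-hand-side names taken from the Zulip examples discussed above.
NAMES = ["VIDEO_ZOOM_CLIENT_SECRET", "BIG_BLUE_BUTTON_SECRET", "secrets_path"]


def matching_names(regex, names):
    """Return the names that the regex matches from the start of the string."""
    compiled = re.compile(regex)
    return [name for name in names if compiled.match(name)]


# The regex as written only matches the lowercase "secret" in secrets_path.
print(matching_names(B105_REGEX, NAMES))  # ['secrets_path']

# Hypothetical tweak: a case-insensitive flag also catches the uppercase names.
print(matching_names("(?i)" + B105_REGEX, NAMES))
```

Note that the case-insensitive variant still matches secrets_path, so it widens coverage without addressing the possible false positive.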

Bandit benefits from its years as the primary security scanning tool for Python and tends to report more “accurate” results: it reported more instances of QuerySet.extra(...) and did not report secrets_path. Semgrep benefits from the ability to rapidly make changes depending on your desired outcome: the missed instances above can be corrected with just a few extra lines in the rule. This rapid development is where Semgrep really shines. New rules are easy to write, and adjustments can be made quickly if there are any errors.

Custom rules

Bandit

Under the hood, Bandit uses a variant of the NodeVisitor paradigm exposed by Python’s ast module. Bandit rules are written with Python code using the Bandit API. To write a custom rule you can write a Bandit plugin. The API makes simple rules, such as checking for the presence of exec, easy to write. More complicated rules require understanding both the NodeVisitor paradigm and the data exposed by the Bandit API, such as this example which checks for jinja2 setups where automatic escaping is disabled. (This exposes apps to cross-site scripting (XSS) vulnerabilities.)
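For readers unfamiliar with the NodeVisitor paradigm, here is a minimal sketch that uses only Python's standard ast module to flag calls to exec. It is not written against the Bandit plugin API; the class and function names (ExecCallVisitor, find_exec_calls) are made up for illustration, and a real Bandit plugin would be registered and report issues through Bandit's own interfaces.

```python
import ast


class ExecCallVisitor(ast.NodeVisitor):
    """Collect the line numbers of direct calls to the built-in exec."""

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        # A direct call looks like Call(func=Name(id="exec"), ...).
        if isinstance(node.func, ast.Name) and node.func.id == "exec":
            self.findings.append(node.lineno)
        # Keep walking so nested calls are visited too.
        self.generic_visit(node)


def find_exec_calls(source):
    """Parse the source text and return the lines where exec is called."""
    visitor = ExecCallVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings


SAMPLE = 'x = 1\nexec("print(x)")\nprint("exec")\n'
print(find_exec_calls(SAMPLE))  # [2]
```

The string "exec" on the third line of the sample is not flagged, because the visitor inspects the syntax tree rather than the raw text.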

To learn more about writing Bandit plugins, check out this holistic article about securing your code with Bandit.

Bandit supports overriding settings for certain plugins via its configuration file. This lets you do things like select only the functions you want to flag while ignoring others.

Semgrep

Semgrep parses both code and search queries into an internal AST representation. This means that Semgrep queries (henceforth called “patterns”) look similar to the code that will be matched. For example, to detect the presence of exec, the Semgrep pattern is exec(...). The ellipsis is a Semgrep construct; you can read more about the Semgrep syntax in the documentation.

More sophisticated rules are expressed in a YAML file that composes multiple patterns together. Detecting jinja2 setups with disabled escaping in Semgrep can be expressed in a YAML file like the one shown below. The rule uses the pattern: clause to find all jinja2.Environment(...) constructions and the two pattern-not: clauses to filter out safe constructions of jinja2.Environment(...).

rules:
  - id: autoescape-disabled
    languages: [python]
    message: Detected a Jinja2 environment without autoescaping. Jinja2 does not
      autoescape by default. This is dangerous if you are rendering to a browser
      because this allows for cross-site scripting (XSS) attacks. If you are in
      a web context, enable autoescaping by setting 'autoescape=True.' You may
      also consider using 'jinja2.select_autoescape()' to only enable automatic
      escaping for certain file extensions.
    patterns:
      - pattern-not: jinja2.Environment(..., autoescape=True, ...)
      - pattern-not: jinja2.Environment(..., autoescape=jinja2.select_autoescape(...), ...)
      - pattern: jinja2.Environment(...)
    severity: WARNING

Testing rules

Just as writing code without tests is ill-advised, so is writing static analysis checks without tests to ensure they work as expected.

After all, you don’t want to think you’re finding and blocking certain bad code patterns, only to later learn your rule had some sort of subtle bug. Rule tests also provide valuable documentation, as they make it easy to quickly grok what code a rule is and isn’t supposed to flag.

Bandit

Bandit recommends creating test cases for rules in the examples/ directory, corresponding to the security rules in the bandit/plugins directory.

test_functional.py ensures that Bandit finds the appropriate number of Low, Medium, and High severity findings in the relevant examples/ file. An advantage to this approach is that any sample file can be used so long as the correct counts are known. A disadvantage of this approach is that it does not allow for precisely matching test cases, nor does it allow for writing test cases that don’t match (negative test cases).
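The limitation of count-based checks can be illustrated with a small sketch. This is not Bandit's test harness; the findings are hypothetical (line number, severity) pairs and severity_counts is a made-up helper. It only shows that a check on severity counts can pass even when the wrong lines are flagged.

```python
from collections import Counter


def severity_counts(findings):
    """Count findings per severity, ignoring where they were reported."""
    return Counter(severity for _line, severity in findings)


# Hypothetical findings as (line number, severity) pairs.
intended = [(3, "HIGH"), (7, "LOW")]
buggy = [(4, "HIGH"), (9, "LOW")]  # wrong lines, same severities

expected_counts = {"HIGH": 1, "LOW": 1}

# A count-based check cannot tell the two result sets apart...
count_check_passes = severity_counts(buggy) == expected_counts
# ...while a line-precise check can.
line_check_passes = buggy == intended

print(count_check_passes, line_check_passes)  # True False
```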

One can see # unsafe and # safe being used in xml_etree_celementtree.py, # this is not safe and # this is safe in paramiko_injection.py, miscellaneous comments in other files, and no comments at all in os-exec.py and other example files. These comments can guide a manual reviewer when precision is needed.

Semgrep

Semgrep supports creating unit tests for each rule by defining test cases in source code (e.g., my-rule.py) for each corresponding Semgrep YAML file (e.g., my-rule.yml).

You can then test that your patterns match the intended code by running semgrep --test. It’s possible to annotate lines you currently expect to match or not match, as well as lines you plan to have match in the future (for example, after you improve a rule).
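As a rough sketch of what such an annotated test file might look like, the file below targets a hypothetical rule with the id my-rule whose pattern is exec(...). The rule id, file name, and function names are assumptions made for illustration; the comment annotations (ruleid, ok, todoruleid) follow the conventions described in the Semgrep rule-testing docs, where each annotation sits on the line directly above the code it refers to.

```python
# my-rule.py: test cases for a hypothetical rule "my-rule" (pattern: exec(...))
import ast
import builtins


def run_snippet(snippet):
    # The rule is expected to flag the next line.
    # ruleid: my-rule
    exec(snippet)


def parse_literal(text):
    # The rule is expected NOT to flag the next line.
    # ok: my-rule
    return ast.literal_eval(text)


def run_indirectly(snippet):
    # Not flagged today, but a future version of the rule should catch it.
    # todoruleid: my-rule
    getattr(builtins, "exec")(snippet)
```

With my-rule.yml next to this file, semgrep --test would compare the rule's actual matches against these annotations.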

The docs include a worked example, and you can see many examples of rules and their unit tests at the official Semgrep rules GitHub repository: https://github.com/returntocorp/semgrep-rules.

Performance

This is a runtime test on four repositories using a 2017 MacBook Pro (3.1 GHz Quad-Core Intel i7). Runtime was measured as wall-clock time for an entire invocation of the command. For a better comparison, Semgrep was run in single-threaded mode because, at the time of this writing, Bandit does not support multi-threaded scans.

Semgrep (v0.55.1): > semgrep -j 1 --json -f p/gitlab-bandit
Bandit (v1.7.1): > bandit -f json -r .

Repository | Source lines of code (sloc) | Semgrep runtime (sec) | Semgrep sloc/sec | Bandit runtime (sec) | Bandit sloc/sec
Bandit     | 8270                        | 16.65                 | 496.85           | 3.43                 | 2411.08
Flask      | 10155                       | 16.50                 | 615.60           | 3.08                 | 3292.59
Zulip      | 326304                      | 61.58                 | 5298.69          | 67.71                | 4819.14
Django     | 296136                      | 110.22                | 2686.77          | 108.35               | 2733.17
Average    |                             |                       | 2274.48          |                      | 3313.99

Bandit is much faster on smaller repositories; it seems that Semgrep does a fair amount of setup before scanning. On large repositories, however, Bandit and Semgrep exhibit similar performance. If we run Semgrep with multithreading, scans are roughly 2x faster on large repositories (at least on this hardware, where Semgrep uses 8 jobs by default). Semgrep is still slower than Bandit on small repositories, likely because of the same setup overhead.

As the Semgrep maintainers, this was an interesting finding for us! We are looking into speeding this up.

Semgrep (multithreaded): > semgrep --json -f p/gitlab-bandit

Repository | sloc   | Runtime (sec) | sloc/sec
Bandit     | 8270   | 18.65         | 443.43
Flask      | 10155  | 17.90         | 567.32
Zulip      | 326304 | 33.61         | 9707.96
Django     | 296136 | 53.28         | 5558.11
Average    |        |               | 4069.20
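The multithreading speedup can be worked out directly from the two Semgrep runtime columns above. The short sketch below does that arithmetic; the variable names are ours, and the numbers are the wall-clock runtimes reported in the tables.

```python
# Semgrep wall-clock runtimes in seconds, copied from the tables above.
single_threaded = {"Bandit": 16.65, "Flask": 16.50, "Zulip": 61.58, "Django": 110.22}
multithreaded = {"Bandit": 18.65, "Flask": 17.90, "Zulip": 33.61, "Django": 53.28}

# Speedup > 1 means the multithreaded run finished faster.
speedup = {
    repo: round(single_threaded[repo] / multithreaded[repo], 2)
    for repo in single_threaded
}

print(speedup)  # {'Bandit': 0.89, 'Flask': 0.92, 'Zulip': 1.83, 'Django': 2.07}
```

The two small repositories were slightly slower with multithreading in these runs, while the two large ones sped up by roughly 1.8x to 2.1x.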

Usage

Integrations during development

Both Bandit (docs) and Semgrep (docs) can be run with pre-commit.

Bandit appears to be bundled into VS Code’s Python linters and can be enabled by setting python.linting.banditEnabled. There are also Bandit plugins for vim and Emacs.

Semgrep has a VS Code extension, IntelliJ IDEA plugin, and a vim plugin. See the extension docs for more details.

Integrations during CI/CD

As CLI tools, Bandit and Semgrep can be easily inserted into any build system that supports running arbitrary CLI tools (read: nearly all of them).

Bandit has a GitLab analyzer, though it is being deprecated in favor of Semgrep, as well as community-contributed configs for other CI providers.

Bandit has several community-contributed GitHub Actions that can scan only the changed files, write PR comments, or upload the results as a build artifact, but it’s unclear whether any single Action provides all of this functionality.

Semgrep’s officially supported GitHub Action can be configured to scan only the changed files or do a full repo scan, write PR comments, block the build (or let it pass), and upload results to GitHub’s Advanced Security tab in SARIF format for review within GitHub.

Semgrep also has example configurations for other CI providers, including GitLab, Buildkite, CircleCI, Jenkins, and more (docs).

Ignoring lines of code

Both Bandit (docs) and Semgrep (docs) support ignoring a result on a specific line of code, using # nosec in Bandit or # nosemgrep in Semgrep.

Semgrep also supports ignoring only specific rules on a given line of code, using # nosemgrep: rule-id-1, rule-id-2. This is useful, as you don’t want to accidentally ignore a legitimate finding beyond the one you’re intending to ignore.
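In source code, these inline comments might look like the sketch below. The function names and the rule id my-exec-rule are hypothetical; only the comment markers (# nosec, # nosemgrep, and # nosemgrep: followed by rule ids) come from the text above.

```python
def require_positive(value):
    # Bandit: suppress any finding reported on this line.
    assert value > 0  # nosec
    return value


def run_trusted_snippet(snippet):
    namespace = {}
    # Semgrep: suppress every rule on this line.
    exec(snippet, namespace)  # nosemgrep
    return namespace


def run_reviewed_snippet(snippet):
    namespace = {}
    # Semgrep: suppress only the named (hypothetical) rule; others still report.
    exec(snippet, namespace)  # nosemgrep: my-exec-rule
    return namespace
```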

Ignoring paths

Bandit supports a .bandit configuration file (example) that excludes arbitrary file paths or rules from being run. Semgrep similarly supports path-based excludes in a .semgrepignore file (docs). These files can be checked in to a repository to take effect.

Ignoring rules

Inside Bandit’s config file, you can tell Bandit to only run certain checks, skip certain checks, or override the settings of checks which support it (docs).

Semgrep’s rule configuration file lists out the entire rule definition. As such, you can mix-and-match rules inside a configuration file; if you want to disable a rule, you can save a configuration and remove the unwanted rules.

Other features

Semgrep is multilingual, supporting Python, JavaScript, Go, Ruby, and more, which means Semgrep can scan multi-language projects. Additionally, for any coverage that may be missing, Semgrep’s pattern syntax makes it easy to add new rules.

Semgrep understands certain language semantics. For example, in Python, Semgrep will resolve aliased imports to their original name. The pattern requests.get(...) will still match the code import requests as asdf; asdf.get(...). Other semantic features include detecting unordered keyword arguments (the order in which you write kwargs in a Semgrep pattern doesn’t matter) and constant propagation (which can determine if a literal value—a constant—has not been modified).
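As an illustration of these semantic features, consider the snippet below. Based on the behaviors just described, a hypothetical pattern such as hashlib.new("md5", ...) would be expected to match the call inside legacy_digest, even though the module is imported under an alias and the algorithm name is passed through a constant. The pattern, alias, and function name are our own assumptions, chosen so the example runs with only the standard library.

```python
import hashlib as hl  # aliased import: the pattern refers to "hashlib"

ALGORITHM = "md5"  # a constant that is never reassigned


def legacy_digest(data):
    # Textually this is hl.new(ALGORITHM, data), but with alias resolution and
    # constant propagation it is equivalent to hashlib.new("md5", data).
    return hl.new(ALGORITHM, data).hexdigest()


print(legacy_digest(b""))  # d41d8cd98f00b204e9800998ecf8427e
```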

Semgrep sports a number of experimental features, one of which is autofix. While limited in functionality, Semgrep’s autofix enables simple expressions to be fixed.

Bandit reports both a severity and a confidence rating on each of its findings. Semgrep only reports a severity.

Summary

                                | Bandit                                                 | Semgrep
Security coverage               |                                                        |
Built-in Python security rules  | 68                                                     | 166
Supported languages             | Python                                                 | Python, Go, Java, JavaScript, and more
Performance                     |                                                        |
Baseline performance            | ~3313.99 sloc/sec                                      | ~2274.48 sloc/sec
Multithreading                  | No                                                     | Yes
Usage                           |                                                        |
CLI invocation                  | > bandit -r .                                          | > semgrep -f <configuration>
Integration points: IDE         | VS Code, vim, Emacs                                    | VS Code, IntelliJ, vim
Integration points: CI/CD       | GitHub Actions, available in GitLab SAST, CircleCI Orb | GitHub Actions, GitLab CI/CD, sample configs for more
Custom rules                    |                                                        |
Rule language                   | NodeVisitor in the ast module                          | Looks like code + extra operators
Testing rules                   | test_functional.py                                     | > semgrep --test

Thanks for reading! We hope you found this helpful.

If there are any other aspects of the comparison that we should cover, or if we’re missing anything, please let us know!