Benchmarking Semgrep Community Edition Performance Improvements

Semgrep Community Edition (v1.124) boosts performance with a 3x speed enhancement in some use cases.

Jayson DeLancey

Ben Kettle

With each release, we ship new improvements to source-code scanning features, language coverage, framework support, integration options, and better security rules. We've had over twenty releases this calendar year, and the latest update (v1.124) includes performance improvements that can make Semgrep up to 3x faster than before under specific conditions.

We prioritize speed, not only as a general feature, but as a prerequisite to achieving our mission. If static analysis slows down development, teams won't adopt it. If it isn't adopted, vulnerabilities don't get caught. That is why speed matters to us, and it is part of why Semgrep is the preferred solution for security researchers, pentesters, consultants, application security engineers, open-source developers, and hobbyists alike.

This post dives into the (modest) performance improvement with additional context on which users will benefit. Semgrep Community Edition is licensed under the open-source LGPL 2.1. We appreciate not only customer feedback but open-source community contributions as well so that all can benefit.

Semgrep CE Releases a 3X Performance Improvement

Recent community reports suggested that Semgrep had room for improvement in rule-loading speed, particularly when using many small rule files. This prompted a closer look.

How Rule Loading Works with Semgrep

Rules define the semantic patterns for identifying security vulnerabilities. To provide customization for all types of users, there are multiple methods to pull in rules.

Single rule files: --config=/path/to/ruleset.yml
Multiple rule files: --config=/path/to/ruleset1.yml --config=/path/to/ruleset2.yml
Directory of rules: --config=/path/to/rule-dir/
Fetched from the registry: --config=auto
Curated from a rule set: --config=RULESET_ID

Each method has different performance characteristics when loading rules. We test across these permutations, often prioritizing the most common strategies, while ensuring generalizability.
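
As a rough illustration of how these options compose on the command line, the sketch below assembles a scan invocation from a list of rule sources. The helper name build_scan_command and the example paths are hypothetical. Only the --config flag comes from the list above, and the sketch assumes the semgrep scan subcommand is available in your installed version.

```python
from typing import List


def build_scan_command(configs: List[str], target: str = ".") -> List[str]:
    """Return an argv list that passes each rule source via its own --config flag."""
    if not configs:
        raise ValueError("at least one rule source is required")
    argv = ["semgrep", "scan"]
    for source in configs:
        # A source may be a rule file, a rule directory, "auto", or a ruleset ID.
        argv.append(f"--config={source}")
    argv.append(target)
    return argv


if __name__ == "__main__":
    # Hypothetical paths; replace them with your own rule locations.
    command = build_scan_command(["/path/to/ruleset1.yml", "/path/to/rule-dir/"])
    print(" ".join(command))
```

Returning a list rather than a single string makes it straightforward to hand the result to subprocess.run without shell quoting concerns.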

Thanks to community feedback, we confirmed a bottleneck: rule validation was single-threaded and I/O bound when processing a directory of many rules. We recommend using --config=auto, which downloads a single file, so this edge case rarely shows up. Although the usage pattern is uncommon, users who load rules from a directory of many files could see startup time reduced by up to 90 seconds. For small repositories, that can be a sizable gain (3x on average) and may help enable quick iterations during local development.
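
To make the general idea concrete, here is a minimal sketch of why a single-threaded, I/O-bound loop over many small files can become a bottleneck, and how overlapping the reads with a thread pool can help. This is not Semgrep's implementation. The functions read_rule, load_sequential, and load_parallel are hypothetical stand-ins, and the "validation" here is only a file read. On a fast local disk the difference may be small or absent.

```python
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List


def read_rule(path: Path) -> str:
    """Stand-in for loading one rule file; real validation would also parse it."""
    return path.read_text(encoding="utf-8")


def load_sequential(paths: List[Path]) -> Dict[str, str]:
    # One file at a time: each read waits for the previous one to finish.
    return {p.name: read_rule(p) for p in paths}


def load_parallel(paths: List[Path], workers: int = 8) -> Dict[str, str]:
    # Threads overlap the file reads, which helps when the work is I/O bound.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        contents = list(pool.map(read_rule, paths))
    return {p.name: text for p, text in zip(paths, contents)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(500):
            p = Path(tmp) / f"rule-{i}.yml"
            p.write_text(f"rules:\n  - id: example-{i}\n", encoding="utf-8")
            paths.append(p)
        for loader in (load_sequential, load_parallel):
            start = time.perf_counter()
            loaded = loader(paths)
            elapsed = time.perf_counter() - start
            print(f"{loader.__name__}: {len(loaded)} files in {elapsed:.4f}s")
```

Both loaders return the same mapping, so the parallel version can be swapped in without changing callers.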

[Figure: Semgrep Community Edition Performance Comparison, showing sensitivity to repository size]

How We Benchmark Performance

We measure the impact of changes across all our benchmarks (false positive rate, true positive rate, scan time, etc.). We include a large number of repositories (typically around 70,000) to avoid mistakes from oversampling or a small N. The benchmark set spans:

Multiple languages and frameworks
Varying file counts and repo sizes
Typical CI environments

While this latest performance improvement only affects an edge case exposed by smaller repositories, we believe it's a net win for the ecosystem. Faster scans that benefit some users are still an improvement.
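
The sensitivity to repository size follows from simple arithmetic: a fixed startup saving is a large share of a short scan and a small share of a long one. The sketch below computes per-repository speedup ratios and a summary. The function names and the timings in the example are made up for illustration and are not our benchmark data.

```python
from statistics import median
from typing import Dict, Tuple

# Maps a repository name to (seconds before, seconds after).
Timings = Dict[str, Tuple[float, float]]


def speedups(timings: Timings) -> Dict[str, float]:
    """Map each repository to its before/after ratio (above 1.0 means faster)."""
    ratios = {}
    for repo, (before_s, after_s) in timings.items():
        if before_s <= 0 or after_s <= 0:
            raise ValueError(f"non-positive timing for {repo}")
        ratios[repo] = before_s / after_s
    return ratios


def summarize(timings: Timings) -> Dict[str, float]:
    """Return the median, minimum, and maximum speedup across repositories."""
    ratios = sorted(speedups(timings).values())
    return {"median": median(ratios), "min": ratios[0], "max": ratios[-1]}


if __name__ == "__main__":
    # Invented numbers: the same 20-second startup saving on three scan lengths.
    example = {
        "small-repo": (30.0, 10.0),
        "medium-repo": (120.0, 100.0),
        "large-repo": (600.0, 580.0),
    }
    for repo, ratio in speedups(example).items():
        print(f"{repo}: {ratio:.2f}x")
    print(summarize(example))
```

Reporting the median alongside the extremes keeps a handful of very small repositories from dominating the headline number.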

What's Next: Experimental and Managed Scans

We’ve been working on multiple performance improvements across many different use cases. If you have performance needs, talk to us. We put changes through beta testing before we release them to all users and will on occasion make improvements available early through the --experimental flag. This allows early adopters to test out new behaviors or optimizations to ensure there are no regressions. We have some insanely fast stuff cooking but when running over 200 MILLION scans in a year we want to be deliberate.
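
If you try an experimental behavior, one way to check for regressions is to compare the findings from a default run against those from a run with --experimental. The sketch below assumes both runs were saved as JSON reports with a top-level "results" list whose entries carry "check_id", "path", and "start.line" fields. Treat those field names and the helper names as assumptions to verify against your own output.

```python
from typing import Any, Dict, Set, Tuple

# A finding is identified by (rule ID, file path, start line).
Finding = Tuple[str, str, int]


def finding_keys(report: Dict[str, Any]) -> Set[Finding]:
    """Collect an identifying key for every result in a parsed JSON report."""
    keys: Set[Finding] = set()
    for result in report.get("results", []):
        keys.add((result["check_id"], result["path"], result["start"]["line"]))
    return keys


def compare_reports(
    baseline: Dict[str, Any], experimental: Dict[str, Any]
) -> Dict[str, Set[Finding]]:
    """Return findings that disappeared or appeared in the experimental run."""
    base = finding_keys(baseline)
    exp = finding_keys(experimental)
    return {"missing": base - exp, "new": exp - base}


if __name__ == "__main__":
    # Hand-written reports standing in for two parsed JSON outputs.
    baseline = {"results": [
        {"check_id": "rule-a", "path": "app.py", "start": {"line": 10}},
        {"check_id": "rule-b", "path": "db.py", "start": {"line": 42}},
    ]}
    experimental = {"results": [
        {"check_id": "rule-a", "path": "app.py", "start": {"line": 10}},
    ]}
    print(compare_reports(baseline, experimental))
```

An empty "missing" set and an empty "new" set suggest the two runs agree on this codebase, which is the kind of signal that is useful to report back.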

For teams using Semgrep Managed Scanning, performance is not just about startup time. It’s about horizontal scalability. Managed Scans now support wide-scale parallelization across services, enabling continuous analysis across monorepos and complex repositories without degrading responsiveness.

We’re continuing to invest in scan performance wins across:

Our goal: security coverage without compromise.

We’re grateful to the users who report slowdowns, submit patches, or give feedback. Performance is a shared effort. We’re proud to keep improving it—together.

For full details on recent changes, see the Semgrep 1.124 release notes. To learn more about our history of performance improvements and how we evaluate performance trade-offs, take a look at the blog posts Security Scanning at Ludicrous Speed and Need for Speed: Static Analysis.
