Open a terminal in any modern software project and count the dependencies. A typical Node.js application pulls in between 500 and 1,500 packages. A Python project with a handful of direct dependencies can resolve into hundreds of transitive ones. A Java application with 20 declared dependencies in its POM file might load 200 JARs at runtime. Every one of those packages is code you did not write and did not review, yet it runs with the same privileges as your own application.

That is the software supply chain problem. You are not just trusting the libraries you chose. You are trusting every library those libraries chose, and every library those libraries chose, all the way down. A single compromised package anywhere in that tree can exfiltrate secrets, install backdoors, or destroy data. And attackers know this.


The software supply chain problem

Software has always relied on shared components. What has changed is the scale, the speed, and the trust model. In the 1990s, a developer might include a handful of third-party libraries, each one carefully evaluated and downloaded from a vendor's website. Today, a single npm install or pip install command can pull hundreds of packages from a public registry in seconds, with no human review at any step.

Transitive dependencies and the trust tree

The most dangerous aspect of modern dependency management is transitivity. You explicitly choose to depend on Package A. Package A depends on Packages B, C, and D. Package B depends on Packages E and F. Package E depends on Packages G, H, and I. You evaluated Package A. You probably did not evaluate Packages B through I. But all of them are running in your application with full access to your environment variables, filesystem, and network.
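The shape of that tree is easy to inspect mechanically. As a minimal sketch in Python, against a hypothetical miniature npm lockfile (v2/v3 format), counting every resolved package shows three installed packages where only one was chosen:

```python
import json

def count_locked_packages(lockfile_text: str) -> int:
    """Count every resolved package in an npm lockfile (v2/v3 format).

    The "packages" map holds one entry per installed package path,
    plus an empty-string key for the root project itself.
    """
    packages = json.loads(lockfile_text).get("packages", {})
    return sum(1 for path in packages if path != "")

# Hypothetical miniature lockfile: one direct dependency (a) that
# pulls in two transitive ones (b and c).
sample = json.dumps({
    "name": "demo-app",
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "demo-app"},
        "node_modules/a": {"version": "1.0.0"},
        "node_modules/b": {"version": "2.1.0"},
        "node_modules/c": {"version": "0.3.2"},
    },
})

print(count_locked_packages(sample))  # → 3
```

Running the same count against a real project's lockfile is usually the fastest way to make the direct-versus-transitive gap concrete for a team.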

The event-stream incident in 2018 demonstrated this perfectly. A widely-used npm package called event-stream had been handed off to a new maintainer who injected malicious code, not into event-stream itself, but into a new transitive dependency called flatmap-stream. The malicious code specifically targeted the Copay Bitcoin wallet application, stealing cryptocurrency private keys. Millions of downstream applications pulled in the compromised code through their dependency trees without knowing it existed.

Trust chains and package registries

Public package registries like npm, PyPI, Maven Central, and RubyGems operate on a fundamentally trust-based model. Anyone can create an account and publish a package. There is no mandatory code review, no security audit, and in most cases no identity verification beyond an email address. The registry's job is to host packages and resolve dependencies, not to verify that those packages are safe.

This means that when you run npm install express, you are trusting that the npm account that published the express package has not been compromised. You are trusting that every maintainer of every transitive dependency of express has not been compromised. You are trusting that the registry itself has not been compromised. And you are trusting that no one has inserted a package with a confusingly similar name into the registry to trick your build system into pulling the wrong thing.

Each of those trust assumptions has been violated in real-world attacks.


Notable supply chain attacks

Supply chain attacks are not theoretical. They have affected some of the largest and most security-conscious organizations in the world. Understanding how these attacks worked is essential for building defenses against them.

SolarWinds (2020)

The SolarWinds attack remains the most consequential software supply chain compromise to date. Attackers, attributed to Russian intelligence services, compromised the build system for SolarWinds' Orion IT monitoring platform. They inserted a backdoor into a routine software update that was then digitally signed by SolarWinds and distributed to approximately 18,000 customers, including the U.S. Treasury, the Department of Homeland Security, and major corporations.

The backdoor, known as SUNBURST, was sophisticated. It lay dormant for two weeks after installation before activating. It communicated with command-and-control servers using DNS queries disguised as legitimate Orion traffic. It checked for the presence of security tools before executing. And it was delivered through a legitimate, signed software update, meaning every security control designed to verify software authenticity treated it as trusted.

The lesson: Compromising the build pipeline means compromising every customer downstream. Digital signatures prove provenance, not safety. If the build system is compromised, the signature is legitimate but the code is not.

Codecov (2021)

Codecov, a code coverage tool used in CI/CD pipelines, had its Bash Uploader script modified by attackers. The modified script exfiltrated environment variables, including CI/CD secrets, API tokens, and credentials, from every build pipeline that used it. Because Codecov was integrated into CI/CD workflows with access to source code and deployment credentials, the blast radius was enormous. Hundreds of companies were affected, including Twitch, HashiCorp, and other technology firms.

The lesson: CI/CD pipelines are high-value targets because they have access to production secrets. Third-party tools integrated into build pipelines need the same security scrutiny as production dependencies.

ua-parser-js (2021)

The ua-parser-js npm package, downloaded over 7 million times per week, was compromised when an attacker gained access to the maintainer's npm account. Three malicious versions were published that installed a cryptocurrency miner and a password stealer on Linux and Windows systems. The attack lasted only a few hours before being detected, but the package's download volume meant millions of builds were potentially affected.

The lesson: A single compromised maintainer account can affect millions of downstream users. The most popular packages are the highest-value targets.

event-stream (2018)

A social engineering attack in slow motion. The original maintainer of the event-stream npm package, burned out and no longer interested in maintaining it, handed off ownership to a new contributor who had been helpfully submitting pull requests. The new maintainer added a dependency called flatmap-stream containing obfuscated malicious code that targeted the Copay Bitcoin wallet. The attack was discovered only when another developer noticed the obfuscated code and investigated.

The lesson: Maintainer succession is an attack vector. Social engineering can target not just users but open source maintainers, who are often overworked volunteers with no security training.

Log4Shell (2021)

CVE-2021-44228, known as Log4Shell, was not a supply chain attack in the traditional sense. It was a critical remote code execution vulnerability in Apache Log4j, a Java logging library used in virtually every Java application. The vulnerability allowed attackers to execute arbitrary code by simply causing the application to log a specially crafted string. Because Log4j was so ubiquitous, the vulnerability affected hundreds of millions of devices and applications worldwide.

The lesson: A vulnerability in a single deeply-embedded dependency can create a global security crisis. Most organizations could not even determine whether they were affected because they did not have an inventory of their transitive dependencies.


Attack vectors: how supply chains get compromised

Understanding the attack surface requires knowing the specific techniques attackers use to inject malicious code into supply chains. These are not theoretical; each one has been used in real attacks.

Typosquatting

Attackers publish packages with names that are slight misspellings of popular packages: coffe-script instead of coffee-script, crossenv instead of cross-env, lodahs instead of lodash. A developer who makes a typo in their package.json or requirements.txt installs the malicious package instead of the legitimate one. The malicious package typically includes the legitimate package's functionality (by depending on it) plus a payload that exfiltrates environment variables or installs a backdoor.

Typosquatting is alarmingly effective. Research has repeatedly shown that typosquatted packages on npm and PyPI receive thousands of downloads before being detected and removed. Some attackers register dozens or hundreds of typosquat variants for popular packages, casting a wide net.
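Name proximity can be flagged mechanically. A minimal Python sketch, using the standard library's difflib against an illustrative allowlist of popular names, catches requested packages that nearly match a well-known one without equaling it:

```python
import difflib

# Illustrative allowlist; a real check would use the registry's most-downloaded packages.
POPULAR = {"lodash", "express", "cross-env", "coffee-script", "react"}

def typosquat_suspects(requested: str, threshold: float = 0.8) -> list:
    """Flag popular names that `requested` nearly matches without equaling."""
    if requested in POPULAR:
        return []  # exact match: the real package
    return [
        name for name in POPULAR
        if difflib.SequenceMatcher(None, requested, name).ratio() >= threshold
    ]

print(typosquat_suspects("lodahs"))    # → ['lodash']
print(typosquat_suspects("crossenv"))  # → ['cross-env']
print(typosquat_suspects("lodash"))    # → []
```

A check like this can run in CI against every new entry in the manifest, turning a silent typo into a failed build.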

Dependency confusion

Dependency confusion, also known as namespace confusion, exploits the way package managers resolve packages when both public and private registries are configured. Many organizations use internal package names for their private libraries. If an attacker publishes a package with the same name on the public registry with a higher version number, the package manager may prefer the public version over the private one.

Security researcher Alex Birsan demonstrated this in 2021, successfully injecting code into the build systems of Apple, Microsoft, and dozens of other companies by publishing public npm, PyPI, and RubyGems packages that matched the names of their internal packages. The package managers resolved the public packages because they had higher version numbers.
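The resolution logic that makes this attack work fits in a few lines. The package names below are hypothetical; the point is that a naive resolver preferring the highest version number will pick the attacker's public decoy over the internal original:

```python
def confusion_risks(internal: dict, public: dict) -> list:
    """Internal package names that also exist publicly with a higher
    version, so a naive resolver would fetch the attacker's copy."""
    def ver(v):  # "1.4.0" -> (1, 4, 0)
        return tuple(int(part) for part in v.split("."))

    return [
        name for name, internal_ver in internal.items()
        if name in public and ver(public[name]) > ver(internal_ver)
    ]

# Hypothetical internal packages and a public registry snapshot in
# which an attacker has published a high-versioned decoy.
internal = {"acme-auth-client": "1.4.0", "acme-billing": "2.0.1"}
public = {"acme-auth-client": "99.0.0"}

print(confusion_risks(internal, public))  # → ['acme-auth-client']
```

The absurdly high version number is not an exaggeration: Birsan's proof-of-concept packages used exactly this trick to guarantee they would win resolution.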

Compromised maintainer accounts

Many high-impact npm, PyPI, and RubyGems packages are maintained by individual developers using personal accounts with weak or reused passwords and no two-factor authentication. Compromising a single account gives the attacker the ability to publish new versions of every package that account controls. The ua-parser-js, coa, and rc incidents all involved compromised maintainer accounts.

The scale of this risk is staggering. Research has found that on npm, the top 20 most-depended-upon packages are controlled by accounts that, collectively, if compromised, would give an attacker access to over 50% of the entire npm ecosystem through transitive dependencies.

Malicious updates to legitimate packages

Even without compromising an account, attackers can gain publish access through social engineering, as in the event-stream case. They contribute helpful patches to build trust, then request and receive maintainer access. Alternatively, they may purchase or otherwise acquire abandoned packages that still have significant download numbers and inject malicious code into a new version.

This attack vector is particularly difficult to defend against because the malicious update comes from a legitimate source. The package name is correct, the publisher account is authorized, and the version number is higher than the previous release. Every automated system treats it as a routine update.


Registry-specific risks

Each package ecosystem has its own characteristics, security controls, and vulnerability patterns. Understanding the specific risks of the registries your organization depends on is important for building effective defenses.

| Registry | Ecosystem | Key Risks | Notable Controls |
| --- | --- | --- | --- |
| npm | JavaScript / Node.js | Massive dependency trees; install scripts execute arbitrary code; typosquatting rampant | 2FA for maintainers, npm audit, provenance attestations |
| PyPI | Python | No namespace protection; setup.py executes during install; limited maintainer verification | Trusted Publishers, Sigstore-based attestations, mandatory 2FA for critical projects |
| Maven Central | Java / JVM | Deep transitive dependency chains; Log4Shell demonstrated ubiquity risk; GAV coordinate spoofing | Namespace verification, PGP signatures required, Sonatype oversight |
| RubyGems | Ruby | Gem install hooks; maintainer account compromises; dependency confusion | 2FA support, WebAuthn, gem signing |
| crates.io | Rust | Build scripts (build.rs) execute arbitrary code; proc macros run at compile time | Immutable publishes, GitHub-linked accounts, cargo-audit |
| Go Modules | Go | Module proxy caching; vanity import paths; init() functions execute on import | Checksum database (sum.golang.org), module proxy transparency |

npm is the most targeted registry by volume because of the sheer size of the ecosystem and the aggressive use of transitive dependencies. A typical npm project's node_modules directory contains far more code than the application itself. Install scripts (preinstall, postinstall) execute arbitrary shell commands during npm install, giving malicious packages immediate code execution without the developer ever importing or running the package.
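The install-script risk is visible in any package manifest. A minimal Python sketch (the manifest and script names below are illustrative) extracts the lifecycle scripts a package would execute at install time, before your code ever imports it:

```python
import json

LIFECYCLE = {"preinstall", "install", "postinstall"}

def install_time_scripts(manifest_text: str) -> dict:
    """Extract the scripts a package runs during `npm install`.
    These execute before your code ever imports the package."""
    scripts = json.loads(manifest_text).get("scripts", {})
    return {name: cmd for name, cmd in scripts.items() if name in LIFECYCLE}

# Hypothetical manifest: the test script is harmless, but the
# postinstall hook runs arbitrary code at install time.
sample = json.dumps({
    "name": "innocuous-lib",
    "scripts": {
        "test": "jest",
        "postinstall": "node ./collect-env.js",
    },
})

print(install_time_scripts(sample))  # → {'postinstall': 'node ./collect-env.js'}
```

Sweeping every manifest under node_modules with a check like this is a cheap way to build an inventory of which dependencies get code execution during install.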

PyPI has historically had the weakest publisher verification of the major registries. Until recently, anyone could publish any package name without namespace protection. The setup.py execution model means that simply installing a package can execute arbitrary Python code. The Trusted Publishers initiative and Sigstore attestations are improving the situation, but adoption remains partial.

Maven Central has stronger publisher verification through namespace ownership, but the depth of transitive dependency chains in the Java ecosystem means that a vulnerability in a deeply-nested library can be extraordinarily difficult to identify and remediate, as Log4Shell demonstrated.


SBOM: knowing what you are running

A Software Bill of Materials (SBOM) is a formal, machine-readable inventory of every component in your software: every direct dependency, every transitive dependency, every version number, every license. It is the foundational document for supply chain security because you cannot secure what you cannot see.

When Log4Shell was disclosed, organizations with SBOMs could search their inventory and determine within hours whether they were affected. Organizations without SBOMs spent days or weeks manually auditing applications, Docker images, and embedded systems to find instances of Log4j. Some never found them all.
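That search is trivial once the inventory exists. A minimal sketch against a hypothetical miniature CycloneDX JSON document shows the Log4Shell question reduced to a lookup:

```python
import json

def find_component(sbom_text: str, name: str) -> list:
    """Search a CycloneDX JSON SBOM for components with a given name."""
    sbom = json.loads(sbom_text)
    return [
        (c.get("name"), c.get("version"))
        for c in sbom.get("components", [])
        if c.get("name") == name
    ]

# Hypothetical miniature SBOM containing a vulnerable Log4j version.
sample = json.dumps({
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "log4j-core", "version": "2.14.1"},
        {"type": "library", "name": "jackson-databind", "version": "2.13.0"},
    ],
})

print(find_component(sample, "log4j-core"))  # → [('log4j-core', '2.14.1')]
```

Run across the stored SBOMs of every release, this turns "are we affected?" into a query over existing data rather than an emergency audit.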

SBOM formats

Two formats dominate the SBOM landscape: SPDX (Software Package Data Exchange), maintained by the Linux Foundation, and CycloneDX, maintained by OWASP. Both are capable of representing dependency trees with component names, versions, licenses, and vulnerability information. CycloneDX has gained significant traction in the security community due to its purpose-built focus on security use cases, including vulnerability tracking and exploit reachability analysis.

Generating and maintaining SBOMs

SBOMs should be generated automatically as part of your CI/CD pipeline, not manually maintained. Tools like Syft, Trivy, and CycloneDX generators can produce SBOMs from source code, container images, and compiled binaries. The key is to generate SBOMs at build time and store them alongside the artifacts they describe, so you always know exactly what went into a specific release.

An SBOM is not a one-time artifact. It must be regenerated with every build and continuously monitored against vulnerability databases. A dependency that was safe when you built your application may have a critical vulnerability disclosed tomorrow. Without a current SBOM and continuous monitoring, you will not know until an attacker tells you.

U.S. Executive Order 14028 (May 2021) requires SBOM delivery for all software sold to the federal government. The EU Cyber Resilience Act imposes similar requirements for products sold in Europe. SBOM generation is no longer optional for organizations selling to regulated industries or government customers.


Dependency scanning tools and their limitations

Dependency scanning tools, such as Snyk, Dependabot, Renovate, Grype, and OWASP Dependency-Check, monitor your dependencies for known vulnerabilities by comparing package versions against databases like the National Vulnerability Database (NVD) and the GitHub Advisory Database. They are essential, but they have significant limitations that teams must understand.

What they catch

Dependency scanners excel at detecting known vulnerabilities in known packages. When a CVE is published for a package version you are using, the scanner flags it. This is valuable. Many organizations run dependencies with years-old critical vulnerabilities simply because no one is tracking them.

What they miss

Scanners are blind to everything outside their databases. A zero-day vulnerability is invisible until a CVE is published, which can lag the start of exploitation by weeks or months. Malicious packages, the typosquats, dependency-confusion decoys, and compromised updates described above, usually carry no CVE at all: they are not vulnerable code, they are hostile code, and vulnerability databases only record them after someone discovers the attack. Scanners also generate noise, flagging vulnerable versions without knowing whether the vulnerable code path is reachable from your application, which trains teams to ignore alerts.

Dependency scanning is a necessary layer of defense, not a sufficient one. It catches the low-hanging fruit and must be supplemented with the practices described in the following sections.


Lockfile integrity and reproducible builds

A lockfile (package-lock.json, yarn.lock, Pipfile.lock, Gemfile.lock, go.sum) pins every dependency in your tree to a specific version and, in most cases, a specific content hash. Lockfiles are the single most important supply chain security control that most teams already have but do not properly enforce.

Why lockfiles matter

Without a lockfile, your build system resolves dependencies at install time, potentially pulling newer versions that include malicious code. The event-stream attack was effective precisely because applications that ran npm install without a lockfile (or that did not pin to a specific version) automatically pulled the compromised version.

Content hashes are the critical feature. A lockfile that records only version numbers protects against version changes but not against a compromised registry serving different content for the same version number. Content hashes (integrity fields in package-lock.json, hashes in Pipfile.lock) ensure that the exact bytes you reviewed are the exact bytes that get installed. If the content changes, the hash will not match, and the install will fail.
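The mechanism is ordinary cryptographic hashing. A minimal Python sketch of the comparison npm performs against the integrity field (the tarball bytes here are fake):

```python
import base64
import hashlib

def verify_integrity(artifact: bytes, integrity: str) -> bool:
    """Check artifact bytes against a Subresource-Integrity-style
    value like the one package-lock.json records: 'sha512-<base64>'."""
    algorithm, _, expected = integrity.partition("-")
    digest = hashlib.new(algorithm, artifact).digest()
    return base64.b64encode(digest).decode() == expected

# Fake tarball bytes and the integrity value a lockfile would record.
payload = b"fake tarball contents"
recorded = "sha512-" + base64.b64encode(hashlib.sha512(payload).digest()).decode()

print(verify_integrity(payload, recorded))              # → True
print(verify_integrity(payload + b"tamper", recorded))  # → False
```

A single flipped byte in the artifact produces a completely different digest, which is why a registry cannot silently swap content under an already-locked version.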

Enforcing lockfile integrity
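
In practice, enforcement means two things: CI must install with commands that fail on lockfile drift (npm ci, pip install --require-hashes), and a gate can verify that every declared dependency is locked with an integrity hash. A minimal Python sketch of such a gate, against hypothetical manifest and lockfile fragments:

```python
import json

def lockfile_gaps(manifest_text: str, lockfile_text: str) -> list:
    """Declared dependencies that are missing from the lockfile or
    locked without an integrity hash. CI fails if any are returned."""
    deps = json.loads(manifest_text).get("dependencies", {})
    packages = json.loads(lockfile_text).get("packages", {})
    problems = []
    for name in deps:
        entry = packages.get(f"node_modules/{name}")
        if entry is None:
            problems.append(f"{name}: not in lockfile")
        elif "integrity" not in entry:
            problems.append(f"{name}: no integrity hash")
    return problems

# Hypothetical fragments: express was added to package.json but the
# lockfile was never regenerated and committed.
manifest = json.dumps({"dependencies": {"left-pad": "^1.3.0", "express": "^4.18.0"}})
lock = json.dumps({"packages": {
    "node_modules/left-pad": {"version": "1.3.0", "integrity": "sha512-aaa"},
}})

print(lockfile_gaps(manifest, lock))  # → ['express: not in lockfile']
```

Wired into CI, a gate like this catches the common failure mode where a dependency is added locally but the lockfile change never lands in version control.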

Reproducible builds

A reproducible build is one that produces byte-for-byte identical output when given the same source code and build environment. Reproducibility proves that the binary you are running was built from the source code you reviewed, with no modifications injected during the build process. This directly addresses the SolarWinds-style attack where the build pipeline itself was compromised.

Achieving full reproducibility is challenging, as build tools often embed timestamps, file ordering, or environment-specific paths into output. But even partial reproducibility, where builds produce consistent dependency resolution and deterministic compilation, significantly raises the bar for build pipeline attacks.


Vendoring vs. pinning vs. floating dependencies

Teams have three fundamental strategies for managing how dependencies are resolved, each with different trade-offs between security, maintainability, and operational overhead.

Floating dependencies

Floating dependencies use version ranges (^1.2.3, ~1.2.3, >=1.0.0) that allow the package manager to pull newer versions within the specified range. This is the default behavior for most ecosystems and the least secure option. A compromised patch release will be automatically pulled into your next build. The only defense is the lockfile, and if the lockfile is regenerated (as happens during routine dependency updates), the malicious version will be locked in.
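The breadth of a range like ^1.2.3 is worth seeing concretely. A simplified Python sketch of npm's caret semantics (deliberately ignoring the special-cased 0.x rules) shows that any compromised patch or minor release still satisfies the range:

```python
def satisfies_caret(version: str, base: str) -> bool:
    """Simplified npm caret-range check: ^1.2.3 allows >=1.2.3 <2.0.0.
    (Real semver special-cases 0.x versions; this sketch does not.)"""
    v = tuple(int(p) for p in version.split("."))
    b = tuple(int(p) for p in base.split("."))
    return v >= b and v[0] == b[0]

# A compromised patch release still falls inside the range:
print(satisfies_caret("1.2.4", "1.2.3"))  # → True
print(satisfies_caret("2.0.0", "1.2.3"))  # → False
```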

Pinning dependencies

Pinning specifies exact version numbers (1.2.3 with no range operators) for every direct dependency. Combined with a lockfile, this prevents your build from pulling unexpected versions. However, pinning creates maintenance overhead: you must explicitly update every dependency version, which means security patches are not applied automatically. Teams that pin without a disciplined update process often end up running outdated dependencies with known vulnerabilities.

Vendoring dependencies

Vendoring copies the full source code of every dependency into your repository. Your build uses the vendored copy, never fetching from the public registry. This provides the strongest supply chain security because your build is entirely self-contained: a compromised registry, a deleted package, or a malicious update cannot affect you. The Go ecosystem has native vendoring support (go mod vendor), and tools exist for other ecosystems.

The trade-off is repository size and update complexity. A vendored node_modules directory can be hundreds of megabytes. Updating a vendored dependency requires re-vendoring and reviewing the changes. But for high-security applications, the isolation vendoring provides is worth the overhead.

Our recommendation: For most teams, the right approach is pinned direct dependencies, a committed and enforced lockfile with content hashes, and automated dependency update tools (Dependabot, Renovate) that create pull requests for review. Vendoring is appropriate for critical infrastructure, air-gapped environments, or applications where build reproducibility is a regulatory requirement.


Private registry security

Organizations that publish internal packages must secure their private registries with the same rigor they apply to production infrastructure. A compromised private registry is a direct path to compromising every application that depends on it.

Preventing dependency confusion

The dependency confusion attack works because package managers check public registries for packages that should only exist in private registries. Defenses include:

Use scoped package names (for example, npm scopes such as @yourorg/package) and register the scope on the public registry so that attackers cannot publish under it.

Configure package managers to resolve internal names exclusively from the internal registry, with no fallthrough to public registries.

Reserve your internal package names on public registries with placeholder packages so that attackers cannot claim them.

Monitor public registries for new packages whose names collide with your internal namespace.

Access control and audit logging

Private registries should enforce strict access control: who can publish packages, who can read packages, and who can administer the registry itself. Publish access should require multi-factor authentication and be limited to CI/CD service accounts or designated maintainers. Every publish event, authentication attempt, and configuration change should be logged and monitored.


Open source maintainer risk assessment

Before adding a dependency, evaluate not just the code but the people and processes behind it. A technically excellent library maintained by a single, anonymous individual with no security practices is a higher risk than a less elegant library maintained by a well-resourced team with established security processes.

Factors to assess

Maintainer count and bus factor: a package maintained by a single overworked volunteer is one burnout or account compromise away from the event-stream scenario.

Activity and responsiveness: recent commits, triaged issues, and a regular release cadence.

Account and release security: whether the registry shows two-factor authentication or provenance attestations for the publisher.

Security process: a published security policy, a disclosure channel, and a history of timely patches.

Backing: foundation-hosted or corporately sponsored projects are less likely to be abandoned or handed off to a stranger.

Automated signals such as the OpenSSF Scorecard, which measures many of these factors for you.

This assessment does not need to be exhaustive for every dependency. Focus your evaluation effort on direct dependencies, dependencies that handle sensitive data (cryptography, authentication, data serialization), and dependencies that are deeply embedded in your application's critical paths.


Sigstore, SLSA, and software provenance

The next generation of supply chain security is built around the concept of provenance: cryptographic proof of where software came from, how it was built, and who was involved at each step. Two initiatives are driving this forward.

Sigstore

Sigstore is an open source project that provides free code signing for open source software. It solves a long-standing problem: traditional code signing requires developers to manage private keys, which are frequently compromised or lost. Sigstore uses short-lived certificates tied to developer identity (through OIDC providers like GitHub or Google), so there are no long-lived keys to protect.

Sigstore's components include Cosign for signing container images and blobs, Fulcio as a certificate authority that issues short-lived certificates, and Rekor as a transparency log that records all signing events. Together, they create an auditable chain of custody for software artifacts.

npm and PyPI have both integrated Sigstore-based provenance attestations, allowing package publishers to cryptographically prove that a specific package version was built from a specific source commit using a specific CI/CD workflow. This makes it significantly harder for an attacker who compromises a maintainer's account to publish a malicious version without detection: the provenance attestation would not match.

SLSA (Supply chain Levels for Software Artifacts)

SLSA (pronounced "salsa") is a security framework that defines four levels of supply chain integrity, from basic (Level 1) to comprehensive (Level 4). Each level adds requirements for build process documentation, hermetic builds, and provenance verification.

| SLSA Level | Requirements | What It Prevents |
| --- | --- | --- |
| Level 1 | Build process is documented and produces provenance metadata | Unknown build processes, missing audit trail |
| Level 2 | Build service generates and signs provenance automatically; source is version-controlled | Tampered provenance, unversioned source modifications |
| Level 3 | Build platform is hardened; provenance is non-forgeable; builds are isolated | Compromised build environments, cross-build contamination |
| Level 4 | Hermetic builds; all dependencies declared; two-person review of source | SolarWinds-style build pipeline compromises, insider threats |

SLSA Level 3 is the practical target for most organizations. It requires that builds run on a hardened, hosted build platform (like GitHub Actions or Google Cloud Build), that provenance is generated automatically by the platform (not by the developer), and that the build process is isolated so that one build cannot influence another. This directly addresses the class of attacks where the build pipeline itself is the target.

SLSA Level 4 adds hermetic builds (no network access during build) and two-person review requirements, which are appropriate for critical infrastructure but operationally demanding for most teams.


Supply chain security checklist for engineering teams

This is a practical, prioritized checklist for engineering teams who want to improve their supply chain security posture. Items are ordered roughly by impact relative to implementation effort.

Foundation (implement immediately):

Commit lockfiles to version control and enforce them in CI/CD with deterministic install commands (npm ci, pip install --require-hashes).

Enable automated dependency scanning (Dependabot, Snyk, or Renovate) and triage alerts within defined SLAs.

Require multi-factor authentication on all package registry accounts, especially those with publish access.

Pin direct dependencies to exact versions. Use lockfile content hashes for integrity verification.

Review lockfile changes in every pull request with the same scrutiny as code changes.

Intermediate (implement within a quarter):

Generate SBOMs automatically in your CI/CD pipeline and store them with release artifacts.

Implement dependency confusion protections: scoped packages, registry configuration, reserved public names.

Evaluate direct dependencies using OpenSSF Scorecard and maintainer health metrics before adoption.

Audit install scripts (preinstall, postinstall) in npm packages. Consider using --ignore-scripts by default and explicitly allowing scripts for trusted packages.

Establish a process for responding to supply chain incidents: who gets notified, how are affected builds identified, what is the rollback procedure.

Advanced (implement as your program matures):

Verify Sigstore provenance attestations for critical dependencies.

Work toward SLSA Level 3 for your own build pipelines: hardened build service, automatic provenance generation, isolated builds.

Implement a private registry or caching proxy (Artifactory, Nexus, Verdaccio) to control what packages enter your build environment.

Vendor critical dependencies for air-gapped or high-security environments.

Conduct periodic manual security reviews of your most critical and least-maintained dependencies.


The organizational dimension

Supply chain security is not purely a technical problem. It is an organizational one. Someone needs to own the dependency inventory. Someone needs to triage vulnerability alerts. Someone needs to make the call on whether a new dependency is worth the risk it introduces.

In most organizations, nobody owns this. Dependency management falls into the gap between development teams (who add dependencies to ship features), security teams (who worry about vulnerabilities but do not control the dependency tree), and platform teams (who manage build infrastructure but do not evaluate package security). The result is that nobody is accountable for supply chain risk until an incident forces the question.

Effective supply chain security requires clear ownership: a team or role responsible for dependency governance, tooling, and incident response. This does not need to be a large team. It needs to be a defined responsibility with authority to set policy (for example, requiring OpenSSF Scorecard evaluation for new dependencies) and the tooling to enforce it.

The investment is small compared to the alternative. A single supply chain compromise can require emergency remediation across every application in your portfolio, customer notifications, regulatory reporting, and months of forensic investigation. Building the controls described in this article is a fraction of that cost.

Secure Your Software Supply Chain

We help engineering teams identify supply chain risks through dependency audits, secure code reviews, and CI/CD pipeline assessments. Find the vulnerabilities before attackers find them for you.


Lorikeet Security Team

Penetration Testing & Cybersecurity Consulting

We've completed 170+ security engagements across web apps, APIs, cloud infrastructure, and AI-generated codebases. Everything we publish here comes from patterns we see in real client work.