How quickly are leaked API keys exploited after being committed to a public repo?

GitGuardian's research indicates that leaked secrets are often exploited within 4 minutes of being committed to a public repository. Automated scanners continuously monitor public GitHub commits and npm releases for high-entropy strings and known API key patterns. Once found, keys are tested programmatically against the relevant service. Any key with useful permissions is typically exploited before the developer notices the mistake, let alone has time to rotate it.

Does deleting a secret from a git repository remove it?

No. Deleting a file or removing a secret from code and committing that change does not remove it from git history. The original commit containing the secret remains in the repository's object store and is accessible to anyone who clones the repo. Secrets must be explicitly purged from history using tools like git-filter-repo or BFG Repo Cleaner, followed by force-pushing all branches and ensuring all collaborators re-clone or fetch the rewritten history. Even after purging, assume the secret has been seen — rotate it immediately.

What is the safest way to store secrets for a production application?

Secrets should never be stored in code, configuration files checked into version control, or CI/CD environment variables that appear in build logs. The recommended approach is a dedicated secrets manager — HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager — where secrets are retrieved at runtime via short-lived tokens. Dynamic secrets (Vault generating unique, time-limited database credentials per application request) eliminate the concept of a long-lived credential entirely. Combine this with rotation policies, least-privilege IAM roles, and audit logging of every secret access.

What should I do immediately if I discover a secret was leaked in a public repository?

Rotate the secret immediately — this is the single most urgent action. Do not wait to investigate or assess scope first. Then audit cloud provider access logs (CloudTrail for AWS, Audit Logs for GCP) to identify any usage of the leaked key during the exposure window. Assume the key was accessed if it was public for more than a few minutes. Purge the secret from git history using git-filter-repo or BFG Repo Cleaner, and notify any affected services or providers. If the key had significant permissions and shows evidence of use, treat it as a confirmed breach and initiate incident response.

Secrets Sprawl: How Hardcoded Credentials and Leaked API Keys Become Breaches

TL;DR: GitHub reported detecting more than 39 million leaked secrets across its platform in 2024 alone. API keys, database passwords, OAuth tokens, private keys, and service account credentials are among the most common initial access vectors in cloud breaches, not because attackers are sophisticated, but because developers regularly commit secrets to version control and forget them. One leaked key with the right permissions can mean total cloud account compromise. The window between exposure and exploitation is often measured in minutes, not days.

Why Secrets End Up in Code

The path from a working credential to a committed secret almost never involves malice — it is almost always convenience colliding with habit. Understanding how it happens is the first step to stopping it.

The most common scenario: a developer needs to test an integration quickly, pastes an API key directly into a config file, gets it working, and commits the whole directory. The key was supposed to be temporary. It was not. .env files are the most frequent vector — a developer clones a repository, copies .env.example to .env, fills in real credentials, and then accidentally stages the file. If .env was never added to .gitignore, one git add . is all it takes.
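As a rough illustration of catching this before the commit lands, the sketch below flags staged paths whose file name suggests real credentials. The deny-list and the function name are assumptions made for this example, not an established standard; the input is assumed to be the output of git diff --cached --name-only.

```python
from pathlib import PurePosixPath

# Hypothetical deny-list of file names that commonly hold real credentials.
RISKY_NAMES = {".env", ".npmrc", "secrets.yml", "credentials.json"}
# Template files that are expected to be committed.
SAFE_SUFFIXES = (".example", ".sample", ".template")


def risky_staged_paths(paths):
    """Return the staged paths whose file name suggests real credentials."""
    flagged = []
    for raw in paths:
        name = PurePosixPath(raw).name
        if name.endswith(SAFE_SUFFIXES):
            continue  # e.g. .env.example is meant to be shared
        # Match ".env" itself and variants such as ".env.production".
        if name in RISKY_NAMES or name.startswith(".env."):
            flagged.append(raw)
    return flagged
```

A check like this complements a .gitignore entry rather than replacing it: .gitignore stops the file from being staged, while the check catches the case where that entry was never added.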

Copy-paste from an existing working codebase into a new repository is another persistent source of sprawl. The original repo had the key in an environment variable. The new repo has it hardcoded. Nobody noticed during the port.

CI/CD pipelines introduce their own categories of exposure. Environment variables used in build steps can appear in plaintext in build logs when debug modes are enabled or a step prints its environment: GitHub Actions' ACTIONS_STEP_DEBUG=true flag, Jenkins console output, CircleCI build logs. A developer who enables verbose logging to debug a failing build and then forgets to disable it can expose secrets injected into that pipeline to anyone with log access.

Docker images are a particularly underappreciated source of credential exposure. Secrets passed as build arguments (ARG API_KEY) are baked into the image even if the final image does not reference them directly. The value is recorded in the image's build history and can be recovered from any image built with --build-arg. Images pushed to public registries with embedded secrets are a reliable find in any external assessment.

Beyond code, secrets sprawl into communication channels: database connection strings pasted into Teams messages for a colleague to use, AWS access keys shared in a support ticket, API credentials in a screenshot attached to a Jira comment. Every channel that stores message history is a potential secrets repository.

Stack traces and verbose error messages complete the picture. A misconfigured application that exposes a full exception trace in a browser or in logs can leak database connection strings, including credentials, in plain sight. This is not hypothetical — it is a routine finding in web application penetration tests.

Where Secrets Hide: A Tour of the Attack Surface

When a penetration tester or attacker performs a secrets-focused review, they work through a predictable set of locations. Each has distinct characteristics that affect both how secrets end up there and how they are found.

Git History

This is the most important location to understand, because it defies the intuition that deleting something removes it. Git stores every version of every file ever committed. When a developer commits a secret, realizes the mistake, and immediately commits a deletion, the secret is still present in the repository's object store at the original commit hash. Cloning the repository gives an attacker full access to the entire history. Tools like TruffleHog scan every commit in a repository's history — not just the current HEAD — specifically to find this class of exposure.

The practical implication: if a secret was ever committed to a repository, assume it has been seen. Rotation is mandatory regardless of how quickly the deletion commit followed.
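To show why a deletion commit does not help, here is a minimal sketch that walks the patch output of git log and reports the commit in which an AWS access key ID was added. It is a simplified stand-in for what dedicated scanners do, with a single rule for illustration; the function names are hypothetical.

```python
import re
import subprocess

# AWS access key IDs have a recognizable shape: "AKIA" plus 16 characters.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_keys_in_log(log_text):
    """Yield (commit_hash, key_id) for every added line containing a key ID."""
    commit = None
    for line in log_text.splitlines():
        if line.startswith("commit "):
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            for match in AWS_KEY_ID.finditer(line):
                yield commit, match.group(0)


def scan_repo_history(repo_path="."):
    """Scan the patches of commits on every ref, not just the current HEAD."""
    log_text = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p", "--no-color"],
        capture_output=True, text=True, check=True,
    ).stdout
    return list(find_keys_in_log(log_text))
```

Even when a later commit removes the line, the earlier commit still appears in the log with the key in its patch, which is exactly what an attacker's clone contains.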

.env Files and Configuration

The .env file pattern is so common it has become its own attack category. Beyond .env, secrets accumulate in config.yml, application.properties, appsettings.json, database.yml, secrets.yml, and framework-specific configuration files. These are frequently committed with real credentials when developers work against production systems directly, or when staging environments use production keys for convenience.

CI/CD Pipeline Logs and Environment Variables

Build logs from GitHub Actions, Jenkins, CircleCI, and GitLab CI are a consistent source of credential exposure. Secrets appear in the log output when a build step echoes its environment for debugging (env in a shell script, printenv in a Makefile) or when automatic secret masking fails, for example on multi-line values. If build logs are accessible to everyone with read access (the GitHub default for public repositories), the exposure radius extends far beyond the repository's contributors.
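The multi-line failure mode can be illustrated with a small sketch. It does not describe how any particular CI system implements masking; it only shows why a masker that looks for the whole secret on a single log line misses a value that is printed one line at a time. Both function names are hypothetical.

```python
def mask_naive(log_line, secrets):
    """Replace each secret only when it appears whole within one log line."""
    for secret in secrets:
        log_line = log_line.replace(secret, "***")
    return log_line


def mask_per_line(log_line, secrets):
    """Also mask every non-empty line of a multi-line secret individually."""
    fragments = set()
    for secret in secrets:
        fragments.add(secret)
        fragments.update(part for part in secret.splitlines() if part.strip())
    # Replace longer fragments first so a whole secret wins over its pieces.
    for fragment in sorted(fragments, key=len, reverse=True):
        log_line = log_line.replace(fragment, "***")
    return log_line
```

With a PEM-style key stored as one multi-line secret, the naive version leaves each body line readable in the log, while the per-line version masks it. Very short fragments would cause over-masking, so a real implementation would need a minimum fragment length.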

Cloud function environment variables present a similar risk: AWS Lambda, Google Cloud Functions, and Azure Functions display environment variables in their console interfaces. Any principal that can view a function's configuration, whether through the console or through an API call such as Lambda's GetFunctionConfiguration, can read those values, so overly broad read-only IAM permissions are enough to expose secrets stored there.

Docker Image Layers

Docker build arguments passed via --build-arg are stored in image layer metadata and recoverable with docker history --no-trunc. Secrets embedded in RUN commands during the build — even if the file containing the secret is removed in a subsequent layer — remain in the intermediate layer. Multi-stage builds mitigate this, but only when used correctly. Images pushed to Docker Hub, GitHub Container Registry, or other public registries with embedded secrets are scanned by automated tooling continuously.
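A review of Docker-related files often starts with a simple heuristic like the one sketched below: list the ARG and ENV instructions whose names look like they carry secrets. The keyword list is an assumption chosen for illustration, and a name-based check will miss secrets hidden behind neutral names.

```python
import re

# Hypothetical heuristic: ARG/ENV names containing these words likely hold secrets.
SECRET_WORDS = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL)", re.I)
# Matches "ARG NAME", "ARG NAME=value", "ENV NAME=value" and "ENV NAME value".
INSTRUCTION = re.compile(r"^\s*(ARG|ENV)\s+([A-Za-z_][A-Za-z0-9_]*)", re.I)


def secret_like_build_inputs(dockerfile_text):
    """Return (line_number, instruction, name) for suspicious ARG/ENV lines."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = INSTRUCTION.match(line)
        if match and SECRET_WORDS.search(match.group(2)):
            findings.append((number, match.group(1).upper(), match.group(2)))
    return findings
```

Each finding is a candidate for replacement with a build-time secret mount (for example BuildKit's RUN --mount=type=secret), which makes the value available to a single build step without recording it in the image.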

npm and PyPI Packages

Published packages are a frequently overlooked exposure path. Developers who accidentally include their .env file, personal .npmrc with auth tokens, or local configuration in an npm or PyPI package have published secrets to the entire public registry. Automated scanners monitor new package publications for high-entropy strings and known secret patterns. This is not a theoretical attack — it is a documented, repeated occurrence.
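One inexpensive safeguard is to inspect the built archive before publishing it. The sketch below lists members of a tarball (such as the .tgz produced by npm pack or a Python sdist) whose file names are on a deny-list. The deny-list and the function name are assumptions for this example.

```python
import io
import tarfile
from pathlib import PurePosixPath

# Hypothetical deny-list of file names that should never ship in a package.
FORBIDDEN = {".env", ".npmrc", ".pypirc", "id_rsa"}


def forbidden_members(archive_bytes):
    """List archive members whose file name is on the deny-list."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as archive:
        return [
            name for name in archive.getnames()
            if PurePosixPath(name).name in FORBIDDEN
        ]
```

Running this in the release pipeline and failing the publish step on any non-empty result checks what is actually in the archive, rather than what the ignore files were intended to exclude.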

Real Incidents: What Secrets Sprawl Actually Costs

The consequences of credential exposure are not abstract. These incidents illustrate what happens when secrets reach public repositories or unauthorized parties.

Toyota Japan (2022): Five Years of Exposure

Toyota disclosed in October 2022 that credentials had been sitting in a public GitHub repository for nearly five years, from December 2017 to September 2022. A portion of the source code for Toyota's T-Connect telematics service had been published by mistake, and the access key it contained opened a data server holding customer records. The exposure potentially affected 296,019 customers, whose email addresses and customer management numbers were stored there. The credentials were not discovered by Toyota; they were identified by external researchers. Five years of public exposure represents the worst case of the "we'll fix it later" mindset applied to a committed secret.

Samsung (2022): Internal Source Code and AWS Keys

Samsung experienced a significant data breach when internal source code, including code for Samsung's Galaxy devices, was leaked publicly. The exposure included AWS credentials embedded in the source code. Beyond the immediate impact of the leaked credentials, the source code exposure provided a roadmap for identifying other vulnerabilities — internal tools, proprietary algorithms, and security mechanisms that were never intended to be public.

Uber (2022): Hardcoded Credentials on an Internal Share

The 2022 Uber breach began not with a sophisticated exploit but with social engineering. An attacker obtained an Uber contractor's credentials and sent repeated multi-factor authentication prompts until one was approved, then found administrator credentials hardcoded in a script on an internal network share. Those credentials unlocked Uber's privileged access management system and, through it, internal systems including AWS, Slack, and an admin panel for HackerOne, where the attacker accessed confidential vulnerability reports submitted by security researchers. The breach demonstrated that secrets sprawl is not limited to version control: internal file shares, scripts, and communication platforms with stored history carry the same risk.

Twitch (2021): Misconfiguration to Source Code Exposure

A misconfigured server at Twitch led to the exposure of approximately 125GB of internal data, including the platform's source code, internal security tools, creator payout information, and proprietary SDKs. Source code exposure is categorically different from data exposure: it reveals the architecture, authentication mechanisms, and internal APIs of an application, providing a detailed attack map to anyone who obtains it. When that source code contains hardcoded credentials — as is common in internal tooling — the exposure compounds.

What Attackers Do with Leaked Secrets

The exploitation of leaked credentials follows a predictable pattern that has been well-documented across incident reports. Understanding it clarifies why rotation speed matters so much.

The four-minute window: GitGuardian's research found that leaked secrets are often exploited within 4 minutes of a public commit. Automated scanners continuously monitor GitHub's public event stream, npm publish events, and PyPI releases for high-entropy strings and patterns matching known API key formats. The scan, test, and exploitation cycle is fully automated — human involvement happens after a valid key is confirmed, not before.
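The "high-entropy string" test these scanners rely on is Shannon entropy over the characters of a token. A minimal sketch follows; the length and threshold values are illustrative assumptions, not tuned constants taken from any specific tool.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(token, min_length=20, threshold=3.5):
    """Flag long tokens whose character distribution is close to random.

    min_length and threshold are illustrative values, not tuned constants.
    """
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

A repeated character scores 0 bits, while a randomly generated 40-character key scores well above the threshold. This is why entropy checks catch secrets that match no known format, and also why they produce false positives on hashes and other random-looking identifiers.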

The specific impact depends entirely on what the key can access:

AWS access keys: Attackers first call sts:GetCallerIdentity to confirm the key is valid and identify the account and principal, then enumerate the key's permissions and attempt IAM privilege escalation. From a developer's key with broad permissions, common paths include creating new IAM users or roles with administrator access, accessing S3 buckets for data exfiltration, spinning up EC2 or Lambda resources for cryptomining, accessing RDS databases, and reading Secrets Manager or Parameter Store values, which often contain additional credentials.
Database credentials: Direct data exfiltration. Depending on the database user's permissions, attackers can dump entire schemas, read PII and payment data, and in some configurations write or modify records.
Stripe API keys: Secret keys provide access to customer payment data, refund capabilities, and the ability to create charges. Restricted keys reduce the scope, but full secret keys represent direct financial and PCI exposure.
Twilio and SendGrid keys: These enable sending communications from the organization's verified sending identity. Attackers use them to send phishing campaigns at scale, leveraging the organization's trusted sending reputation to bypass spam filters. The reputational damage from having your domain associated with phishing outlasts the credential exposure itself.
GitHub personal access tokens: Access to private repositories, the ability to read code and secrets in those repositories, modify CI/CD configurations, and in some cases deploy code — enabling a pivot to production systems.

Detection: Finding Secrets Before Attackers Do

Detection operates at two points: prevention before secrets enter version control, and discovery of secrets that have already been committed.

Pre-Commit Hooks

Pre-commit hooks run before a commit is finalized, scanning staged changes for patterns that match known secret formats. Three tools dominate this space:

gitleaks scans git repositories and staged changes for secrets using a rule set covering hundreds of known API key patterns. It runs as a pre-commit hook, in CI/CD pipelines, or as a standalone scanner against the full repository history.
detect-secrets (Yelp) takes a baseline approach: it generates a baseline of known secrets (with hashes, not values) and flags new additions. Useful for repositories that already contain some false positives or approved exceptions.
git-secrets (AWS) specifically targets AWS credential patterns and is straightforward to integrate. Less comprehensive than gitleaks for non-AWS secrets.

The limitation of pre-commit hooks is that they are developer-side controls — they can be skipped with git commit --no-verify. They should be treated as a convenience layer, not a security boundary. CI/CD pipeline scanning provides the enforcement layer.
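To make the mechanics concrete, a hook of this kind can be sketched in a few lines. The two rules below are illustrative only; tools such as gitleaks ship far larger rule sets, and the function names here are hypothetical.

```python
import re
import subprocess
import sys

# Two illustrative rules; real scanners ship far larger rule sets.
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_added_lines(diff_text):
    """Return (rule_name, line) for each added diff line that matches a rule."""
    hits = []
    for line in diff_text.splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect content being added by this commit
        for name, pattern in RULES.items():
            if pattern.search(line):
                hits.append((name, line[1:].strip()))
    return hits


def main():
    # "git diff --cached" shows exactly what the pending commit would record.
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0", "--no-color"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = scan_added_lines(diff)
    for name, _line in hits:
        # Report the rule only, so the hook does not echo the secret itself.
        print(f"possible secret detected: {name}", file=sys.stderr)
    return 1 if hits else 0  # a non-zero exit status aborts the commit
```

Saved as .git/hooks/pre-commit with a final line calling sys.exit(main()), the script blocks the commit whenever a rule matches. Because the same scan_added_lines function can run against the diff of a pull request in CI, the enforcement layer can reuse the rules that the developer-side hook applies.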

GitHub Secret Scanning

GitHub's native secret scanning automatically identifies patterns matching known API key formats across repository contents and commit history. When a match is found, GitHub alerts repository administrators and, for supported providers (AWS, Stripe, GitHub, and many others), notifies the provider directly so it can revoke or flag the exposed credential. Secret scanning is free for public repositories and requires a paid license for private ones. Enabling it on all repositories is a baseline control that provides significant coverage for little effort.

TruffleHog

TruffleHog is the most thorough tool for historical secret scanning. It works by scanning every commit in a repository's history, not just the current state, using both pattern matching and entropy analysis to identify high-entropy strings that may be secrets even without matching a known format. Running TruffleHog against a repository before open-sourcing it, before a security review, or as part of an acquisition due diligence process provides the most complete picture of historical exposure.

TruffleHog can also scan S3 buckets, Confluence, Jira, Slack exports, GitHub Actions logs, and Docker images — making it useful beyond pure git history scanning.

SAST and Container Scanning

Static analysis tools like Semgrep catch hardcoded credential patterns in source code during code review and CI/CD pipelines. Rules targeting common patterns — password = "...", inline connection strings, base64-encoded credentials — provide a layer of coverage that complements entropy-based tools.
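The kind of pattern rule described here can be approximated with two regular expressions. This is a simplified sketch in plain Python rather than actual Semgrep rule syntax, and it does not attempt the base64 case; the keyword list and names are assumptions.

```python
import re

# password = "..." style assignments with a non-empty quoted literal.
ASSIGNMENT = re.compile(
    r"""(?i)(password|passwd|pwd|secret|api_key)\s*[:=]\s*["']([^"']+)["']"""
)
# user:password@host inside a connection string URL.
CONNECTION = re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:([^\s@/]+)@")


def find_hardcoded_credentials(source):
    """Return (line_number, kind) pairs for lines matching either rule."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if ASSIGNMENT.search(line):
            findings.append((number, "quoted-credential-assignment"))
        if CONNECTION.search(line):
            findings.append((number, "inline-connection-string"))
    return findings
```

Note that a value read from the environment, such as os.environ["TOKEN"], is not flagged, because the rule looks for a quoted literal on the right-hand side of the assignment.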

Container image scanning tools including Trivy and Grype analyze Docker image layers for embedded secrets as part of the image build and publish pipeline. Integrating these into a registry scanning policy — rejecting images that contain known secret patterns before they reach production or public registries — closes the Docker-specific exposure path.

Tool	Primary Use	Integration Point	Covers History
gitleaks	Secret pattern scanning	Pre-commit hook, CI/CD	Yes
detect-secrets	Baseline + new-secret detection	Pre-commit hook, CI/CD	No (current state)
TruffleHog	Full history + entropy scanning	CI/CD, standalone audit	Yes (all commits)
GitHub Secret Scanning	Known provider key patterns	GitHub native (push events)	Yes (on enable)
Semgrep	SAST pattern matching	CI/CD, IDE	No (current state)
Trivy / Grype	Container image layer scanning	Registry, CI/CD	All image layers

Secrets Management: The Correct Architecture

Detection tools address the symptom. The root cause is storing secrets anywhere other than a dedicated secrets manager. The correct architecture removes secrets from code, configuration files, and environment variables entirely — secrets are retrieved at runtime by the application using short-lived tokens scoped to the specific permissions needed.

Dedicated Secrets Managers

The major cloud providers and HashiCorp all offer secrets management solutions with similar core capabilities:

HashiCorp Vault: The most feature-complete option, supporting dynamic secrets, multiple secret engines (AWS, databases, PKI, SSH), fine-grained policies, and audit logging. Vault's dynamic secrets feature generates unique, short-lived credentials per application request: for example, a database user that exists for 30 minutes, is used once, and then expires. There is no long-lived credential to leak.
AWS Secrets Manager: Native integration with AWS IAM for access control, automatic rotation for supported services (RDS, Redshift, DocumentDB), and CloudTrail audit logging of every secret access. Applications running in AWS use IAM roles to retrieve secrets — the application never stores credentials itself.
Azure Key Vault: Equivalent capability for Azure workloads, with Managed Identity integration eliminating credential management for Azure-hosted applications.
GCP Secret Manager: Tight integration with GCP's IAM and audit logging infrastructure, with workload identity federation for non-GCP workloads.
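Runtime retrieval looks roughly like the sketch below, shown for AWS Secrets Manager through the boto3 SDK. The secret name, the JSON layout with "username" and "password" keys, and the function name are assumptions for this example; the optional client parameter exists only so the function can be exercised without AWS access.

```python
import json


def load_db_credentials(secret_id, client=None):
    """Fetch a JSON secret at runtime instead of reading it from code or env.

    secret_id and the "username"/"password" keys are hypothetical; the stored
    secret is assumed to be a JSON object.
    """
    if client is None:
        import boto3  # third-party AWS SDK; credentials come from the IAM role
        client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    return secret["username"], secret["password"]
```

When the application runs under an IAM role, no credential for the secrets manager itself is stored anywhere in the codebase: the SDK obtains temporary credentials from the runtime environment, and access to the secret is governed by the role's policy.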

Rotation Policies

Secrets managers are most effective when combined with aggressive rotation policies. Long-lived credentials that are never rotated have an indefinite window of exploitation if compromised. Automatic rotation — where the secrets manager generates a new credential and updates the stored value on a schedule — reduces this window without requiring manual intervention. For credentials that cannot be automatically rotated, manual rotation cadences (quarterly at minimum) combined with monitoring for anomalous usage provide a compensating control.
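For credentials that must be rotated manually, the cadence can at least be checked automatically. The sketch below assumes a hypothetical inventory mapping each credential name to its last rotation time, and treats 90 days as the quarterly limit.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=90)  # "quarterly at minimum", per the text above


def overdue_credentials(inventory, now=None):
    """Return names of credentials last rotated more than MAX_AGE ago.

    inventory maps a credential name to its last rotation time (aware UTC).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, rotated_at in inventory.items()
        if now - rotated_at > MAX_AGE
    )
```

Running a report like this on a schedule and alerting on any non-empty result turns the rotation policy into something that is monitored rather than merely documented.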

Least Privilege Service Accounts

The impact of a leaked credential is directly proportional to its permissions. A developer's personal AWS access key with administrator permissions represents a catastrophic breach if exposed. A service account key scoped to read-only access on a single S3 bucket represents a much more contained incident. Applying least privilege to every service account, IAM role, and API key — granting exactly the permissions required for the specific function and nothing more — limits the blast radius of any single credential exposure.
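A first-pass least-privilege check can be automated by looking for Allow statements that pair wildcard actions with a wildcard resource. The sketch below works on an IAM policy document already parsed into a Python dict; it is a coarse heuristic that ignores conditions and NotAction, and the function names are hypothetical.

```python
def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value or [])


def overly_broad_statements(policy):
    """Return indexes of Allow statements with wildcard actions on all resources."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]  # a lone statement may be given as an object
    flagged = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if wildcard_action and "*" in resources:
            flagged.append(index)
    return flagged
```

A statement granting s3:GetObject on one bucket passes, while "Action": "*" on "Resource": "*" is flagged. That is the difference between the contained incident and the catastrophic one described above.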

Incident Response for Leaked Secrets

When a secret is confirmed or suspected to have been exposed, the response sequence matters as much as the response itself. Speed on the critical actions is more important than a complete investigation before acting.

Critical sequence: Rotate immediately — then investigate. Do not audit access logs, assess scope, or convene a meeting before rotating. The key may be actively exploited during any delay. Rotation is the only action that stops an ongoing compromise. Everything else happens afterward.

Rotate immediately. Generate a new credential and invalidate the exposed one. For AWS access keys, this means issuing a replacement key, deactivating the exposed key in IAM, and deleting it once nothing depends on it. For API keys, use the provider's revocation mechanism. Do this before any other step.
Audit access logs. Check cloud provider audit logs — AWS CloudTrail, GCP Audit Logs, Azure Activity Log — for any usage of the leaked credential during the exposure window. Look for API calls, resource accesses, and any IAM modifications that may have created persistent access (new users, roles, or access keys created using the compromised credential).
Assume breach if exposed publicly for more than a few minutes. Given the four-minute exploitation window, any secret that was publicly visible should be treated as compromised. The audit may later confirm or rule out actual use, but starting from the assumption of compromise keeps the response appropriately urgent.
Purge from git history. Deleting a file does not remove it from git history. Use git-filter-repo (the current recommended tool) or BFG Repo Cleaner to rewrite history and remove the secret from all commits. Force-push all affected branches and tags. Notify collaborators that they must re-clone or fetch the rewritten history — any local clone retains the old history until updated.
Notify affected services. If the leaked credential belongs to a third-party service, notify the provider. For credentials that affect customer data, assess notification obligations under applicable regulations (GDPR, state breach notification laws).
Review for persistence. If audit logs show the credential was accessed, investigate whether the attacker created any persistent access mechanisms before the credential was rotated — additional IAM users, roles, Lambda functions, EC2 instances, OAuth applications, or webhooks that may remain active after the original credential is rotated.
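The audit and persistence-review steps can be partly scripted. The sketch below filters CloudTrail records, assumed to be already loaded as a list of dicts from the "Records" array of the log files, for calls made with the leaked access key, and separates out IAM calls that can leave access behind. The event list is a starting point rather than a complete catalogue, and the function name is hypothetical.

```python
# IAM calls that can leave access behind after the leaked key is revoked.
PERSISTENCE_EVENTS = {
    "CreateUser", "CreateAccessKey", "CreateRole", "CreateLoginProfile",
    "AttachUserPolicy", "AttachRolePolicy", "PutUserPolicy",
}


def usage_of_key(records, access_key_id):
    """Split CloudTrail records made with access_key_id into all/persistence."""
    used = [
        r for r in records
        if r.get("userIdentity", {}).get("accessKeyId") == access_key_id
    ]
    used.sort(key=lambda r: r["eventTime"])  # ISO 8601 strings sort by time
    persistence = [r for r in used if r["eventName"] in PERSISTENCE_EVENTS]
    return used, persistence
```

An empty first list is reassuring only if logging covered every region and the whole exposure window. A non-empty second list means the incident is not closed by rotation alone, because the users, keys, or roles created by the attacker must be found and removed as well.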

How Code Review Engagements Find Secrets

Penetration testing and secure code review engagements that include secrets detection go beyond running automated tools against the current codebase. The approach mirrors what a motivated attacker would do, and it surfaces exposure that automated scanning frequently misses.

A thorough secrets-focused assessment scans the full git history using TruffleHog with entropy analysis enabled, not just the current HEAD. It reviews CI/CD pipeline configurations — GitHub Actions workflows, Jenkinsfiles, .circleci/config.yml — for secrets injected as environment variables and for patterns that may cause secrets to appear in build logs. It examines Docker-related files (Dockerfile, docker-compose.yml, .dockerignore) for build arguments that embed secrets in image layers. It reviews configuration files across the codebase for connection strings, inline credentials, and default passwords that may have been overlooked. It checks NPM and Python package configurations to verify that published packages do not include files containing secrets.

Beyond the codebase, code review engagements examine cloud configuration: IAM policies for over-privileged service accounts, Secrets Manager usage (or lack of it), CloudTrail enablement, and whether rotation is configured for credentials that support it. The combination of automated scanning and manual review catches what neither approach catches alone — the obfuscated credential, the base64-encoded connection string, the environment variable echoed in a startup script.

If your organization is preparing for a security review, compliance audit, or simply wants to understand its secrets exposure before an attacker does, a source code review engagement provides a structured, comprehensive assessment of where credentials live across your codebase and infrastructure.

Find Your Secrets Before Attackers Do

Lorikeet Security's source code review and penetration testing engagements include systematic secrets detection — finding hardcoded credentials, leaked API keys, and misconfigured CI/CD pipelines before they become breach entry points. Book a consultation to discuss your exposure.

Lorikeet Security Team

Penetration Testing & Cybersecurity Consulting

Lorikeet Security helps modern engineering teams ship safer software. Our work spans web applications, APIs, cloud infrastructure, and AI-generated codebases — and everything we publish here comes from patterns we see in real client engagements.