---
name: secrets-rotation-plan
description: Build a practical plan to find and rotate long-lived secrets across an SMB — API keys, service-account credentials, SSH keys, database passwords, certificates, and shared passwords — with discovery, prioritisation, phased rotation, and ongoing hygiene controls.
version: 1.0.0
author: VantagePoint Networks
author_url: https://www.vpnetworks.co.uk
audience: IT Managers, DevOps / Platform Engineers, Security Leads, MSPs handling secrets hygiene for SMB clients
output_format: Formatted Markdown plan with discovery script, secrets register, prioritisation matrix, phased rotation schedule, vaulting recommendation, and leaver/change-of-role procedures.
license: MIT
last-reviewed: 2026-04
---

# Secrets Rotation Plan

A Claude Code skill for the problem every growing SMB has and nobody logs a ticket for: "we've never rotated our secrets". It produces a plan to fix that quietly over a quarter, without breaking production.

## How to use this skill

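1. Download this `SKILL.md` file.
2. Save it as `secrets-rotation-plan.md` in `~/.claude/commands/` (macOS/Linux) or `%USERPROFILE%\.claude\commands\` (Windows). Claude Code names the slash command after the file, so the file name must match the command you run in the next step.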
3. In Claude Code, run `/secrets-rotation-plan`. Describe the environment. Answer the clarifying questions. Receive a plan.

## When to use this

- You've just discovered a 5-year-old API key embedded in a script.
- A senior engineer has left and nobody is sure what they knew.
- You're preparing for ISO 27001, SOC 2, or Cyber Essentials Plus and secrets-management is the weakest control.
- An incident exposed that credentials were shared in a team chat.
- You're about to introduce a secrets vault and need a migration plan.

## What you'll get

A single Markdown document containing:

- **Secrets inventory / discovery procedure** (places to look, commands to run, what to record)
- A **secrets register schema** (where each secret lives, what it unlocks, who owns it)
- **Prioritisation matrix** (exposure × blast radius × age)
- **Phased rotation schedule** over 12 weeks
- **Rotation procedure** per secret class
- **Vaulting recommendation** (1Password / Bitwarden / HashiCorp Vault / cloud-native)
- **Runtime injection patterns** (stop hard-coding)
- **Leaver / role-change procedure** (what to rotate when someone leaves)
- **Ongoing hygiene controls** (expiry dates, alerts, audit cadence)

## Clarifying questions I will ask you

1. **Organisation size and engineering footprint?** (codebase count, cloud accounts, SaaS estate)
2. **Cloud providers in use?** (AWS, Azure, GCP, multi, on-prem)
3. **Source-control platform?** (GitHub, GitLab, Bitbucket, Azure DevOps)
4. **Secrets manager already in place?** (none, password manager for humans, dedicated vault)
5. **CI/CD tool(s)?** (GitHub Actions, GitLab CI, Jenkins, Azure Pipelines)
6. **Any incidents involving secrets in the last 12 months?**
7. **Recent high-profile leavers?**
8. **Known places secrets live today?** (.env files, 1Password shared vault, team chats, a developer's laptop, config files)
9. **Development / staging / production separation clarity?** (clear / muddy)
10. **Known third-party integrations with long-lived API keys?** (payment, email, monitoring, data warehouse)
11. **Any compliance driver?** (SOC 2, ISO 27001, Cyber Essentials Plus, client contract)
12. **Budget / tolerance for downtime during rotation?**

## Output template

```markdown
# Secrets Rotation Plan — <organisation>

**Owner:** <role> · **Start date:** <date> · **Target completion:** <+12 weeks>

## 1. Executive Summary
<One paragraph: why rotation now, estimated count of secrets in scope, top 3 risks, target end-state (vaulted, rotated, short-lived where possible).>

## 2. Discovery Procedure
### 2.1 Code scan (source-control)
Run the following against every active repository in scope:
- **gitleaks**: open-source, fast, detects common secret patterns in the working tree and in git history
- **trufflehog**: also scans git history, and can verify whether a detected credential is still active
- Platform-native: GitHub Advanced Security secret scanning, GitLab secret detection
- Record: repo, commit, file, line, pattern detected, current status (active / rotated / false-positive)

### 2.2 Infrastructure scan
- Cloud consoles: list all IAM users with access keys; list all service-account keys; list all certificates
- SSH: inventory every account's `~/.ssh/authorized_keys` on every server (config-management audit or one-off)
- Database: list all user accounts on production databases; flag any shared application accounts
- TLS / Cert: every certificate with expiry, issuer, subject, and where installed

### 2.3 Human-held
- 1Password / Bitwarden shared vaults: audit every item, confirm owner, last-modified date
- Team chats (Teams / Slack): search for "password", "key", "token" — rotate anything found
- Personal password managers storing shared credentials — these count as exposed

### 2.4 SaaS / third-party
- For every SaaS in the app register: where is the API key / OAuth credential stored; rotation cadence; owner
- Payment, email, data-warehouse integrations are highest priority

## 3. Secrets Register Schema
Every secret has a row:

| ID | Name | Class | Unlocks | Created | Last rotated | Owner | Stored where | Exposure | Priority |
|---|---|---|---|---|---|---|---|---|---|
| S-001 | Stripe live API key | API key | Payment processing | 2021 | never | CTO | AWS Secrets Manager | Low | High |
| S-002 | Sendgrid API key | API key | Transactional email | 2022 | 2024 | Eng Lead | Hard-coded in one service | High | Critical |
| ... | | | | | | | | | |

## 4. Prioritisation Matrix
**Priority = Exposure × Blast Radius × Age**.

| Exposure | Definition |
|---|---|
| Critical | In a public repo / known to a former team member / posted in chat / on an unmanaged laptop |
| High | In private repo / in .env files not in vault / shared in person |
| Medium | In vault but accessed by > 5 people, or shared across environments |
| Low | In vault, accessed by ≤ 5 people, no recent suspicious access |

| Blast Radius | Definition |
|---|---|
| Critical | Payment, production database, cloud-account admin, customer data |
| High | Production app, auth provider, sensitive SaaS |
| Medium | Staging, monitoring, non-critical SaaS |
| Low | Dev environment only, throwaway |

| Age | Action |
|---|---|
| > 24 months | Rotate regardless |
| 12-24 months | Rotate in plan |
| < 12 months | Rotate at scheduled review |

**Rotate immediately:** any secret with Critical Exposure, or with Critical Blast Radius and no clean rotation history.

## 5. Phased Rotation Schedule (12 weeks)
| Week | Focus | Count | Method |
|---|---|---|---|
| 1 | Discovery + baseline | All | Run scans, build register |
| 2 | **Critical exposure rotation** | Top 10 | Rotate immediately, adopt vault for each |
| 3 | **Payment & customer-data credentials** | All | Rotate, enable MFA on vendor console |
| 4 | **Production database secrets** | All | Coordinate with dev; move to runtime injection |
| 5 | **Cloud IAM long-lived keys → roles/OIDC** | All | Replace with short-lived where possible (e.g. GitHub Actions OIDC) |
| 6 | **SSH keys** | All production hosts | Replace shared keys with per-user; enforce via config management |
| 7 | **CI/CD secrets migration** | All | Move from repo env to CI secret store; enable masking |
| 8 | **Third-party SaaS API keys** | All | Rotate; document rotation cadence going forward |
| 9 | **Certificates approaching expiry** | All | Renew early; automate via ACME where applicable |
| 10 | **Human-held shared passwords** | All | Migrate to SSO or vaulted per-person; eliminate where possible |
| 11 | **Sweep: re-scan** | — | Repeat week 1 discovery — should be near-zero new findings |
| 12 | **Steady-state handover** | — | Document cadence, ownership, controls; train team |

## 6. Rotation Procedure per Class
### API keys (generic)
1. Generate new key in vendor console.
2. Store new key in vault; update runtime injection.
3. Deploy / restart consumers — verify healthy.
4. Revoke old key in vendor console.
5. Record in register: rotated date, rotator, verification evidence.

### Cloud IAM long-lived keys
Preferred: **eliminate**. Replace with short-lived credentials via:
- IAM Roles / AWS STS assume-role
- GitHub Actions / GitLab OIDC to cloud trust
- Azure Workload Identity / Google Workload Identity Federation

If a long-lived key must stay, rotate it every 90 days with automation.

### SSH keys
1. Generate new per-user Ed25519 keys.
2. Deploy via config management (Ansible, Chef, Salt).
3. Remove shared keys.
4. Rotate host keys if history is uncertain.

### Database credentials
1. Create new credentials.
2. Test against read replica / staging.
3. Deploy new credentials to runtime; grace period for long-running connections.
4. Revoke old.

### Certificates
Move to automated renewal (ACME / cert-manager) wherever possible. Alert on < 30-day expiry.

### Shared passwords
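Eliminate where possible: enable SSO, per-user accounts, MFA. Where a shared password must remain, vault it and grant access through each person's own vault account, so that use is attributable and access can be revoked per person.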

## 7. Vaulting Recommendation
For <org size and complexity>:
- **Humans (passwords, occasional API keys):** <1Password Business / Bitwarden Teams / vendor-integrated>
- **Services (runtime):** <AWS Secrets Manager / Azure Key Vault / HashiCorp Vault>
- **CI/CD secrets:** Native (GitHub Actions encrypted secrets, GitLab CI variables) with masking
- **Certificate management:** ACME with cert-manager / win-acme (Let's Encrypt for public certificates, an ACME-capable internal CA for internal PKI)

Cost estimate: £<N>/month for the vault stack at current scale.

## 8. Runtime Injection Patterns (stop hard-coding)
- Committed `.env` files: eliminate. Local dev uses a git-ignored `.env.local` plus a committed `.env.example` that holds placeholder values only.
- Production: inject via vault SDK or environment variable populated from vault at startup.
- CI/CD: secret store masking; never echo secrets in logs.
- Configuration files with secrets: use placeholders + init-time substitution from vault.

## 9. Leaver / Role-Change Procedure
When someone leaves or moves role:
- [ ] Revoke their SSO access (day 0, hour 0)
- [ ] Revoke their vault access
- [ ] Rotate any shared secrets they had access to, highest-exposure first
- [ ] Rotate any personal secrets they generated that remain in production (API keys tied to their account, SSH keys, tokens)
- [ ] Review any automation they owned; reassign ownership
- [ ] Remove from git-commit-signing allow-lists, CI/CD approver lists
- [ ] Disable and subsequently delete their account per retention policy

For involuntary terminations: complete all of the above within 1 hour of termination.

## 10. Ongoing Hygiene Controls
- Every secret has an owner and an expiry / review date in the register.
- Alerts: 30 / 14 / 7 days before any secret expiry (certificates, vendor keys, database credentials).
- Quarterly register review.
- Annual penetration-test scope includes secrets discovery.
- CI/CD gates: block PRs with detected secret patterns.
- Training: developer induction includes "where secrets live, where they don't".

## 11. Metrics
| Metric | Baseline | Target at week 12 | Ongoing target |
|---|---|---|---|
| Total secrets in register | unknown | <N> | All known |
| Secrets > 12 months old | unknown | 0 | 0 |
| Secrets in public repos | unknown (found <N>) | 0 | 0 |
| Secrets stored in vault | <N>% | 100% | 100% |
| CI/CD with OIDC / short-lived creds | <N>% | >70% | >90% |
| Certificates auto-renewing | <N>% | >80% | >95% |
| Leaver credential-rotation SLA met | <N>% | 100% | 100% |

## 12. Risks & Mitigations
| Risk | Mitigation |
|---|---|
| Rotation breaks production | Staging dry-run, grace period, rollback plan per rotation |
| Rotation missed on critical secret | Register with owner + expiry; automated alert |
| New secrets created off-register | CI/CD gate + quarterly re-scan |
| Vault becomes a single point of failure | High-availability deployment; break-glass recovery procedure |
| Staff resistance to changing workflow | Show, don't tell; pair-programming for first few rotations |
```
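
The prioritisation matrix in section 4 of the template multiplies three qualitative ratings, which is easier to sort on once each level has a number. The sketch below is one way to do that. The numeric weights, the `Secret` class, and the function names are assumptions for illustration and are not part of the template. `rotate_immediately` encodes one reading of the "Rotate immediately" rule: Critical Exposure always qualifies, and Critical Blast Radius qualifies when there is no clean rotation history.

```python
from dataclasses import dataclass

# Hypothetical weights: the template defines only the labels, not numbers.
LEVEL = {"Low": 1, "Medium": 2, "High": 3, "Critical": 4}


def age_weight(age_months: int) -> int:
    """Map a secret's age onto the three bands of the Age table."""
    if age_months > 24:
        return 3  # rotate regardless
    if age_months >= 12:
        return 2  # rotate in plan
    return 1      # rotate at scheduled review


@dataclass
class Secret:
    name: str
    exposure: str
    blast_radius: str
    age_months: int
    clean_rotation_history: bool = False


def priority_score(s: Secret) -> int:
    """Priority = Exposure x Blast Radius x Age, using the weights above."""
    return LEVEL[s.exposure] * LEVEL[s.blast_radius] * age_weight(s.age_months)


def rotate_immediately(s: Secret) -> bool:
    """Critical exposure, or critical blast radius without clean history."""
    return s.exposure == "Critical" or (
        s.blast_radius == "Critical" and not s.clean_rotation_history
    )


# Example rows with made-up ages; sort the register highest score first.
secrets = [
    Secret("Stripe live API key", "Low", "Critical", 48),
    Secret("Sendgrid API key", "High", "High", 20, clean_rotation_history=True),
]
for s in sorted(secrets, key=priority_score, reverse=True):
    print(s.name, priority_score(s), rotate_immediately(s))
```

The score is only a sort key for the register. The override in `rotate_immediately` matters more than the number, because a vaulted but never-rotated payment key can score below a hard-coded email key and still belong in week 2.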

## Example invocation

**User:** "20-person London fintech. AWS single account, GitHub, a dozen SaaS integrations (Stripe, Sendgrid, Segment, Datadog, etc.). Just discovered our Sendgrid key has been in our main repo's README for 18 months. Nobody rotated anything this year."

**What the skill will do:**
1. Ask the 12 questions, drilling on: exactly which repos are public / internal, what CI uses the secrets, whether AWS is using IAM roles or long-lived access keys.
2. Produce the plan with **week 2 rotating the Sendgrid key immediately** (README exposure, Critical if the repo is public), **week 3** covering Stripe + customer-data credentials, **week 5** moving GitHub Actions to AWS OIDC to eliminate long-lived cloud keys.
3. Recommend 1Password Business for humans + AWS Secrets Manager for services — modest budget fit for a 20-person fintech.
4. Flag that the Sendgrid key should be treated as potentially compromised: rotate it, review Sendgrid activity logs and recently sent email for anomalies, and consider notifying the ICO within 72 hours if the key was used without authorisation to send emails containing personal data.
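
A key sitting in a README, as in this example, is the kind of finding the discovery step exists to catch. The sketch below is a minimal illustration of pattern-based scanning over a working tree. It is not how gitleaks or trufflehog work internally, and it does not look at git history. The two regular expressions (the `AKIA` prefix shape of an AWS access key ID and a generic quoted `key = "..."` assignment), the skip list, and the name `scan_tree` are all assumptions chosen for the example.

```python
import re
from pathlib import Path

# Illustrative patterns only; real scanners ship far larger rule sets.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic-assignment": re.compile(
        r"""(api[_-]?key|secret|token|password)\s*[:=]\s*["']([^"'\s]{16,})["']""",
        re.IGNORECASE,
    ),
}

# Directories that are noisy or irrelevant for a working-tree scan.
SKIP_DIRS = {".git", "node_modules", ".venv"}


def scan_tree(root: str):
    """Yield (path, line_number, pattern_name) for each suspected secret."""
    base = Path(root)
    for path in base.rglob("*"):
        if not path.is_file():
            continue
        if SKIP_DIRS & set(path.relative_to(base).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name, pattern in PATTERNS.items():
                if pattern.search(line):
                    yield str(path), lineno, name
```

Calling `list(scan_tree("."))` from a repository root returns the hits as tuples that map onto the "repo, file, line, pattern detected" fields of the discovery record. A hit is only a candidate: each one still needs triage into active, rotated, or false-positive.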

## Notes for the requester

- **Start with discovery, not rotation.** Rotating one loud secret doesn't help if ten quiet ones are still exposed.
- **Replace, don't rotate, where you can.** Long-lived API keys should become short-lived tokens via OIDC / workload identity wherever the vendor supports it. Rotation cadence for short-lived tokens: minutes, not months.
- **Plan the break.** Every rotation of a production secret has a risk of downtime. Dry-run in staging, schedule a window, have rollback ready.
- **Leaver process is the biggest gap** in most SMBs. A manual HR-to-IT handoff misses secrets that aren't in a central register.
- **Public-repo exposure = treat as compromised.** Rotate immediately and investigate whether the key was used by anyone but you.
- **Good looks like:** 12 weeks in, every secret is in the register, none are older than 12 months, CI/CD uses OIDC, no shared passwords outside the vault, and your ISO 27001 / SOC 2 control owner signs off without pushback.
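
The "none older than 12 months" target and the 30 / 14 / 7-day expiry alerts both reduce to date arithmetic over the register. The sketch below assumes the register is available as a list of dictionaries with the hypothetical keys `id`, `last_rotated`, and `expires`, since the template does not prescribe a storage format. It approximates 12 months as 365 days and assumes the check runs once a day, so each alert fires only on the exact threshold day.

```python
from datetime import date, timedelta

ALERT_DAYS = (30, 14, 7)        # alert thresholds from the hygiene controls
MAX_AGE = timedelta(days=365)   # approximation of "12 months"


def review(register, today):
    """Return (stale_ids, alerts) for a list of register rows."""
    stale, alerts = [], []
    for row in register:
        last = row.get("last_rotated")  # None means "never rotated"
        if last is None or today - last > MAX_AGE:
            stale.append(row["id"])
        expires = row.get("expires")    # None means no fixed expiry
        if expires is not None:
            days_left = (expires - today).days
            if days_left < 0:
                alerts.append((row["id"], "expired"))
            elif days_left in ALERT_DAYS:
                alerts.append((row["id"], f"expires in {days_left} days"))
    return stale, alerts


# Example rows with made-up dates.
register = [
    {"id": "S-001", "last_rotated": None, "expires": None},
    {"id": "S-002", "last_rotated": date(2026, 1, 10), "expires": date(2026, 4, 15)},
]
stale, alerts = review(register, today=date(2026, 4, 1))
print(stale, alerts)
```

Run on a schedule, the `stale` list feeds the "Secrets > 12 months old" metric and `alerts` feeds whatever notification channel the team already watches.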

---
*VantagePoint Networks · <https://www.vpnetworks.co.uk> · Authored by Hak · Free under the MIT licence*
