---
name: cutover-day-runbook
description: Produce the cutover-day operational runbook for a migration — hour-by-hour plan, go/no-go gates, comms cadence, rollback triggers, and a minute-by-minute first-60-minutes script.
version: 1.0.0
author: VantagePoint Networks
author_url: https://www.vpnetworks.co.uk
audience: IT Managers, Migration Leads, Project Managers, MSP Technical Consultants, Platform Engineers
output_format: Formatted Markdown cutover runbook with pre-cutover checks, hour-by-hour plan, named roles, go/no-go gates, comms cadence, rollback plan, and post-cutover verification.
license: MIT
last-reviewed: 2026-04
---

# Cutover-Day Runbook

A Claude Code skill for the day you're actually flipping the switch — not the migration plan you wrote months ago, but the hour-by-hour runbook the on-the-night team holds in their hand.

## How to use this skill

1. Download this `SKILL.md` file.
2. Place it in `~/.claude/commands/` (macOS/Linux) or `%USERPROFILE%\.claude\commands\` (Windows).
3. In Claude Code, run `/cutover-day-runbook`. Describe the cutover in plain English. Answer the clarifying questions. Receive a runbook ready for the bridge.

## When to use this

- You're cutting over a system this weekend and the migration-planner output is too high-level for the team on the bridge at 2am.
- A vendor is running the technical work and you need to brief your side's observers and rollback decision-makers.
- You want a single shareable document for the runbook, comms plan, and rollback plan — no Jira-ticket hunting during the change window.
- Your CAB record references a detailed runbook and you need to attach one.
- You want pre-cutover-day dress-rehearsal steps and day-of go/no-go criteria on one page.

## What you'll get

A single Markdown document containing:

- **Pre-cutover checklist** (T-7 days through T-0)
- **Named roles** (lead, comms, observer, business sponsor, rollback approver, on-call escalation)
- **Bridge details** (call link, backup channel, start time, expected duration)
- **Go/no-go decision points** with explicit pass criteria
- **Hour-by-hour plan** (T-0 start through T+end)
- **First-60-minutes minute-by-minute script**
- **Comms cadence** (internal, stakeholder, external customer, social/status page)
- **Rollback triggers + rollback plan**
- **Point of no return** (explicit timestamp)
- **Post-cutover verification checklist** (0-30 min / 30 min-2 h / 2-24 h)
- **Known risks and who decides**
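
Every step in the runbook is timed as an offset from the cutover start. As a minimal sketch, assuming offsets are written as `T+H:MM` or `T-H:MM` and the start time is timezone-aware, the conversion to wall-clock times the bridge can read aloud might look like this. The function names and the sample start time are illustrative and not part of the skill.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches labels such as "T+1:30" or "T-0:05" (assumed label format).
_OFFSET = re.compile(r"^T([+-])(\d+):(\d{2})$")


def parse_offset(label: str) -> timedelta:
    """Turn a T-relative label into a signed timedelta."""
    match = _OFFSET.match(label.strip())
    if match is None:
        raise ValueError(f"not a T+H:MM / T-H:MM label: {label!r}")
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return delta if sign == "+" else -delta


def wall_clock(start: datetime, label: str) -> datetime:
    """Wall-clock time of a runbook step, given the cutover start."""
    return start + parse_offset(label)


# Hypothetical window: swap in the real start time and timezone.
start = datetime(2026, 4, 18, 23, 0, tzinfo=timezone.utc)
for label in ["T-1:00", "T+0:00", "T+0:30", "T+2:00", "T+4:00"]:
    print(label, wall_clock(start, label).strftime("%Y-%m-%d %H:%M %Z"))
```

Printing both the offset and the wall-clock time keeps the plan readable if the start slips: only `start` changes, and every row moves with it.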

## Clarifying questions I will ask you

1. **What's cutting over?** (system, scope, source → target in one sentence)
2. **When?** (date, start time, timezone, expected duration)
3. **What business service depends on this?** (who feels it if something goes wrong)
4. **Is there user-facing downtime?** (yes: for how long? / no, but degraded: for how long?)
5. **Vendor or internal execution?**
6. **Who's on the bridge?** (roles, not names if you don't know yet)
7. **What are the go/no-go criteria at T-0?**
8. **What's your point of no return?** (after which you can't roll back)
9. **Has this been rehearsed?** (full dress rehearsal / partial / never)
10. **Primary rollback approach?** (restore snapshot, failover to standby, DNS flip back, replace hardware, revert config)
11. **Who's the named rollback approver?** (one person — not a committee)
12. **Business communications required?** (none, internal-only, external customer, regulator)

## Output template

```markdown
# Cutover Runbook — <system> — <date>

**Change reference:** CHG-<N> · **Cutover window:** <start> – <end> <TZ>
**Bridge:** <link> · **Backup:** <channel / phone>
**Lead:** <role> · **Rollback approver:** <role>

## 1. Cutover Summary
<One paragraph in business language. What changes for whom, when.>

## 2. Roles on the bridge
| Role | Responsibility | Who (name) |
|---|---|---|
| Change Lead | Runs the bridge, calls go/no-go | |
| Technical Lead | Executes the steps | |
| Observer | Independent verification | |
| Comms Lead | Drafts and sends updates | |
| Business Sponsor | Authorises scope/scheduling changes | |
| Rollback Approver | Single decision-maker on rollback | |
| On-Call Escalation | Reachable if rollback approver is not | |

## 3. Pre-Cutover Checklist
### T-7 days
- [ ] Runbook reviewed and approved by <role>
- [ ] Dress rehearsal completed — issues logged, remediated
- [ ] Snapshot / backup of all affected systems confirmed
- [ ] Rollback tested end-to-end

### T-48 hours
- [ ] Change notice sent to stakeholders (template in §10)
- [ ] Alert silencing scheduled for the cutover window (if appropriate)
- [ ] Out-of-hours contacts confirmed for each role
- [ ] Bridge link and backup channel tested

### T-24 hours
- [ ] Go/no-go meeting held and decision recorded
- [ ] Final backup confirmed running and verified
- [ ] Known dependencies checked (DNS TTLs lowered if needed, pending jobs paused)

### T-1 hour
- [ ] Bridge open
- [ ] Roll-call of all named roles
- [ ] Final weather check (system health, change-freeze status, external alerts)

## 4. Go / No-Go Gate (T-0)
All must be ✅ to proceed:
- [ ] All roles present on bridge
- [ ] Backup / snapshot timestamp < <N> hours old
- [ ] Source system healthy (no active incident)
- [ ] Target system healthy (no active incident)
- [ ] No external dependency in degraded state
- [ ] Comms pre-approved message staged
- [ ] Rollback approver confirmed reachable until <T+N>

**Decision:** GO / NO-GO (recorded by: ______)

## 5. Hour-by-Hour Plan
| Time | Action | Owner | Expected duration | Verification |
|---|---|---|---|---|
| T+0:00 | Comms: "Cutover starting" | Comms | — | Channel updated |
| T+0:05 | <step 1> | Tech | <min> | <check> |
| T+0:15 | <step 2> | Tech | <min> | <check> |
| T+0:30 | Go/no-go gate 2 (checkpoint) | Lead | — | — |
| T+0:45 | <step 3> | Tech | <min> | <check> |
| T+1:00 | Comms: progress update | Comms | — | — |
| T+1:30 | <step 4> | Tech | <min> | <check> |
| T+2:00 | Point of no return: rollback no longer possible via <mechanism> | Lead | — | Decision recorded |
| T+2:30 | <final step> | Tech | <min> | <check> |
| T+3:00 | Verification phase (see §8) | Observer | <min> | Sign-off |
| T+3:30 | Comms: "Cutover complete — in verification" | Comms | — | — |
| T+4:00 | Hand-back to operations / close bridge | Lead | — | Bridge closed |

## 6. First-60-Minutes Script (the one you'll read from)
**T+0:00** — Lead says: "Cutover is starting at <time>. I'm <name>, Change Lead. Acknowledge please:" [roll-call]. Comms: post "Cutover has started" to <channel>.

**T+0:02** — Tech executes <first command>. Observer confirms: <what they're watching>. Lead asks for "expected / unexpected" verdict.

**T+0:05** — <next step> …

(continues per-minute for the first hour — the most error-prone window)

## 7. Point of No Return
Defined as: **<specific event — e.g. DNS cutover propagated, source database taken read-only, new URL activated>**.
Expected at: **T+<N>:<NN>**.
After this: rollback is no longer clean. Proceeding assumes forward-fix only.

## 8. Verification Checklist (post-cutover)
### 0-30 minutes
- [ ] <key functional test 1> — expected result
- [ ] <key functional test 2> — expected result
- [ ] <authentication test> — expected result
- [ ] <monitoring signal> — within normal range
- [ ] <dependent system> healthy

### 30 minutes – 2 hours
- [ ] <end-to-end user journey> — executed and logged
- [ ] <integration with external service> — confirmed
- [ ] Performance within <N>% of baseline
- [ ] No unusual error rate in logs

### 2-24 hours (next-morning owner: <role>)
- [ ] Overnight batch / scheduled job ran successfully
- [ ] First user logins of the next working day succeed
- [ ] Morning monitoring review shows stable state

## 9. Rollback Triggers & Plan
Rollback is invoked if ANY of these occur before the point of no return:
- <specific failure 1>
- <specific failure 2>
- <verification step> fails and cannot be remediated within <N> minutes
- Business Sponsor requests rollback

### Rollback plan (tested <date>, estimated duration <N> min)
1. Comms: post "Rolling back — target restore time <T+N>" to <channel>
2. <rollback step 1> — owner, duration
3. <rollback step 2> — owner, duration
4. <rollback step 3> — owner, duration
5. Verify source system healthy (§8 checks against source baseline)
6. Comms: "Rollback complete. Service on <source system> as normal."
7. Post-rollback review scheduled within <N> working days.

### After the point of no return
Rollback is no longer available; failures are handled by forward-fix. On-call escalation: <role>. SLA for forward-fix: <target>.

## 10. Comms Cadence
| Time | Audience | Channel | Message |
|---|---|---|---|
| T-48h | All affected | Email | "Planned work on <system>…" |
| T-4h | Service Desk | Ticket note | "Heads up — cutover tonight" |
| T-0 | Internal | Teams/Slack | "Cutover started" |
| T+0:30, then every 30 min | Internal | Teams/Slack | Progress update (even if nothing new) |
| Every 30 min while in progress | External customers | Status page | Per §11 templates |
| T+complete | All | Email | "Cutover complete" |
| T+24h | Leadership | Email | Post-cutover summary with metrics |

## 11. Message Templates
### Internal — cutover started
> "Cutover of <system> has started at <time>. Expected completion <time>. Bridge: <link>. Updates every 30 minutes."

### External — if customer-facing
> "Planned maintenance on <service> is in progress. Some users may experience <impact>. ETA <time>. We'll update this page every 30 minutes. Apologies for any inconvenience."

### Rollback
> "We've taken the decision to roll back tonight's change to ensure stability. <Service> remains on the previous configuration. No action needed from users. Next attempt will be rescheduled."

### Completion
> "Cutover of <system> is complete at <time>. Service is now running on <new configuration>. Please <any action user needs to take>. Any issues: <contact>."

## 12. Known Risks & Who Decides
| Risk | If it happens | Owner | Escalate to |
|---|---|---|---|
| <risk> | <impact> | <role> | <role> |
| <risk> | <impact> | <role> | <role> |

## 13. Post-Cutover (T+24h)
- [ ] Post-cutover review scheduled
- [ ] Monitoring alerts re-enabled
- [ ] Runbook updated with actuals (deviations from plan)
- [ ] Change record closed with CAB, with PIR (Post-Implementation Review) attached
- [ ] Lessons learned captured in <register>
```
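
The go/no-go gate in §4 of the template is an all-must-pass checklist plus one numeric condition, the backup age. A minimal sketch of that evaluation, assuming the checklist is captured as name/boolean pairs and the backup timestamp is timezone-aware, could look like the following. The function names, check labels, and sample values are hypothetical; the recorded decision still belongs to the Change Lead.

```python
from datetime import datetime, timedelta, timezone


def gate_failures(
    checks: dict[str, bool],
    backup_taken_at: datetime,
    max_backup_age: timedelta,
    now: datetime,
) -> list[str]:
    """Return every gate item that blocks a GO. An empty list means GO."""
    failures = [name for name, passed in checks.items() if not passed]
    # The template requires the backup to be strictly younger than the limit.
    if now - backup_taken_at >= max_backup_age:
        failures.append("backup / snapshot too old")
    return failures


def decision(failures: list[str]) -> str:
    return "GO" if not failures else "NO-GO"


# Hypothetical T-0 snapshot of the gate.
now = datetime(2026, 4, 18, 23, 0, tzinfo=timezone.utc)
checks = {
    "all roles present on bridge": True,
    "source system healthy": True,
    "target system healthy": True,
    "no external dependency degraded": False,
    "pre-approved comms staged": True,
    "rollback approver reachable": True,
}
failures = gate_failures(
    checks,
    backup_taken_at=now - timedelta(hours=3),
    max_backup_age=timedelta(hours=6),
    now=now,
)
print(decision(failures), failures)
```

Returning the list of failures rather than a bare boolean gives the Change Lead something specific to read out when the answer is NO-GO.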

## Example invocation

**User:** "Moving a 200-mailbox tenant from Microsoft 365 E1 to Business Premium Saturday 11pm, expected 4 hours. Vendor is doing the licence migration, we're on the bridge verifying mail flow and Teams. One partner is nervous because last attempt failed."

**What the skill will do:**
1. Ask the 12 clarifying questions, pressing on: rollback mechanism (can you revert the licence allocation?), point of no return (when do legacy features become unavailable?), last-time failure mode.
2. Produce the runbook with hour-by-hour plan, first-60-minutes script focused on the first-mail-send verification and Teams-presence checks, a rollback trigger specifically tied to "mail flow not restored by T+1:30", and explicit comms to the nervous partner at T+1 and T+2.
3. Flag that the point of no return is usually "E1 licences fully released" — after which, rollback means buying E1 again. Suggest holding release until verification is complete.
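
A deadline-based rollback trigger such as "mail flow not restored by T+1:30" can be expressed as a simple rule: the trigger fires if its check has not passed by the deadline and the point of no return has not yet been reached. The sketch below illustrates that rule under stated assumptions: the class and function names are hypothetical, and the T+2:00 point of no return is an example value, not something the skill prescribes. It only surfaces fired triggers; the rollback decision stays with the named rollback approver.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DeadlineTrigger:
    name: str            # the check that must pass
    deadline: timedelta  # offset from T+0 by which it must have passed


def fired_triggers(
    elapsed: timedelta,
    point_of_no_return: timedelta,
    triggers: list[DeadlineTrigger],
    passed_checks: set[str],
) -> list[str]:
    """Names of triggers that have fired at `elapsed` time into the cutover."""
    # After the point of no return the runbook is forward-fix only.
    if elapsed >= point_of_no_return:
        return []
    return [
        t.name
        for t in triggers
        if elapsed >= t.deadline and t.name not in passed_checks
    ]


mail_flow = DeadlineTrigger("mail flow restored", timedelta(hours=1, minutes=30))
fired = fired_triggers(
    elapsed=timedelta(hours=1, minutes=35),
    point_of_no_return=timedelta(hours=2),  # example value
    triggers=[mail_flow],
    passed_checks=set(),
)
print(fired)
```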

## Notes for the requester

- **Write this document for the people on the bridge at 2am, not for the PM who commissioned it.** Short sentences. Explicit times. Named owners.
- **The first 60 minutes are the most error-prone.** Spend disproportionate detail there.
- **A rollback plan nobody rehearsed is a plan that doesn't exist.** Rehearsal output goes in §9.
- **Point of no return must be specific.** "Once we've cut DNS" is not specific. "At T+1:45 when DNS TTL of 300s has fully propagated as verified by <dig query>" is.
- **Good looks like:** the bridge starts, follows the runbook minute-by-minute, and the Change Lead never has to type anything longer than a two-line update. The document did the heavy lifting.
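
The DNS example above rests on simple arithmetic: a resolver that honours TTLs can serve the old record for at most one TTL after the change. As a rough sketch, assuming the record is changed at T+1:40 (a hypothetical time chosen so the result matches the T+1:45 in the note) and the TTL in effect before the change was 300 seconds:

```python
from datetime import timedelta


def earliest_full_propagation(
    change_offset: timedelta, previous_ttl_seconds: int
) -> timedelta:
    """Offset from T+0 after which TTL-honouring caches have dropped the old record."""
    if previous_ttl_seconds < 0:
        raise ValueError("TTL cannot be negative")
    return change_offset + timedelta(seconds=previous_ttl_seconds)


def format_offset(offset: timedelta) -> str:
    """Render a non-negative offset in the runbook's T+H:MM style."""
    if offset < timedelta(0):
        raise ValueError("only non-negative offsets are supported")
    total_minutes = int(offset.total_seconds() // 60)
    return f"T+{total_minutes // 60}:{total_minutes % 60:02d}"


change_made_at = timedelta(hours=1, minutes=40)  # hypothetical change time
print(format_offset(earliest_full_propagation(change_made_at, 300)))
```

The TTL that matters is the one resolvers cached before the change, which is why the pre-cutover checklist lowers it in advance. The calculation gives the earliest time to expect full propagation; it does not replace the verification query, since some resolvers hold records longer than the TTL.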

---
*VantagePoint Networks · <https://www.vpnetworks.co.uk> · Authored by Hak · Free under the MIT licence*
