---
name: oncall-rota-builder
description: Build a fair, documented on-call rota for a small IT team — with coverage pattern, primary/secondary, handoff protocol, compensation, bank-holiday rules, exception handling, and a published schedule.
version: 1.0.0
author: VantagePoint Networks
author_url: https://www.vpnetworks.co.uk
audience: IT Managers, Service Delivery Managers, MSP Operations Leads, SRE Managers, Solo Founders about to hire their second engineer
output_format: Formatted Markdown rota document with coverage pattern, rota table, handoff protocol, compensation model, exception procedures, and escalation path.
license: MIT
last-reviewed: 2026-04
---

# On-Call Rota Builder

A Claude Code skill for designing the on-call rota that every growing IT team eventually needs — before burnout drives people out, or before a single-person dependency creates a real crisis.

## How to use this skill

1. Download this `SKILL.md` file.
2. Save it as `oncall-rota-builder.md` in `~/.claude/commands/` (macOS/Linux) or `%USERPROFILE%\.claude\commands\` (Windows); Claude Code takes the command name from the file name.
3. In Claude Code, run `/oncall-rota-builder`. Describe the team and service. Answer the clarifying questions. Receive the rota document.

## When to use this

- You've been the informal on-call for everyone and it's unsustainable.
- You hired your second or third engineer and need a structure.
- An MSP is onboarding a new client and the contract requires 24/7 coverage.
- A tribunal, insurer, or client has asked "what's your out-of-hours support posture?"
- You've had a close call where nobody was reachable during an incident.

## What you'll get

A single Markdown document containing:

- **Coverage pattern** (who's on when, for how long)
- **Rota table** (named weeks ahead + holiday handling)
- **Primary / secondary / escalation** roles
- **Handoff protocol** (at start / end of shift)
- **Compensation model** (time off, allowance, overtime)
- **Bank-holiday and weekend rules**
- **Exception handling** (sickness, personal emergency, swap requests)
- **Escalation path** (when primary can't cope)
- **Rules for the on-call engineer** (response time, tooling, work-from-home readiness)
- **Review cadence** (monthly rota health-check)

## Clarifying questions I will ask you

1. **Team size (engineers available for rota)?**
2. **Coverage window required?** (24/7 / business hours extended / evenings + weekends only)
3. **Geographic spread?** (all UK / follow-the-sun with EU / US offices)
4. **Services in scope?** (which need immediate response vs. next-business-day)
5. **Expected alert volume per week?** (light, moderate, heavy)
6. **Existing tooling?** (PagerDuty, Opsgenie, phone-tree, nothing)
7. **Legal / contractual response-time commitments?**
8. **Budget for on-call compensation?** (allowance / hourly / TOIL / none)
9. **Any staff who can't be on call?** (caring responsibilities, medical, contractual)
10. **Bank-holiday expectation?** (full coverage / reduced / skip)
11. **Swap flexibility?** (strict rota / swaps allowed / tradeable)
12. **What's broken today?** (informal-only / one person carries it all / coverage gaps / response-time misses)

## Output template

```markdown
# On-Call Rota — <team> — <effective date>

**Owner:** <role> · **Tool:** <tool> · **Review:** monthly

## 1. Coverage Model
- **Coverage window:** <e.g. 24/7 / 08:00-22:00 weekdays + 10:00-18:00 weekends / business hours only>
- **Services in scope:** <list with priority>
- **Response targets:**
  - Acknowledgement: < 15 min (P1), < 30 min (P2)
  - Initial diagnosis / triage: < 1 hour
  - Engagement of secondary if needed: < 30 min from primary request
- **Out of scope:** <e.g. non-urgent tickets → next business day>

## 2. Roles
| Role | Responsibility |
|---|---|
| Primary on-call | First responder. Carries pager. Must be reachable and capable of acting within SLA. |
| Secondary on-call | Backup. Consulted when primary needs help or is unavailable. Also carries a pager, but should expect far fewer pages. |
| Escalation (Manager-on-call) | Called when primary and secondary together cannot resolve the issue, or when business/customer decisions are needed. |
| Subject-matter experts (SMEs) | Non-rota specialists who may be paged for specific issues (DBA, Network SME, Security). Informal/best-effort. |

## 3. Rota Table (next 12 weeks, indicative)
| Week | Primary | Secondary | Manager-on-call |
|---|---|---|---|
| W1 | Eng A | Eng B | Mgr |
| W2 | Eng B | Eng C | Mgr |
| W3 | Eng C | Eng D | Mgr |
| W4 | Eng D | Eng A | Mgr |
| W5 | Eng A | Eng B | Mgr |
| ... | | | |

**Rotation pattern:** weekly (Mon 09:00 to Mon 09:00 handoff).
**Maximum frequency:** 1-in-4 (never more than every 4th week for primary).
**Maximum consecutive weeks:** 1 on primary, optionally preceded by 1 on secondary (as in the table above, where each week's secondary is the next week's primary).

## 4. Handoff Protocol
Every shift ends and begins with a handoff. Template: see `/oncall-handoff-writer` skill output.

Handoff meeting:
- **Monday 09:00:** 15-min call between outgoing and incoming primary.
- Outgoing presents: open issues, recent changes, watch items, any silenced alerts, any scheduled changes.
- Incoming confirms: has pager access, runbook locations, contact list, no conflicts for the week.

## 5. Compensation Model
| Component | Value |
|---|---|
| On-call allowance (per week of primary) | £<N> |
| On-call allowance (per week of secondary) | £<N> |
| Paid call-out (actual work during out-of-hours) | hourly rate × 1.5, minimum <N> hours |
| TOIL (time off in lieu) — alternative | 1:1 hours for work done between 22:00 - 08:00 |
| Bank holiday on-call | double allowance |
| Shift swap approved | no financial adjustment; hours balance managed by rota owner |

## 6. Bank Holidays & Weekends
- Full coverage on all working days.
- Reduced-alert weekends (P1 only, P2 → next working day unless customer SLA differs).
- Bank-holiday weeks covered by the primary whose normal rotation hits that week.
- December / summer holidays: named swap-partners agreed 8 weeks in advance.

## 7. Rules for the Primary On-Call
- **Acknowledge every page** within target, even if you're about to investigate; silence is a bad signal.
- **Reachable means:** phone on, volume up, in a location with signal and charging available.
- **Work-from-home-capable:** laptop + power + network within reach.
- **If you consume alcohol or anything impairing, hand off to secondary immediately.** Not punishable; expected.
- **You don't have to solve every issue alone.** Escalate to secondary, SMEs, or escalation-manager.
- **Document everything in the incident channel.** Even if unsure. Audit trail matters.
- **End of shift, you hand off.** You don't keep carrying an open issue into your personal time.

## 8. Exception Handling
### Sickness during shift
- Notify rota owner + secondary immediately.
- Secondary covers until the rota owner finds a replacement or formally makes the secondary the primary for the rest of the shift.
- Full compensation for the replacement / secondary taking over.

### Personal emergency
- Phone / text rota owner. No need for details.
- Same handoff as sickness.

### Swap request
- Both parties email rota owner ≥ 48 hours before (where possible).
- Rota owner confirms in writing.
- Swap registered in the tool and updated in the schedule.

### Opt-out
- Documented reasons: caring responsibilities, medical, contractual (e.g. part-time).
- Rota owner ensures opt-outs don't concentrate load on remaining staff; hiring discussion if it does.

## 9. Escalation Path
When primary is overwhelmed or stuck:
1. Secondary (already warm via handoff)
2. SME (if the issue needs specialist technical knowledge)
3. Manager-on-call (decisions, not technical)
4. External vendor TAM / MSP escalation line
5. Client sponsor (only for customer-sponsored incidents, typically P1+)

Each escalation step is documented in the incident channel so the trail is audit-ready.

## 10. Review Cadence
### Weekly
- Rota owner reviews: pages received, who was paged, any SLA misses, any swaps.

### Monthly (30-min meeting)
- Total pages / person.
- Distribution (fair?).
- Issues with handoffs.
- Swap frequency (high = pattern?).
- Upcoming holidays / gaps.
- Changes to rota or policy agreed here.

### Quarterly
- Compensation review (still appropriate for the load?).
- Tooling review.
- Team satisfaction sentiment.

### Annual
- Full rota policy review.
- Hiring needs based on load trends.

## 11. Tooling
- **Pager:** <PagerDuty / Opsgenie>.
- **Schedule management:** integrated with pager tool.
- **Handoff document:** stored in <location>, shared with incoming primary.
- **Incident documentation:** <Teams / Slack channel + incident-management tool>.
- **Runbook library:** <location>.

## 12. What Good Looks Like
- No engineer on call more than 1-in-4 weeks.
- Fewer than <N> pages per primary-week on average; fewer than <N> of those actionable after tuning (link to alert-triage-reviewer).
- Handoff meeting happens every Monday.
- Zero SLA misses in the quarter, or documented reasons.
- Team retention on the rota > 12 months.

## 13. What Bad Looks Like (and what to do)
- Same person carrying > 40% of pages → distribute via tuning or hiring.
- Handoff meetings skipped → rota owner enforces; it's not optional.
- Swap frequency > 50% of weeks → life-reasons or rota design problem — investigate.
- Regular out-of-hours work without compensation tracking → audit TOIL usage; enforce logging.
- People leaving citing on-call → urgent review: too often, too noisy, or not enough support.
```
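
The rotation in section 3 of the template can be generated mechanically rather than typed by hand. The following is a minimal Python sketch, not part of the skill itself: `build_rota`, `to_markdown`, and the `secondary_offset` parameter are hypothetical names. By default it assumes the pattern in the indicative table, where each week's secondary is the following week's primary; passing `secondary_offset=-1` would instead make the outgoing primary the next week's secondary.

```python
from datetime import date, timedelta


def build_rota(engineers, start_monday, weeks=12, manager="Mgr", secondary_offset=1):
    """Return one tuple per week: (label, week start, primary, secondary, manager).

    ASSUMPTION: simple round-robin. With secondary_offset=1 the secondary is
    next week's primary; with -1 the secondary is last week's primary.
    """
    if len(engineers) < 4:
        # The template caps primary duty at 1-in-4, which needs at least 4 people.
        raise ValueError("need at least 4 engineers for a 1-in-4 primary rotation")
    if start_monday.weekday() != 0:
        raise ValueError("rotation weeks start on a Monday (09:00 handoff)")
    if secondary_offset % len(engineers) == 0:
        raise ValueError("secondary must be a different person from primary")

    n = len(engineers)
    rows = []
    for i in range(weeks):
        primary = engineers[i % n]
        secondary = engineers[(i + secondary_offset) % n]
        week_start = start_monday + timedelta(weeks=i)
        rows.append((f"W{i + 1}", week_start, primary, secondary, manager))
    return rows


def to_markdown(rows):
    """Render the rows as a Markdown table in the template's column order."""
    lines = ["| Week | Primary | Secondary | Manager-on-call |", "|---|---|---|---|"]
    for label, _start, primary, secondary, manager in rows:
        lines.append(f"| {label} | {primary} | {secondary} | {manager} |")
    return "\n".join(lines)


if __name__ == "__main__":
    rota = build_rota(["Eng A", "Eng B", "Eng C", "Eng D"], date(2026, 4, 6), weeks=5)
    print(to_markdown(rota))
```

Run with four engineers, this reproduces the W1 to W5 rows shown in the template. Holidays, opt-outs, and approved swaps are deliberately left out; they would be applied as manual overrides by the rota owner.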

## Example invocation

**User:** "5-engineer IT team at a 120-person London fintech. 24/7 coverage needed (FCA operational resilience). Currently: informal phone-tree, senior engineer carries most of it, people burning out."

**What the skill will do:**
1. Ask 12 questions, pressing on: who's currently opting out and why, the impact tolerances the firm has set for its important business services under the FCA operational resilience rules (these inform the acknowledgement SLA), and whether any engineers are contractually unable to do on-call.
2. Produce the rota with:
   - 1-in-4 primary rotation (of the 5 engineers, the 4 who stay in the primary pool each do one week on, three off)
   - Paired secondary (different engineer, quieter week)
   - Senior-engineer carrying manager-on-call role (not primary) — removes the burden they've had
   - Allowance model: £250/week primary, £100/week secondary, double for bank holidays
   - Specific Monday 09:00 handoff meeting protocol
   - 24/7 coverage commitments with response targets matched to the firm's FCA impact tolerances
3. Flag that hiring a 6th engineer within 6-9 months is likely necessary given the coverage + growth pattern.
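
To sanity-check the budget implied by the allowance figures in this example, the arithmetic can be sketched as below. This is an illustration only. `annual_allowance_cost` is a hypothetical helper; it assumes 52 rota weeks, and it assumes "double for bank holidays" means the whole week's allowance is doubled for any week containing a bank holiday, which the rota owner would need to confirm. Paid call-outs and TOIL are excluded.

```python
def annual_allowance_cost(primary_weekly, secondary_weekly,
                          bank_holiday_weeks, weeks=52):
    """Estimate yearly standby allowance spend in pounds.

    ASSUMPTION: a week containing a bank holiday pays double allowance for
    the whole week. Paid call-outs and TOIL are not included.
    """
    if not 0 <= bank_holiday_weeks <= weeks:
        raise ValueError("bank_holiday_weeks must be between 0 and weeks")
    weekly = primary_weekly + secondary_weekly
    normal_weeks = weeks - bank_holiday_weeks
    return normal_weeks * weekly + bank_holiday_weeks * weekly * 2


# Figures from the example: £250 primary, £100 secondary.
# 7 bank-holiday weeks is an illustrative input, not a claim about any given year.
total = annual_allowance_cost(250, 100, bank_holiday_weeks=7)
print(f"Estimated allowance budget: £{total:,}")
```

With no bank-holiday weeks the same function gives the baseline of 52 × £350 = £18,200, so the doubling rule is what moves the estimate.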

## Notes for the requester

- **1-in-4 is the healthy maximum for sustained primary rotation.** More than that and attrition accelerates.
- **Never let "secondary" mean "whoever is also the backup DBA".** Primary and secondary must be distinct, named people on the rota, even if the same people are also consulted informally as SMEs.
- **Compensation is not a bonus — it's payment for restricted time.** Treat it as a legal obligation, documented.
- **Rota fairness is a retention issue.** Pages-per-engineer over a quarter should be within ~20% of the mean. Outliers indicate the rota or alert hygiene needs work.
- **Handoff meeting is non-negotiable.** Skipping it leads to "wait, what was that alert from last Wednesday?" during an incident.
- **Good looks like:** engineers stay on the rota willingly, swaps happen for life-reasons not rota-reasons, and the rota scales without heroics when the team grows or shrinks.
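
The fairness rule above (pages per engineer within roughly 20% of the mean) and the "more than 40% of pages" warning from the template can both be checked from a quarter's page counts. A minimal sketch, assuming the counts have already been exported from the paging tool into a plain dictionary; `fairness_report` and its parameters are hypothetical names, not part of any pager API.

```python
from statistics import mean


def fairness_report(pages_by_engineer, tolerance=0.20, max_share=0.40):
    """Flag engineers whose page load is out of line with the team's.

    Returns the team mean, the engineers more than `tolerance` away from
    that mean, and the engineers carrying more than `max_share` of all pages.
    """
    if not pages_by_engineer:
        raise ValueError("no page counts supplied")
    total = sum(pages_by_engineer.values())
    avg = mean(pages_by_engineer.values())
    outliers = sorted(
        name for name, count in pages_by_engineer.items()
        if abs(count - avg) > tolerance * avg
    )
    overloaded = sorted(
        name for name, count in pages_by_engineer.items()
        if total > 0 and count / total > max_share
    )
    return {"mean": avg, "outliers": outliers, "overloaded": overloaded}


# Illustrative quarter, not real data.
quarter = {"Eng A": 30, "Eng B": 28, "Eng C": 33, "Eng D": 69}
report = fairness_report(quarter)
print(report)
```

In this made-up quarter one engineer carries about 43% of the pages, which also drags two colleagues below the 20% band. That is the pattern the monthly review is meant to catch before it becomes a retention problem.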

---
*VantagePoint Networks · <https://www.vpnetworks.co.uk> · Authored by Hak · Free under the MIT licence*
