Tagged "SRE"

SLA, Error Budget, Uptime: Where Do Maintenance Windows Fit?

When it comes to service reliability, maintenance windows are a frequent source of ambiguity. Whether you’re defining uptime, setting SLOs, or communicating with customers, it’s essential to be explicit about how scheduled (and unscheduled) maintenance is handled. Here’s a deeper dive with actionable recommendations for SREs, engineering managers, and anyone responsible for reliability targets.
SRE SLA SLO Error Budget

Reflecting on my talk at the London SRE Meetup

I recently presented “Working Together SRE and Platform Engineering” at the London SRE Meetup. The session focused on how SRE and platform engineering complement each other: SRE sharpens reliability through automation, incident response, and error-budget-driven decisions, while platform engineering improves developer experience with self-service, standardisation, and efficient delivery pipelines.
SRE Platform Engineering Presentation

How to Write an Architecture Decision Record (ADR)

If you’ve ever debated a tech choice in Slack at 4 p.m. and re‑debated it at 4 a.m., you know why ADRs exist. An Architecture Decision Record is a short, durable note that captures a significant choice, the context that shaped it, and the consequences that follow. Done well, ADRs prevent decision drift, align teams, and make future you very grateful.
SRE
