About this role: We are looking for an experienced Service Delivery Manager to oversee our service operations. This includes managing SLAs and incident processes, designing on-call and skills coverage, creating SOPs and first-line/SRE enablement, configuration management, SLA metrics and reporting, as well as coordination between customers and our engineering teams.
Responsibilities:
* Service operations: Design and maintain an on-call and coverage plan that ensures all critical skills are available when needed. Own the incident management process for your accounts: priorities, roles, communication cadence, escalations, post-incident reviews.
* SOPs, runbooks & first-line enablement: Create SOPs, runbooks, and triage guides for SRE engineers covering common incident types and operational tasks. Train and coach first-line/SRE teams so they can confidently handle initial triage, basic troubleshooting, and clear communication, escalating only when needed.
* Configuration management & readiness: Establish a configuration management process that keeps track of each customer's environment: platforms in use, clusters, regions, configs, access, monitoring, and key contacts. Proactively close information gaps by working directly with customers and engineers, ensuring configuration information is available and trustworthy during incidents and when onboarding new engineers.
This is a hands-on role, not a pure governance role: you will be close to real incidents, engineers, and customers, bringing practices you have used successfully in previous service or managed-services environments.