Job Description:
We are seeking an experienced Service Delivery Expert to manage our service operations.
* Design and maintain on-call and coverage plans to ensure all critical skills are available when needed.
* Own the incident management process for assigned accounts, including role prioritization, communication cadence, escalations, and post-incident reviews.
Key Responsibilities:
1. Create and maintain SOPs (Standard Operating Procedures), runbooks, and triage guides for SRE (Site Reliability Engineering) teams covering common incident types and operational tasks.
2. Train first-line/SRE teams to confidently handle initial triage, basic troubleshooting, and clear communication, escalating only when necessary.
3. Continuously refine documentation based on the team's real incident experience and feedback, drawing on internal knowledge from engineers as needed.
4. Explain data analytics, cloud, and distributed systems topics in both business and technical terms, while staying closely familiar with customer configurations, environments, and infrastructure so that no information gaps arise during enhancements or incidents.
5. Act as the overall owner of service delivery. Strong technical writing skills may be needed on an ongoing basis so that the whole team maintains the integrity of its documentation.