Service Operations Expert
We provide expert support and consulting for open-source analytics and data infrastructure platforms, including Apache Druid, Apache Flink, StarRocks, and other emerging technologies. Our customers run mission-critical systems and rely on us to keep them fast, stable, and available. We're a small team working across multiple time zones (US, Brazil, Europe, India), supporting 100+ customer environments with SLAs ranging from advisory support to 24/7 incident coverage.
About the Role: We're looking for an experienced Service Operations Manager to oversee our service operations:
* SLAs and incident processes
* on-call and skills coverage
* SOPs and first-line/SRE enablement
* configuration management
* SLA metrics and reporting
* coordination between customers and our engineering teams.
This is a hands-on role where you'll be close to real incidents, engineers, and customers.
Service Operations:
Design and maintain an on-call plan that ensures all critical skills are available when needed.
Maintain key service metrics (e.g., MTTA, MTTR, SLA compliance, backlog health) and drive improvements based on them.