Observability & Event Management Engineer (Spyglass)
Remote
Contract
Key Responsibilities
* Spyglass Event Configuration and Management (Primary Focus)
  * Design and implement Event Correlation Rules within Spyglass to aggregate disparate logs and metrics into meaningful business events
  * Configure Event Policies, Suppression Rules, and Deduplication logic to reduce alert fatigue for L1/L2 support teams
  * Build and maintain Spyglass Dashboards that provide real-time visibility into the health of the FMCG application landscape (Experience Engineering, Mobile, and Emerging Tech)
  * Integrate Spyglass with diverse data sources (Azure Monitor, GCP Cloud Logging, AWS CloudWatch, and ELK/Falcon LogScale)
  * Manage the lifecycle of an event, from ingestion and normalization to correlation, alerting, and archival
* Proactive Monitoring and SRE Integration
  * Develop threshold-based and anomaly-detection alerts to identify performance degradation before it impacts the consumer experience
  * Implement Service Level Objective (SLO) monitoring within Spyglass to track the reliability of critical business paths (e.g., Order-to-Cash, Checkout flows)
  * Automate the creation of incidents in ITSM tools (e.g., ServiceNow) based on Spyglass event triggers
  * Build "Situation Rooms" or high-level executive views within Spyglass for major promotional events or seasonal peaks
* Hyperscaler Platform Connectivity
  * Ensure seamless event ingestion from hyperscaler-hosted microservices (AKS/GKE) and serverless functions
  * Support the monitoring of Integration Platforms (Boomi, APIM) by capturing and correlating middleware execution events
  * Collaborate with the Cloud Infrastructure team to monitor network latency and connectivity events across the global Azure/GCP backbone
* Senior Developer Specifics
  * Lead the architectural design of the Global Event Management Strategy
  * Define enterprise standards for event metadata, severity levels, and correlation patterns
  * Mentor junior developers in event-driven observability and AIOps principles
  * Conduct post-mortem analysis of major incidents to refine Spyglass correlation rules and prevent recurrence
Primary Skills (Must-Have)
* Spyglass Event Management: Expert-level proficiency in Spyglass configuration, rule writing, and dashboarding
* Event Correlation: Deep understanding of event patterns, deduplication, and suppression logic
* Observability Integration: Hands-on experience connecting Spyglass to Azure Monitor, GCP Cloud Logging (formerly Stackdriver), or ELK
* ITSM Integration: Proven experience integrating event platforms with ServiceNow or similar tools
* Data Normalization: Ability to parse and normalize logs and events from heterogeneous sources (JSON, XML, Syslog)
Secondary Skills (Good-to-Have)
* AIOps: Familiarity with machine-learning-based noise reduction and predictive alerting
* Scripting: Proficiency in Python or JavaScript for custom event processing or API integrations
* Cloud Native: Experience monitoring Kubernetes (AKS/GKE) environments
* FMCG Domain: Understanding of critical business hours and high-traffic windows for retail sales applications