Expert in IT Operations and Automation
We are seeking a proactive problem solver with a passion for delivering high-quality results, driving operational excellence, and fostering continuous improvement in our organization.
* Own the daily health, performance, and availability of enterprise monitoring tools such as Splunk, Dynatrace, and New Relic.
* Perform maintenance, upgrades, and configuration tuning to keep these tools performing optimally.
* Triage and resolve monitoring-related incidents and service tickets promptly.
* Collaborate with application, infrastructure, and DevOps teams to integrate monitoring solutions and improve visibility across the board.
* Develop and maintain dashboards, alerts, and reports to support operational and business needs.
* Participate in on-call rotations and support incident response efforts when needed.
* Document operational procedures, runbooks, and knowledge base articles for future reference.
* Identify and implement automation opportunities to reduce manual effort and enhance reliability.
This role demands a strong understanding of IT operations, incident management, and ticketing systems such as ServiceNow and Jira. Proficiency in scripting languages (e.g., Python, PowerShell, Bash) for automation and tool integration is essential.