Site Reliability Engineer - Cloud Infrastructure Specialist

Uberaba

beBee Careers

Posted on June 14

Description

Site Reliability Engineering (SRE) Role

This position involves creating and maintaining scalable and highly reliable software systems. The role requires a combination of software engineering principles and infrastructure management expertise.

Main Responsibilities:

* Design and implement monitoring tools to ensure continuous system reliability and performance.
* Respond promptly to emergency situations impacting system reliability, perform root cause analysis, and implement corrective actions.
* Streamline change management processes to enhance system performance and reliability.
* Collaborate with development teams to identify and resolve system-related issues and automate routine tasks.
* Ensure the scalability and reliability of systems, meeting high performance and efficiency standards.

Key Skills and Qualifications:

* Proficiency in monitoring tools such as Azure Monitor, Application Insights, Prometheus, and Grafana.
* Experience with Infrastructure as Code (Terraform, ARM/Bicep, Pulumi, etc.) and release management tooling (ArgoCD, Harness, Octopus, etc.).
* Knowledge of incident alerting tools (PagerDuty, Opsgenie) and container orchestration tools such as Kubernetes and AKS.

About the Job:

The ideal candidate will have a strong background in SRE principles, experience with DevOps practices, and excellent problem-solving skills.

Requirements:

* Strong understanding of SRE principles and best practices.
* Experience with cloud platforms (AWS, Azure, Google Cloud).
* Proficiency in programming languages like Python, Java, or C++.
* Familiarity with agile methodologies and version control systems like Git.

Benefits:

This is an exciting opportunity to work on challenging projects, collaborate with talented professionals, and grow your career in the field of Site Reliability Engineering.
