Job Opportunity: Senior DevOps Engineer
We are seeking a highly skilled Senior DevOps Engineer to join our team.
In this role, you will collaborate with software developers, product managers, test engineers, and administrators to design and develop the build, release, and deployment toolchain, and you will provide on-call support.
The ideal candidate can identify, troubleshoot, and resolve issues quickly and effectively, sometimes under pressure. Key responsibilities include capacity planning, high-availability engineering, performance tuning, and automation and tools development, as well as the following:
* Lead the design and management of highly available, scalable Kubernetes infrastructure using Amazon EKS
* Implement and manage infrastructure using Infrastructure as Code (IaC) with Terraform and Terragrunt
* Champion a Git-first approach to infrastructure and CI/CD automation
* Build and maintain robust CI/CD pipelines using Harness to streamline application delivery across environments
* Drive infrastructure automation and environment consistency across development, staging, and production
* Design and implement monitoring and observability solutions using New Relic, ensuring performance visibility and uptime
* Securely manage secrets using AWS Secrets Manager
* Support production systems through incident management, root cause analysis, and proactive reliability improvements
* Collaborate with engineering and security teams to define best practices for deployment, security, and scalability
* Mentor and guide junior DevOps team members and developers in adopting modern infrastructure practices
Required Skills and Qualifications:
* Experience in a DevOps, SRE, or Platform Engineering role supporting cloud-native applications
* Extensive experience with Kubernetes (EKS), including Helm, networking, and security
* Deep proficiency in Terraform and Terragrunt for AWS infrastructure provisioning
* Proven expertise building pipelines with Harness, Git, and associated CI/CD tooling
* Strong scripting and automation skills using Python or Golang
* Solid understanding of AWS services, including EKS, EC2, S3, IAM, RDS, VPC, Route 53, and CloudWatch
* Proficient with Linux systems, including performance tuning and troubleshooting
* Deep knowledge of networking fundamentals: DNS, TLS, HTTP/HTTPS, and load balancing
* Experience with Ansible for configuration and provisioning
* Working knowledge of MySQL and PostgreSQL
* Strong experience in monitoring, logging, and alerting, particularly using New Relic
* Experience with Kafka, RabbitMQ, or other pub/sub systems is a plus
* Demonstrated ability to support and troubleshoot distributed products