Job Title: Lead DevOps Engineer
Location: Remote
Experience Level: 10+ years

About the Role

We are seeking a highly experienced Lead DevOps Engineer to spearhead our cloud infrastructure and DevOps initiatives. In this role, you will lead a small but growing team of engineers and drive the strategic direction of our DevOps practices. The ideal candidate has a proven track record of building reliable, scalable, and secure platforms, coupled with the leadership skills to mentor engineers and align infrastructure with business goals.

Key Responsibilities

Leadership & Strategy
- Lead, mentor, and grow a DevOps engineering team, fostering a culture of automation, reliability, and continuous improvement.
- Define best practices, standards, and architectural direction for DevOps across the organization.
- Partner with engineering, security, and product teams to ensure infrastructure supports business needs.

Cloud & Infrastructure
- Design, implement, and manage large-scale cloud infrastructure (AWS, Azure, or GCP).
- Architect and maintain infrastructure as code (IaC) using tools like Terraform, Pulumi, or CloudFormation.
- Establish and enforce high-availability and disaster recovery strategies.

Automation & CI/CD
- Build and optimize CI/CD pipelines for efficient, secure, and reliable software delivery.
- Automate operational tasks, deployments, monitoring, and scaling.
- Ensure fast feedback loops and minimal downtime through advanced release strategies (blue/green, canary).

Reliability & Observability
- Implement and manage monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK/EFK, Datadog, New Relic).
- Drive service-level objectives (SLOs) and error budgets to enhance reliability.
- Perform root cause analysis and lead postmortems to prevent recurrence.

Security & Compliance
- Embed security practices into infrastructure and CI/CD pipelines (“shift-left” security).
- Ensure compliance with industry standards (ISO 27001, SOC2, HIPAA, GDPR, etc.).
- Implement secrets management, access controls, and vulnerability scanning.

Collaboration & Mentorship
- Provide technical guidance, code reviews, and hands-on support to team members.
- Collaborate cross-functionally with developers, QA, and operations teams.
- Advocate for DevOps culture, evangelizing best practices throughout the organization.

Required Qualifications
- 10+ years of experience in DevOps, Site Reliability Engineering (SRE), or Platform Engineering.
- 5+ years of leadership experience (team lead, manager, or architect role).
- Expert-level proficiency in at least one major cloud provider (AWS, Azure, or GCP).
- Strong hands-on experience with:
  - IaC: Terraform, Pulumi, or CloudFormation
  - CI/CD: Jenkins, GitHub Actions, GitLab CI, ArgoCD, Spinnaker, etc.
  - Containers & Orchestration: Docker, Kubernetes, Helm
  - Observability Tools: Prometheus, Grafana, ELK/EFK, Datadog, Splunk, New Relic
  - Security Tools: HashiCorp Vault, AWS IAM, OPA, Prisma, Aqua Security
- Proven track record of designing and maintaining large-scale, highly available, secure systems.
- Strong background in Linux/Unix systems administration and networking fundamentals.
- Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.).
- Excellent communication, leadership, and collaboration skills.

Nice to Have
- Experience with hybrid or multi-cloud environments.
- Familiarity with service meshes (Istio, Linkerd) and API gateways.
- Background in cost optimization and FinOps practices.
- Contributions to open-source DevOps or cloud-native projects.