Job Overview
The ideal candidate will have experience architecting and maintaining CI/CD pipelines for Data Lake components, including Azure Data Factory, Databricks, Functions, Synapse, Spark workloads, and storage configurations.
This role involves implementing Infrastructure-as-Code with Terraform for provisioning storage accounts, networking, compute, identity, and security layers. The successful candidate must be able to enforce branching discipline, artifact integrity, automated testing, and controlled release gates.
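One concrete piece of the release-gate and artifact-integrity work described above can be sketched as a checksum gate: the digest recorded when an artifact is built must match the digest recomputed at release time. This is an illustrative Python sketch, not a prescribed implementation; the function names and the sample artifact bytes are hypothetical.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_artifact(data: bytes, expected_digest: str) -> bool:
    """Release gate: the artifact must match the digest recorded at build time."""
    return sha256_of(data) == expected_digest

# Digest recorded when the artifact is published...
artifact = b"example build output"
recorded = sha256_of(artifact)

# ...and verified again at the release gate before deployment.
assert verify_artifact(artifact, recorded)
assert not verify_artifact(b"tampered output", recorded)
```

In a real pipeline the recorded digest would live alongside the artifact in the feed or registry; here it is kept in a local variable purely for illustration.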
We are looking for someone who can automate environment provisioning, ACL management, key rotation, lifecycle policies, and cluster configuration. Additionally, the candidate should be able to integrate DevOps processes with enterprise security controls: RBAC, managed identities, Key Vault, private networking, and encryption.
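The key-rotation automation mentioned above usually starts with knowing which keys are overdue. A minimal sketch of that scheduling logic, assuming a hypothetical inventory mapping key names to their last rotation dates and an assumed 90-day rotation policy:

```python
from datetime import datetime, timedelta

# Hypothetical inventory: key name -> last rotation timestamp.
KEY_INVENTORY = {
    "datalake-encryption-key": datetime(2024, 1, 10),
    "sas-signing-key": datetime(2024, 3, 1),
}

def keys_due_for_rotation(inventory, now, max_age_days=90):
    """Return the keys whose last rotation is older than the allowed age."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, rotated in inventory.items() if rotated < cutoff)

# As of 2024-05-01 with a 90-day policy, only the January key is overdue.
print(keys_due_for_rotation(KEY_INVENTORY, datetime(2024, 5, 1)))
# → ['datalake-encryption-key']
```

In practice the actual rotation would be delegated to the platform's key service; this sketch only shows the due-date bookkeeping that an automation job would run on a schedule.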
The successful candidate will build observability for pipelines and platform components: logging, metrics, alerting, and dashboards. They will also maintain backup, restoration, and disaster-recovery patterns and test them regularly for reliability.
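The alerting side of that observability work reduces to comparing collected metrics against thresholds. A small illustrative sketch, where the metric names and threshold values are invented for the example rather than taken from any real monitoring configuration:

```python
# Hypothetical alert rules: metric name -> maximum healthy value.
THRESHOLDS = {
    "pipeline_failure_rate": 0.05,
    "storage_latency_ms": 200,
}

def evaluate_alerts(metrics, thresholds):
    """Return an alert message for every metric exceeding its threshold."""
    return [
        f"ALERT: {name}={value} exceeds threshold {thresholds[name]}"
        for name, value in sorted(metrics.items())
        if name in thresholds and value > thresholds[name]
    ]

sample = {"pipeline_failure_rate": 0.12, "storage_latency_ms": 150}
for alert in evaluate_alerts(sample, THRESHOLDS):
    print(alert)
```

A real deployment would source the metrics from the platform's monitoring APIs and route alerts to an on-call channel; the evaluation step itself is this simple comparison.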
We require the ability to eliminate configuration drift through standardized templates and environment baselines. The candidate should also be able to maintain and optimize agents, service connections, and deployment runtimes.
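Eliminating configuration drift presupposes detecting it: comparing the live environment against the standardized baseline and reporting every divergence. A minimal sketch, assuming both states are available as flat dictionaries; the setting names and values are hypothetical:

```python
def detect_drift(baseline: dict, actual: dict) -> dict:
    """Report every setting that differs from the environment baseline."""
    drift = {}
    for key in baseline.keys() | actual.keys():
        expected, found = baseline.get(key), actual.get(key)
        if expected != found:
            drift[key] = {"expected": expected, "found": found}
    return drift

baseline = {"tls_min_version": "1.2", "public_access": False, "replication": "GRS"}
actual   = {"tls_min_version": "1.2", "public_access": True,  "replication": "GRS"}
print(detect_drift(baseline, actual))
# → {'public_access': {'expected': False, 'found': True}}
```

Drift found this way is typically remediated by re-applying the baseline template rather than by hand-editing the environment, which is what keeps the baseline authoritative.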
As part of this role, the successful candidate will perform incident response and root-cause analysis and document systemic fixes. They will deliver reusable automation modules for data engineering teams and optimize workload performance and cost within the Data Lake environment.
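One common form of the cost optimization mentioned above is tiering storage by access recency. The age bands and tier names below are an invented example policy, not a recommendation:

```python
def recommend_tier(days_since_access: int) -> str:
    """Map blob age since last access to a storage tier (illustrative policy)."""
    if days_since_access < 30:
        return "hot"
    if days_since_access < 180:
        return "cool"
    return "archive"

# A blob untouched for a year falls into the cheapest tier under this policy.
print(recommend_tier(365))
# → archive
```

In production this logic would normally be expressed as a platform lifecycle-management rule rather than application code; the sketch just makes the decision table explicit.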
The successful candidate will ensure compliance with governance, audit requirements, and data protection mandates. Finally, they will drive continuous reduction of manual operational work through automation.