Key Responsibilities
Design and implement a data warehouse instance for a major product line, drawing on deep expertise in Databricks and AWS-native data services. You will spearhead the design and build-out of a new Databricks Lakehouse instance tailored to the client's product-level data needs.
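For a sense of scope, here is a minimal sketch of bootstrapping such a product-level Lakehouse schema in PySpark; the schema and table names (product_analytics, orders) are hypothetical placeholders, not actual client objects.

```python
# Minimal sketch: bootstrapping a product-level Lakehouse schema in PySpark.
# Schema and table names (product_analytics, orders) are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Isolated schema (database) for the product line's warehouse objects.
spark.sql("CREATE SCHEMA IF NOT EXISTS product_analytics")

# A managed Delta table to anchor downstream data models.
spark.sql("""
    CREATE TABLE IF NOT EXISTS product_analytics.orders (
        order_id    STRING,
        customer_id STRING,
        amount      DOUBLE,
        order_ts    TIMESTAMP,
        order_date  DATE
    )
    USING DELTA
    PARTITIONED BY (order_date)
""")
```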
Architect and implement robust data ingestion pipelines using Spark (PySpark/Scala) and Delta Lake, integrating AWS-native services with Databricks for performance and scalability. Define data models, optimize query performance, and establish warehouse governance best practices.
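As an illustration of the ingestion work involved, below is a minimal sketch of an S3-to-Delta pipeline using Databricks Auto Loader; the bucket paths and target table name are hypothetical placeholders.

```python
# Minimal sketch of an S3-to-Delta ingestion pipeline using Databricks
# Auto Loader; bucket paths and table name are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

raw = (
    spark.readStream.format("cloudFiles")  # Databricks Auto Loader source
    .option("cloudFiles.format", "json")   # format of incoming raw files
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/orders")
    .load("s3://example-bucket/raw/orders/")
)

(
    raw.withColumn("ingested_at", F.current_timestamp())  # ingestion audit column
    .writeStream.format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/orders")
    .trigger(availableNow=True)  # process all available files, then stop
    .toTable("product_analytics.orders_bronze")
)
```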
Collaborate cross-functionally with product teams, data scientists, and DevOps engineers to streamline data workflows. Maintain CI/CD for data pipelines, preferably with Databricks' dbx tooling, following GitOps and Infrastructure-as-Code practices. Monitor data jobs and resolve performance bottlenecks and failures across environments.
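On the monitoring side, a simple health check might poll the Databricks Jobs API for recently failed runs; the sketch below assumes workspace credentials are available in the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables.

```python
# Minimal sketch of a failed-run check against the Databricks Jobs API 2.1;
# DATABRICKS_HOST and DATABRICKS_TOKEN are assumed environment variables.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]  # personal access token

resp = requests.get(
    f"{host}/api/2.1/jobs/runs/list",
    headers={"Authorization": f"Bearer {token}"},
    params={"completed_only": "true", "limit": 25},
    timeout=30,
)
resp.raise_for_status()

# Flag any recent run whose terminal state is not SUCCESS.
for run in resp.json().get("runs", []):
    state = run.get("state", {})
    if state.get("result_state") != "SUCCESS":
        print(run["run_id"], state.get("result_state"), state.get("state_message"))
```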
Core Skills: Data warehousing, Databricks, AWS-native services, data ingestion pipelines, Spark, Delta Lake, CI/CD, GitOps, Infrastructure-as-Code.
Why This Role Matters: As a key member of our team, you will design and deliver a scalable, efficient data warehouse solution that keeps pace with the evolving needs of our clients.