Job Title: Data Warehouse Architect
About This Role
The ideal candidate is a seasoned data professional who thrives in fast-paced environments and has a proven track record of delivering high-performance data platforms on Databricks.
This role combines hands-on lakehouse architecture with cross-functional collaboration: you will stand up the platform, build and govern the data pipelines that feed it, and integrate AWS-native services for performance and scalability, working closely with product teams, data scientists, and DevOps throughout. The specific responsibilities are detailed below.
Key Responsibilities
* Design and deploy a new Databricks lakehouse environment (workspace) tailored to the client's product-level data needs.
* Architect and implement robust data ingestion pipelines using Spark (PySpark/Scala) and Delta Lake (see the illustrative sketch after this list).
* Integrate AWS-native services (S3, Glue, Athena, Redshift, Lambda) with Databricks for optimized performance and scalability.
* Define data models, optimize query performance, and establish warehouse governance best practices.
* Collaborate cross-functionally with product teams, data scientists, and DevOps to streamline data workflows.
* Maintain CI/CD pipelines using GitOps and Infrastructure-as-Code.
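For candidates gauging fit, here is a minimal sketch of the kind of ingestion pipeline this role involves: a PySpark job on Databricks using Auto Loader to incrementally load JSON events from S3 into a Delta table. It is illustrative only; the bucket paths, schema fields, and table name are hypothetical placeholders, not details of the client's environment.

```python
# Illustrative Spark + Delta Lake ingestion sketch for Databricks.
# All paths, field names, and table names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks notebooks

# Incrementally ingest raw JSON events from S3 via Databricks Auto Loader
# ("cloudFiles"), which tracks already-processed files for incremental loads.
raw_events = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/events")  # hypothetical path
    .load("s3://example-bucket/raw/events/")  # hypothetical path
)

# Light transformation: stamp ingestion time and drop malformed rows.
cleaned = (
    raw_events
    .withColumn("ingested_at", F.current_timestamp())
    .dropna(subset=["event_id"])  # assumes an event_id field in the source data
)

# Write to a managed Delta table, with checkpointing for fault tolerance.
query = (
    cleaned.writeStream.format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/events")  # hypothetical path
    .trigger(availableNow=True)  # process all pending files, then stop
    .toTable("analytics.bronze_events")  # hypothetical target table
)
```

In production, a pipeline like this would typically be parameterized per source, deployed through the CI/CD and Infrastructure-as-Code workflows noted above, and governed by the warehouse standards the architect defines.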