About the Role
We're seeking a seasoned Data Architect to spearhead the design and implementation of a new lakehouse-based data warehouse for our major product line. This role involves building scalable pipelines, optimizing lakehouse performance, and integrating with diverse real-time and batch data sources across cloud services.
The ideal candidate is passionate about data architecture, thrives in fast-paced environments, and has a proven track record of standing up high-performance lakehouse platforms on strong data warehousing foundations.
-----------------------------------
Key Responsibilities
* Design and deploy a tailored Databricks Lakehouse instance that meets our product-level data needs.
* Architect and implement robust data ingestion pipelines using Spark (PySpark/Scala) and Delta Lake (see the sketch after this list).
* Integrate AWS-native services (S3, Glue, Athena, Redshift, Lambda) with Databricks for optimized performance and scalability.
* Define data models, optimize query performance, and establish warehouse governance best practices.
* Collaborate cross-functionally with product teams, data scientists, and DevOps to streamline data workflows.
* Maintain CI/CD for data pipelines (preferably with DBX), following GitOps and Infrastructure-as-Code practices.
* Monitor data jobs and resolve performance bottlenecks or failures across environments.
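
To give candidates a concrete feel for the pipeline work, here is a minimal, illustrative PySpark/Delta Lake sketch of the kind of ingestion described above. The S3 path, the event_ts column, and the bronze_events table are hypothetical placeholders, not details of our actual stack.

```python
# Illustrative batch ingestion: raw JSON from S3 into a Delta table.
# All names (bucket, columns, table) are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ingest-events").getOrCreate()

# Read raw JSON landed in S3 (a structured-streaming read is analogous).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Light normalization before persisting; assumes an event_ts field exists.
events = (
    raw.withColumn("ingested_at", F.current_timestamp())
       .withColumn("event_date", F.to_date("event_ts"))
)

# Append to a Delta table, partitioned for downstream query pruning.
(events.write
    .format("delta")
    .mode("append")
    .partitionBy("event_date")
    .saveAsTable("bronze_events"))
```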
-----------------------------------
Required Skills & Qualifications
* Databricks / Lakehouse Architecture: End-to-end setup of Databricks workspaces and Unity Catalog; expertise in Delta Lake internals, file compaction, and schema enforcement (illustrated in the first sketch after this list); advanced PySpark/SQL skills for ETL and transformations.
* AWS Native Integration: Deep experience with AWS Glue, S3, Redshift Spectrum, Lambda, and Athena; IAM and VPC configuration knowledge for secure cloud integrations.
* Data Warehousing & Modeling: Strong grasp of modern dimensional modeling (star/snowflake schemas); experience setting up lakehouse design patterns for mixed workloads.
* Automation & DevOps: Familiarity with CI/CD for data engineering using tools like DBX, Terraform, GitHub Actions, or Azure DevOps; proficient in monitoring tools like CloudWatch, Datadog, or New Relic for data pipelines.
-----------------------------------
Bonus/Nice to Have
* Experience supporting gaming or real-time analytics workloads.
* Familiarity with Airflow, Kafka, or EventBridge.
* Exposure to data privacy and compliance practices (GDPR, CCPA).
-----------------------------------
Other Details
* Location: Latin America (LATAM) region - Remote.
* Length: 1+ year.