Job Summary
The ideal candidate will have expertise in BigQuery, Dataflow (Apache Beam), Cloud Storage, and Pub/Sub, along with proficiency in SQL, Oracle Database, and PostgreSQL. They should also have working knowledge of orchestration with Cloud Composer (Airflow) and hands-on experience applying CI/CD to data pipelines (Git, Terraform).
We are seeking a highly skilled professional to design and implement scalable, high-performance ETL/ELT pipelines; ensure end-to-end data quality, integrity, and security; create and maintain data models aligned with business needs; and collaborate with data scientists, analysts, and software engineers to support advanced analytics and machine learning use cases.
Key responsibilities include automating data ingestion, transformation, and delivery processes; monitoring and optimizing the cost and performance of GCP resources; implementing DataOps and Data Governance best practices; and developing a deep understanding of distributed architectures and Data Lake layers.
Additionally, the ideal candidate will bring experience with cloud cost and performance optimization, relevant GCP certifications, knowledge of Kubernetes (GKE) and APIs on GCP, and prior involvement with machine learning pipelines (Vertex AI, AI Platform). Strong communication skills are essential for this role, as the successful candidate will work closely with cross-functional teams to drive business outcomes.