We're seeking a seasoned data architect to design and implement scalable, high-performance ETL/ELT pipelines on Google Cloud Platform (GCP).
The ideal candidate will bring:

* Hands-on experience with BigQuery, Dataflow (Apache Beam), Cloud Storage, and Pub/Sub
* A strong understanding of orchestration with Cloud Composer (Airflow)
* Proficiency in CI/CD applied to data pipelines (Git, Terraform)
* Knowledge of cloud cost and performance optimization
* Relevant GCP certifications
* Expertise in Kubernetes (GKE) and APIs on GCP

Experience with machine learning pipelines (Vertex AI, AI Platform) is highly desirable.
The successful candidate will lead the development of efficient, secure, and scalable data architectures on GCP, ensuring seamless integration across services. They will collaborate closely with cross-functional teams to design and implement advanced analytics and machine learning solutions that drive business growth and improve decision-making.
Key responsibilities include:
* Designing and implementing data architectures on GCP using services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, and Composer
* Developing and optimizing scalable, high-performance ETL/ELT pipelines
* Ensuring data quality, integrity, and security end-to-end
* Maintaining data models aligned with business needs
* Collaborating with data scientists, analysts, and software engineers to support advanced analytics and machine learning use cases
* Automating data ingestion, transformation, and delivery processes
* Monitoring and optimizing cost and performance of GCP resources
* Implementing DataOps and data governance best practices