Job Overview
Develop a data-driven, efficient, and scalable AWS Serverless DataLake architecture that supports a variety of data processing workflows.
Implement data ingestion pipelines and integration processes, ensuring seamless data transfer from multiple sources into the DataLake.
Employ data transformation and enrichment techniques using AWS Lambda, Glue, or similar technologies to maintain data quality and consistency.
Collaborate with data scientists and analysts to understand their requirements and design appropriate data models and schemas in the DataLake.
Optimize data storage and retrieval mechanisms using AWS services such as S3, Athena, Redshift, or DynamoDB for high-performance access.
Proactively monitor and troubleshoot the DataLake infrastructure, resolving performance issues and data processing errors.
Stay up-to-date with emerging AWS services and technologies to enhance the DataLake architecture, improve efficiency, and drive innovation.
Mentor junior engineers, promoting their growth and adherence to best practices.
Work collaboratively with cross-functional teams to understand business needs, prioritize tasks, and deliver high-quality solutions within defined timelines.
Key Responsibilities
* Design and implement data ingestion and integration pipelines
* Employ data transformation and enrichment techniques
* Collaborate with data scientists and analysts
* Optimize data storage and retrieval mechanisms
* Monitor and troubleshoot the DataLake infrastructure
* Stay up-to-date with emerging AWS services and technologies
* Mentor junior engineers
* Work collaboratively with cross-functional teams
Requirements
* Proven experience in designing and implementing data architectures
* Expertise in AWS services, including Lambda, Glue, and S3
* Strong understanding of data transformation and enrichment techniques
* Ability to collaborate effectively with data scientists and analysts
* Experience with monitoring and troubleshooting complex systems
* Excellent communication and mentoring skills