Job Overview
As a Data Engineer, you will play a pivotal role in designing and implementing scalable data processing systems built on AWS technologies and serverless architectures. Your expertise will be instrumental in building efficient, reliable data pipelines that support a range of downstream workflows.
Key Responsibilities:
* Design and implement an AWS Serverless DataLake architecture that efficiently handles large volumes of data from a variety of sources.
* Develop robust data ingestion pipelines and integration processes that reliably move data from those sources into the DataLake.
* Implement data transformation and enrichment processes using AWS Lambda, Glue, or similar serverless technologies to maintain data quality and consistency.
* Collaborate with data scientists and analysts to understand their data requirements and design appropriate data models and schemas in the DataLake.
* Optimize data storage and retrieval mechanisms, leveraging AWS services such as S3, Athena, Redshift, or DynamoDB, to provide high-performance access to the data.
* Monitor and troubleshoot the DataLake infrastructure, identifying and resolving performance bottlenecks, data processing errors, and other issues.
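The transformation and partitioned-storage responsibilities above can be sketched in plain Python. This is an illustrative example only, not part of the role description: the record fields, dataset name, and Hive-style partition layout are assumptions, and the per-record logic stands in for the kind of transform that would run inside an AWS Lambda function or Glue job.

```python
from datetime import datetime, timezone


def enrich_record(record: dict) -> dict:
    """Validate a raw event and add ingestion metadata.

    Mimics a per-record transform as it might run in Lambda or Glue;
    the field names (event_id, payload) are hypothetical.
    """
    if "event_id" not in record or "payload" not in record:
        raise ValueError(f"malformed record: {record}")
    enriched = dict(record)
    # Stamp the record so downstream consumers can audit freshness.
    enriched["ingested_at"] = datetime.now(timezone.utc).isoformat()
    return enriched


def partition_key(dataset: str, when: datetime) -> str:
    """Build a Hive-style S3 prefix (year=/month=/day=) so query
    engines such as Athena can prune partitions at read time."""
    return (
        f"{dataset}/year={when.year:04d}/"
        f"month={when.month:02d}/day={when.day:02d}/"
    )


# Example: enrich a record and compute its DataLake partition prefix.
rec = enrich_record({"event_id": "e-1", "payload": {"value": 42}})
key = partition_key("clickstream", datetime(2024, 5, 7, tzinfo=timezone.utc))
print(key)  # clickstream/year=2024/month=05/day=07/
```

In a real deployment the enriched records would be written under that prefix in S3 (for example via boto3 or a Glue DynamicFrame sink); keeping the partition layout consistent is what lets Athena and Redshift Spectrum skip irrelevant data.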
Requirements:
* Extensive experience (5+ years) working as a Data Engineer with a strong focus on AWS technologies and serverless architectures.
* In-depth knowledge of AWS services such as S3, Lambda, Glue, Athena, Redshift, and DynamoDB.
* Strong programming skills in languages like Python, Java, or Scala, along with experience using SQL for data manipulation and querying.
* Hands-on experience with data integration and ETL tools, such as AWS Glue or Apache Spark, for transforming and processing data.
* English language proficiency.