Seeking a skilled Data Engineer to build robust data pipelines and ML infrastructure in a fast-paced fintech environment.
Key responsibilities include designing, developing, and maintaining scalable data pipelines in Python, Airflow, and PySpark to process large volumes of financial transaction data. You will also implement and optimize MLOps infrastructure on AWS to automate the full machine learning lifecycle, from development through production.
The ideal candidate will have 3-5 years of experience in data engineering with a focus on MLOps in production environments. Strong proficiency in Python and in data processing frameworks such as PySpark is required. Experience with workflow orchestration tools, particularly Airflow, and hands-on experience with AWS services, especially SageMaker, Lambda, and S3, are highly desirable.
A working knowledge of machine learning model deployment and monitoring in production, as well as experience with data modeling and database systems (SQL and NoSQL), is essential. Familiarity with containerization (Docker) and CI/CD pipelines is also beneficial. The successful candidate will have excellent problem-solving skills and thrive in a dynamic, growth-oriented environment.
* Design, develop, and maintain scalable data pipelines using Python, Airflow, and PySpark (see the pipeline sketch after this list).
* Implement and optimize MLOps infrastructure on AWS to automate the full machine learning lifecycle.
* Build and maintain deployment pipelines for ML models using SageMaker and other AWS services (see the deployment sketch after this list).
* Collaborate with data scientists and business stakeholders to implement machine learning solutions for fraud detection, risk assessment, and financial forecasting.
* Ensure data quality, reliability, and security across all data engineering workloads.
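
For illustration only, here is a minimal sketch of the kind of Airflow DAG this role would own: a daily job that submits a PySpark application to process raw transaction data. The DAG name, script path, and connection ID are hypothetical, and an Airflow 2.4+ installation with the Apache Spark provider is assumed.

```python
# Minimal sketch: daily Airflow DAG that submits a PySpark job.
# All names, paths, and connection IDs below are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="transactions_daily_etl",      # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                    # assumes Airflow 2.4+ (use schedule_interval on older versions)
    catchup=False,
    default_args=default_args,
) as dag:
    # Submit a PySpark application that cleans and aggregates raw transactions.
    process_transactions = SparkSubmitOperator(
        task_id="process_transactions",
        application="jobs/process_transactions.py",   # hypothetical PySpark script path
        conn_id="spark_default",
        application_args=["--run-date", "{{ ds }}"],  # pass the logical run date to the job
    )
```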
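Similarly, a minimal sketch of deploying a trained model artifact to a SageMaker real-time endpoint using the SageMaker Python SDK. The ECR image URI, S3 artifact path, IAM role, and endpoint name are all hypothetical placeholders.

```python
# Minimal sketch: deploy a trained model artifact to a SageMaker endpoint.
# The image URI, S3 path, role ARN, and endpoint name are hypothetical.
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()

model = Model(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/fraud-model:latest",  # hypothetical ECR image
    model_data="s3://example-ml-artifacts/fraud-model/model.tar.gz",              # hypothetical S3 artifact
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",                 # hypothetical IAM role
    sagemaker_session=session,
)

# Create a real-time inference endpoint backed by one instance.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="fraud-detection-endpoint",  # hypothetical endpoint name
)
```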