An exciting opportunity is available for a Data Engineer to join a collaborative environment and contribute to the development of data infrastructure. The successful candidate will be responsible for designing, building, and maintaining large-scale data systems, including data pipelines, data warehouses, and data lakes.
The ideal candidate will work closely with data architects, data scientists, and other stakeholders to ensure that data systems meet the needs of the business. This is a fully remote opportunity with the potential to become a permanent position.
Key Responsibilities:
* Design and build scalable automated testing solutions using Ruby/Selenium-based frameworks.
* Develop and maintain data pipelines using tools such as Apache Beam, Apache Spark, and AWS Glue.
* Design and implement data warehouses using tools like Amazon Redshift, Google BigQuery, and Snowflake.
* Work with data architects to design and implement data models and data architectures.
* Collaborate with data scientists to develop and deploy machine learning models and data products.
* Ensure data quality and integrity by developing and implementing data validation and data cleansing processes.
* Stay up-to-date with new technologies and trends in data engineering and make recommendations for adoption.
Qualifications:
* 5+ years of experience in data engineering or a related field.
* 2-4 years of experience with the Ruby on Rails framework.
* Strong experience with programming languages such as Python, Java, and Scala.
* 3+ years of experience with data modeling and data architecture.
* Strong experience with data engineering tools such as Apache Beam, Apache Spark, AWS Glue, Amazon Redshift, Google BigQuery, and Snowflake.
* Experience with cloud-based data platforms such as AWS, GCP, or Azure.
* Experience with containerization using Docker and Kubernetes.
* Bachelor's degree in Computer Science, Engineering, or a related field.
Additional Requirements:
* Excellent collaboration and communication skills.
* Ability to work independently and manage multiple tasks efficiently.