Job Summary
This is an exciting opportunity for a skilled Data Engineer to join our team and contribute to designing, building, and maintaining large-scale data systems. The successful candidate will be responsible for designing and implementing data pipelines, data warehouses, and data lakes using tools such as Apache Beam, Apache Spark, and AWS Glue. They will also work closely with data architects and other stakeholders to ensure that our data systems meet the needs of the business.
About the Role
The Data Engineer will be responsible for:
* Designing and implementing scalable automated testing solutions using Ruby/Selenium-based frameworks.
* Developing and maintaining data pipelines using tools such as Apache Beam, Apache Spark, and AWS Glue.
* Designing and implementing data warehouses using tools such as Amazon Redshift, Google BigQuery, and Snowflake.
* Developing and maintaining data lakes using tools such as Apache Hadoop, Apache Spark, and Amazon S3.
* Working with data architects to design and implement data models and data architectures.
* Collaborating with data scientists to develop and deploy machine learning models and data products.
* Ensuring data quality and integrity by developing and implementing data validation and data cleansing processes.
* Collaborating with other teams to ensure that data systems meet the business's needs.
Requirements
To be considered for this role, you will need to have:
* 5+ years of experience in data engineering or a related field.
* 2-4 years of experience with Ruby, including the Ruby on Rails framework.
* 5+ years of experience with programming languages such as Python, Java, or Scala.
* 3+ years of experience with data modeling and data architecture.
* 3+ years of experience with data engineering tools such as Apache Beam, Apache Spark, AWS Glue, Amazon Redshift, Google BigQuery, and Snowflake.
* Strong experience with data warehousing and data lakes.
* Strong experience with data validation and data cleansing.
* Strong collaboration and communication skills.
* Bachelor's degree in Computer Science, Engineering, or a related field.
Nice to Have
It would be beneficial if you have experience with:
* Machine learning and data science.
* Cloud-based data platforms such as AWS, GCP, or Azure.
* Containerization using Docker and Kubernetes.
* Agile development methodologies such as Scrum or Kanban.
* Data governance and data security.
What We Offer
We offer a fully remote role with the potential to become a permanent position. As a Data Engineer, you will work in a collaborative environment and contribute to the growth and success of our organization.