Exciting Data Engineering Opportunity
We are seeking an experienced Data Engineer to join our team and contribute to the design, development, and maintenance of large-scale data systems.
This is a fully remote opportunity with the potential to become a permanent position. As a Data Engineer, you will work closely with data architects, data scientists, and other stakeholders to ensure that our data systems meet the needs of the business.
Main Responsibilities:
* Data System Design and Development: Design, build, and maintain large-scale data systems, including data pipelines, data warehouses, and data lakes.
* Data Warehousing and Lakes: Design and implement data warehouses using tools such as Amazon Redshift, Google BigQuery, and Snowflake, as well as develop and maintain data lakes using tools like Apache Hadoop, Apache Spark, and Amazon S3.
* Automated Testing Solutions: Design and implement scalable automated testing solutions using Ruby- and Selenium-based frameworks.
* Data Pipelines: Develop and maintain data pipelines using tools such as Apache Beam, Apache Spark, and AWS Glue.
* Data Quality and Integrity: Ensure data quality and integrity by developing and implementing data validation and data cleansing processes.
* Collaboration: Collaborate with data architects, data scientists, and other teams to ensure that data systems meet the needs of the business.
Qualifications:
* Experience: 5+ years of experience in data engineering or a related field.
* Ruby Experience: 2-4 years of experience with Ruby, including the Ruby on Rails framework.
* Programming Languages: 5+ years of experience with programming languages such as Python, Java, and Scala.
* Data Modeling and Architecture: 3+ years of experience with data modeling and data architecture.
* Data Engineering Tools: 3+ years of experience with data engineering tools such as Apache Beam, Apache Spark, AWS Glue, Amazon Redshift, Google BigQuery, and Snowflake.
* Education: Bachelor's degree in Computer Science, Engineering, or a related field.
Nice to Have:
* Machine Learning and Data Science: Experience with machine learning and data science.
* Cloud-Based Data Platforms: Experience with cloud-based data platforms such as AWS, GCP, or Azure.
* Containerization: Experience with containerization using Docker and Kubernetes.
* Agile Development Methodologies: Experience with agile development methodologies such as Scrum or Kanban.
* Data Governance and Security: Experience with data governance and data security.