Big Data Architect
The ideal candidate will be responsible for designing, building, and maintaining data systems, including data pipelines, data warehouses, and data lakes.
Main Responsibilities:
* Designing and implementing large-scale data systems
* Implementing data warehouses using Amazon Redshift, Google BigQuery, and Snowflake
* Designing and implementing scalable automated testing solutions using Ruby/Selenium-based frameworks
* Developing and maintaining data pipelines using Apache Beam, Apache Spark, and AWS Glue
* Developing and maintaining data lakes using Apache Hadoop, Apache Spark, and Amazon S3
* Working with data architects to design and implement data models and data architectures
* Collaborating with data scientists to develop and deploy machine learning models and data products
* Ensuring data quality and integrity by developing and implementing data validation and data cleansing processes
Requirements:
* Bachelor's degree in Computer Science, Engineering, or a related field
* 5+ years of experience in data engineering or a related field
* 2-4 years of experience with Ruby, including the Ruby on Rails framework
* 5+ years of experience with programming languages such as Python, Java, and Scala
* 3+ years of experience with data modeling and data architecture
* 3+ years of experience with data engineering tools such as Apache Beam, Apache Spark, AWS Glue, Amazon Redshift, Google BigQuery, and Snowflake
* Strong experience with data warehousing and data lakes
* Strong collaboration and communication skills
Nice to Have:
* Experience with machine learning and data science
* Experience with cloud-based data platforms such as AWS, GCP, or Azure
* Experience with containerization using Docker and Kubernetes
* Experience with agile development methodologies such as Scrum or Kanban
* Experience with data governance and data security
About the Opportunity:
This is a fully remote opportunity with the potential to become a permanent position. The successful candidate will work closely with data architects, data scientists, and other stakeholders to ensure that our data systems meet the needs of our business.