Key Responsibilities:
* Design and build large-scale data systems that process data efficiently and reliably.
* Implement data warehouses using tools such as Amazon Redshift, Google BigQuery, and Snowflake to manage complex data sets.
* Develop scalable automated testing solutions using Ruby/Selenium-based frameworks to help ensure data quality.
* Create and maintain data pipelines using tools such as Apache Beam, Apache Spark, and AWS Glue to optimize data flow.
* Develop and maintain data lakes using tools such as Apache Hadoop, Apache Spark, and Amazon S3 to store and process large datasets.
* Collaborate with data architects to design and implement data models and data architectures that meet business needs.
* Work with data scientists to develop and deploy machine learning models and data products that drive business growth.
* Ensure data quality and integrity by implementing data validation and cleansing processes to prevent data loss or corruption.
* Communicate effectively with other teams to ensure that data systems meet the business's requirements.