Core Skills & Knowledge Areas

Python
Data manipulation: Proficiency with libraries like pandas, numpy, and pyarrow.
Scripting & automation: Writing reusable, modular scripts for data ingestion and transformation.
APIs: Consuming and building RESTful APIs for data exchange.
Testing: Unit testing with pytest or unittest.

Cloud Platforms
AWS / Azure / GCP: Familiarity with services like:
  AWS: S3, Lambda, Glue, Redshift, EMR
  Azure: Data Factory, Blob Storage, Synapse
  GCP: BigQuery, Cloud Functions, Dataflow
Infrastructure as Code (IaC): Tools like Terraform or CloudFormation.
Security & IAM: Managing access and permissions.

Back-End Development
Databases: SQL (PostgreSQL, MySQL), NoSQL (MongoDB, DynamoDB).
APIs: Building data services using frameworks like Flask, FastAPI, or Django.
CI/CD: Familiarity with Git, Docker, Jenkins, or GitHub Actions.

ETL / ELT Pipelines
Pipeline orchestration: Tools like Apache Airflow, Prefect, or Luigi.
Data transformation: Using SQL, dbt, or Python scripts.
Batch vs Streaming: Understanding of Kafka, Spark Streaming, or Flink.
Monitoring & Logging: Ensuring data quality and pipeline reliability.

Tools & Technologies
Programming: Python, SQL.
Cloud: AWS, Azure, GCP.
Orchestration: Airflow, Prefect.
Databases: PostgreSQL, BigQuery, Redshift.
Data Lakes: S3, Azure Data Lake.
Containers: Docker, Kubernetes.
Version Control: Git, GitHub/GitLab.

✅ Soft Skills & Other Requirements
Problem-solving: Ability to debug and optimize data workflows.
Teamwork: Collaborating with Data Scientists, Analysts, and DevOps.

Some of our perks
Fresh fruit sometimes, spoiled fruit all the time.
You can work from anywhere, including your home.
Flexible hours.
Team lunches, Bday celebrations, happy hours.
Wellness program and company retreats.
English lessons.
Courses and training.