🚀 Data Engineer (Databricks & AWS)
📍 Remote (Latin America / Europe) | 🕐 9 AM - 5 PM EST | 💼 Full-time

At CloudGeometry, we partner with industry leaders like AWS, Google, and Databricks to deliver cutting-edge cloud-native solutions. We are looking for a Senior Data Engineer to join our flagship project: a modern Data Platform for the life sciences industry, supporting global leaders like Pfizer, Moderna, and Novartis in developing innovative RNA-based solutions using cloud computing and advanced AI.

If you are an experienced data engineer who thrives in high-impact environments, prefers working with zero legacy systems, and wants to play a key technical leadership role in building scalable lakehouse architectures, let's talk!

🎯 Key Responsibilities
- Pipeline Engineering: Design, develop, and optimize high-performance ETL pipelines within Databricks to connect analytics-ready data back to operational services.
- Architecture Leadership: Lead technical architecture discussions with engineers, product managers, and data scientists to implement advanced analytics.
- Workflow Optimization: Build, fine-tune, and monitor Databricks workflows to ensure system reliability, performance, and data integrity.
- Data Quality & Security: Collaborate with ML teams to ensure secure, rigorous, and accurate data ingestion across all processing stages.
- Agile Execution: Actively participate in daily Scrum ceremonies within a globally distributed engineering team.

🛠️ Technical Requirements & Stack

1. Core Data Engineering
- Databricks Ecosystem: 2+ years of hands-on experience (Delta tables/Iceberg, Spark jobs, MLflow, Unity Catalog, Model Registry).
- Architecture: Expert-level understanding of modern Lakehouse architectural design principles.
- Languages: Expert-level Python (for data processing/ETL) AND TypeScript / Node.js (for backend services using HapiJS, Zod, and Jest).

2. Cloud Infrastructure & DevOps (AWS)
- Compute & Storage: ECS (Fargate/EC2), Lambda, S3, and Athena.
- Messaging & Orchestration: SQS/SNS and Airflow.
- DevOps & CI/CD: GitHub Actions, CodeBuild, Docker, and repository templates via Cruft.

3. Data Stores & MLOps
- Databases: PostgreSQL (ACID/Migrations), DynamoDB (High-scale Key-Value), and Redis (Caching/Rate limiting).
- Search: OpenSearch / Elasticsearch for full-text search and aggregations.
- GenAI: Practical knowledge of LLMs, agents, function calling, and RAG architectures.

📋 Qualifications
- Experience: 5+ years in software development with a strong focus on data engineering/analytics teams.
- Senior Autonomy: Proven ability to challenge decisions, propose architectural improvements, and deliver complex features end-to-end.
- Communication: Exceptional English skills (written and spoken) to articulate complex data ideas to global stakeholders.
- Availability: Required online presence from 9 AM to 5 PM EST.

⭐ Nice to Have:
- Professional Databricks or AWS certifications.
- Experience building internal SDKs or developer experience tooling.
- Experience working directly alongside Data Scientists and ML Developers.

🎁 What We Offer (Our Commitment to You)
- 💰 Comprehensive compensation and benefits package.
- ⚡ Zero legacy systems – work exclusively with cutting-edge technologies.
- 📚 Continuous Learning: Extensive training, certifications, hackathons, and Udemy access.
- 🛠️ Premium Tooling: Developer Pro access to Claude Code, Codex, and AntiGravity.
- 🌍 Top-tier Culture: A collaborative, supportive environment with global experts.

📩 Ready to build the future of Life Sciences?
Click Apply or send us your resume. Let's build something massive together!

#DataEngineering #Databricks #AWS #Python #TypeScript #RemoteJobs #Lakehouse