Project Description:
We are seeking a skilled and hands-on Data Engineer with proven experience in Databricks, Apache Kafka, and real-time data streaming solutions.
Responsibilities:
- Design and implement scalable data pipelines using Databricks and Kafka
- Build and maintain real-time streaming solutions for high-volume data
- Collaborate with cross-functional teams to integrate data flows into broader systems
- Optimize performance and reliability of data processing workflows
- Ensure data quality, lineage, and compliance across streaming and batch pipelines
- Participate in agile development processes and contribute to technical documentation
Mandatory Skills Description:
- 5+ years of experience in data engineering roles
- Proven expertise with Databricks (Spark, Delta Lake, notebooks, performance tuning)
- Strong hands-on experience with Apache Kafka (topics, producers/consumers, schema registry)
- Solid understanding of streaming frameworks (e.g., Spark Structured Streaming, Flink, or similar)
- Experience with cloud platforms (AWS, Azure, or GCP)
- Proficiency in Python or Scala for data pipeline development
- Familiarity with CI/CD pipelines (GitLab, Jenkins) and agile tools (Jira)
- Exposure to data lakehouse architectures and best practices
- Knowledge of data governance, security, and observability
Languages:
- English: C1 Advanced