Data Quality Engineer
We are seeking a highly skilled Data Quality Engineer to join our team. In this role, you will rebuild five data quality scorecards using SAP ECC data available in the Lakehouse.
Your primary responsibilities will be to design and validate profiling logic, build rule-based data quality checks in PySpark, generate field-level and row-level results, and publish business-facing scorecards in Power BI. You will also define reusable templates, naming conventions, and repeatable processes that support future scorecard expansion and help transition the organization away from Informatica IDQ.
Responsibilities:
* Rebuild data quality scorecards in Databricks
* Develop profiling logic (null, distinct, and pattern checks)
* Build PySpark-based data quality rules and row/column-level metrics
* Create curated data quality datasets for Power BI scorecards
* Establish reusable data quality rule templates and standardized development patterns
* Work with SAP ECC data models
* Support and mentor a junior developer on rule logic and development standards
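To give candidates a sense of the work, the rule-based checks and field-level pass rates described above might look like the following. This is a simplified, hypothetical sketch in plain Python; in the role itself, the same logic would be implemented against PySpark DataFrames in Databricks, and the SAP field names (MATNR, MEINS) and rule definitions here are illustrative assumptions, not the actual rule set.

```python
# Hypothetical sketch of metadata-driven data quality rules.
# Field names (MATNR, MEINS) and rules are illustrative only;
# production logic would run on PySpark DataFrames, not Python lists.
from dataclasses import dataclass
from typing import Any, Callable
import re

@dataclass
class Rule:
    name: str                      # rule identifier, e.g. "MATNR_not_null"
    field: str                     # column the rule applies to
    check: Callable[[Any], bool]   # returns True when a value passes

def evaluate(rows: list[dict], rules: list[Rule]) -> dict[str, float]:
    """Return the pass rate per rule (a field-level metric) over the rows."""
    results: dict[str, float] = {}
    for rule in rules:
        passed = sum(1 for row in rows if rule.check(row.get(rule.field)))
        results[rule.name] = passed / len(rows) if rows else 0.0
    return results

# Illustrative SAP-style material records (values are made up)
rows = [
    {"MATNR": "000000000000000123", "MEINS": "EA"},
    {"MATNR": None,                 "MEINS": "KG"},
    {"MATNR": "000000000000000456", "MEINS": "??"},
]

rules = [
    Rule("MATNR_not_null", "MATNR", lambda v: v is not None),
    Rule("MATNR_18_digits", "MATNR",
         lambda v: bool(v) and bool(re.fullmatch(r"\d{18}", v))),
    Rule("MEINS_valid_uom", "MEINS", lambda v: v in {"EA", "KG", "L"}),
]

print(evaluate(rows, rules))
```

In practice, the rule definitions would live in a metadata table rather than in code, so new checks can be added without redeploying the pipeline, and the per-rule pass rates would feed the curated datasets behind the Power BI scorecards.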
Qualifications:
* Strong Databricks engineering experience (PySpark, SQL, Delta Lake)
* Hands-on experience building data quality rules, frameworks, or scorecards
* Experience in profiling large datasets and implementing metadata-driven data quality logic
* Ability to mentor, review code, and explain concepts clearly
* Excellent communication skills in English
* Familiarity with SAP ECC tables and key fields (preferred)
* Experience with Unity Catalog or Purview (nice to have)
* Exposure to Lakehouse Monitoring or DQX accelerators (bonus)
If you are passionate about data quality, skilled in Databricks and PySpark, and enjoy building reusable data quality capabilities, we encourage you to apply for this exciting opportunity.