Job Description:
Data Quality Scorecards in Databricks
* Design and validate profiling logic to rebuild five Data Quality Scorecards using SAP ECC data.
* Build rule-based data quality checks in PySpark that generate field-level and row-level results, and publish business-facing scorecards in Power BI.
Responsibilities:
* Rebuild Data Quality scorecards in Databricks using PySpark and SQL.
* Develop profiling logic for null, distinct-count, and pattern checks, and create curated DQ datasets for Power BI scorecards.
* Establish reusable DQ rule templates and standardized development patterns, working with SAP ECC data models.
* Support and mentor a junior developer on rule logic and development standards.
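To illustrate the kind of metadata-driven DQ logic the role involves, here is a minimal pure-Python sketch (no Spark dependency): rules are defined as metadata, evaluated per row, and aggregated into field-level pass rates of the sort a scorecard would surface. All rule names, fields, and values are hypothetical; a production version would express the same pattern over PySpark DataFrames.

```python
import re

# Hypothetical metadata-driven rule definitions (field names are illustrative,
# not actual SAP ECC fields).
RULES = [
    {"field": "material_id", "check": "not_null"},
    {"field": "material_id", "check": "pattern", "regex": r"^MAT-\d{6}$"},
    {"field": "plant", "check": "not_null"},
]

def evaluate_row(row, rules):
    """Return row-level pass/fail results: one result per rule."""
    results = []
    for rule in rules:
        value = row.get(rule["field"])
        if rule["check"] == "not_null":
            passed = value is not None and value != ""
        elif rule["check"] == "pattern":
            passed = value is not None and re.fullmatch(rule["regex"], value) is not None
        else:
            raise ValueError(f"Unknown check type: {rule['check']}")
        results.append({"field": rule["field"], "check": rule["check"], "passed": passed})
    return results

def scorecard(rows, rules):
    """Aggregate row-level results into field-level pass rates."""
    totals = {}  # (field, check) -> (passed_count, total_count)
    for row in rows:
        for r in evaluate_row(row, rules):
            key = (r["field"], r["check"])
            ok, n = totals.get(key, (0, 0))
            totals[key] = (ok + r["passed"], n + 1)
    return {key: ok / n for key, (ok, n) in totals.items()}

rows = [
    {"material_id": "MAT-000123", "plant": "1000"},
    {"material_id": "badvalue", "plant": None},
]
print(scorecard(rows, RULES))
# → {('material_id', 'not_null'): 1.0, ('material_id', 'pattern'): 0.5, ('plant', 'not_null'): 0.5}
```

Keeping rules as data rather than code is what makes the templates reusable: new checks are added by extending the rule list, not by rewriting the evaluation logic.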
Qualifications:
* Strong Databricks engineering experience (PySpark, SQL, Delta Lake).
* Hands-on experience building Data Quality rules, frameworks, or scorecards, with expertise in profiling large datasets and implementing metadata-driven DQ logic.
* Ability to mentor, review code, and explain concepts clearly, with excellent communication skills in English.
* Familiarity with SAP ECC tables and key fields, with experience in Unity Catalog or Purview and exposure to Lakehouse Monitoring or DQX accelerators.