Design, build, and maintain high-performance data processing pipelines using Pandas and Polars. Develop and expose RESTful APIs using FastAPI or similar frameworks. Consume and process normalized Parquet files from multiple upstream sources to generate dynamic Excel reports.
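A minimal sketch of this pipeline shape in Polars, assuming hypothetical file paths and an illustrative `region`/`amount` schema (Excel output requires the `xlsxwriter` package); it is not the team's actual codebase:

```python
# Sketch only: paths, column names, and the aggregation are assumptions.
import polars as pl

def build_report(parquet_paths: list[str], output_path: str) -> None:
    # Consume normalized Parquet files from multiple upstream sources.
    frames = [pl.read_parquet(path) for path in parquet_paths]
    combined = pl.concat(frames)

    # Example transformation: total a hypothetical "amount" column per region.
    summary = (
        combined
        .group_by("region")
        .agg(pl.col("amount").sum().alias("total_amount"))
        .sort("region")
    )

    # Emit the Excel report (Polars delegates to xlsxwriter here).
    summary.write_excel(output_path, worksheet="Summary")

if __name__ == "__main__":
    build_report(["sales.parquet", "returns.parquet"], "report.xlsx")
```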
Key Responsibilities:
* Optimize report generation logic for speed and scalability.
* Integrate with messaging and storage mechanisms (e.g., Azure Service Bus, Storage Accounts); a receive-side sketch follows this list.
* Collaborate on infrastructure-as-code automation using Bicep (or similar IaC tools).
* Participate in design discussions for future migration to Snowflake and/or a data lake architecture.
* Contribute to CI/CD pipelines using GitHub Actions.
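As a rough illustration of the messaging integration mentioned above, here is a hedged sketch using the `azure-servicebus` SDK; the connection-string environment variable and queue name are placeholder assumptions:

```python
# Sketch only: SERVICEBUS_CONNECTION_STRING and "report-requests" are assumed names.
import os
from azure.servicebus import ServiceBusClient

CONN_STR = os.environ["SERVICEBUS_CONNECTION_STRING"]
QUEUE_NAME = "report-requests"  # hypothetical queue

def drain_queue() -> None:
    # Receive a batch of messages and settle each one after handling it.
    with ServiceBusClient.from_connection_string(CONN_STR) as client:
        with client.get_queue_receiver(queue_name=QUEUE_NAME) as receiver:
            for message in receiver.receive_messages(max_message_count=10, max_wait_time=5):
                print("Received:", str(message))
                receiver.complete_message(message)  # remove from the queue

if __name__ == "__main__":
    drain_queue()
```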
Required Skills and Experience:
* Strong proficiency in Python for data processing.
* Experience building backend services or APIs using frameworks like FastAPI.
* Solid understanding of dimensional data modeling principles (star schema) and handling normalized datasets; see the join sketch after this list.
* Familiarity with enterprise messaging patterns and data integration from various sources (API-based and file-based).
* Experience working with GitHub and CI/CD pipelines (GitHub Actions or similar).
* Infrastructure-as-Code experience with Bicep or comparable tools (Terraform, AWS CDK).
* Comfort with spec-driven development and leveraging AI tools like GitHub Copilot for scaffolding.
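To make the star-schema point concrete, here is a minimal Polars sketch of joining a fact table to its dimensions on surrogate keys; all table and column names are illustrative assumptions, not a schema from this role:

```python
# Sketch only: tiny in-memory fact and dimension tables with assumed names.
import polars as pl

fact_sales = pl.DataFrame({
    "date_key": [20240101, 20240102],
    "product_key": [1, 2],
    "units_sold": [30, 12],
})
dim_product = pl.DataFrame({
    "product_key": [1, 2],
    "product_name": ["Widget", "Gadget"],
})
dim_date = pl.DataFrame({
    "date_key": [20240101, 20240102],
    "month": ["2024-01", "2024-01"],
})

# Denormalize by joining the fact table to its dimensions, then aggregate.
report = (
    fact_sales
    .join(dim_product, on="product_key")
    .join(dim_date, on="date_key")
    .group_by("month", "product_name")
    .agg(pl.col("units_sold").sum())
)
print(report)
```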