Job Description
Overview
The Principal Data Engineer is a senior technical individual contributor (no direct people management required) responsible for driving technical excellence, architectural direction, and engineering best practices for the organization’s analytics data platform on Microsoft Azure.
This role partners with analytics engineers, BI developers, data scientists, product managers, and business stakeholders to design, build, and scale trusted, analytics‑ready datasets, semantic layers, and data products using Microsoft Fabric (OneLake, Lakehouse/Warehouse, Data Pipelines, Dataflows Gen2, Notebooks), enabling governed self‑service insights through Power BI. A key focus is helping execute the modernization roadmap to retire Azure Synapse workloads and transition teams to Fabric and Power BI standards.
The Principal Data Engineer operates with broad autonomy, influences multiple teams, and plays a critical role in shaping long‑term analytics strategy, metric governance, and Microsoft Fabric platform investments while remaining hands‑on with design, implementation, and code.
Key Responsibilities
Data Platform Leadership & Architecture
Lead the design and evolution of the analytics platform using Microsoft Fabric (OneLake, Lakehouse/Warehouse, Notebooks) to deliver curated data layers for self‑service reporting and advanced analytics.
Lead the Synapse retirement and transition to Fabric by inventorying current Synapse workloads, defining target patterns, sequencing migrations, and driving controlled cutovers (parallel runs, reconciliation, and rollback plans) to safely decommission Synapse components.
Define and uphold standards for dimensional/data modeling, transformation patterns, naming conventions, documentation, and reusable semantic definitions—including Power BI semantic models (datasets) and certified/shared assets.
Drive technical decision‑making for high‑impact analytics initiatives: Fabric Lakehouse/Warehouse curated marts, metric layers, KPI frameworks, and shared datasets used across multiple domains (including Finance integrations such as Hyperion and Oracle EBS where applicable).
Data Engineering & Pipeline Development
Build, review, and maintain analytics transformations and curated datasets in Microsoft Fabric with strong engineering rigor (version control, code review, and release discipline).
Lead implementations for new subject areas and data sources using Fabric Data Pipelines and Dataflows Gen2, including incremental loading strategies, slowly changing dimensions, and scalable aggregation patterns.
Ensure analytics data products meet high standards for correctness, freshness, and usability through data quality checks, reconciliation (including finance sources such as Hyperion and Oracle EBS when applicable), and clear documentation for Power BI consumers.
Champion automated data testing, lineage and documentation, and performance optimization for BI workloads across Microsoft Fabric and Power BI semantic models.
Mentorship & Influence
Mentor analytics engineers and data engineers; elevate modeling quality, metric governance, and documentation practices across teams.
Set a strong example through technical depth, ownership, and disciplined delivery of high‑quality, well‑tested models and governed metrics.
Influence without authority across multiple teams to standardize modeling patterns, semantic definitions, and reusable metric layers.
Participate in hiring and technical interviews; assess candidates for strong SQL/modeling skills, stakeholder partnership, and practical analytics delivery.
Strategic Impact
Identify and proactively address systemic analytics risks such as inconsistent definitions, semantic drift, fragile transformations, and data quality gaps that erode trust.
Contribute to long‑term analytics platform roadmaps, including curated domain marts, semantic/metrics layers, and governance processes that scale self‑service, with clear milestones to retire Azure Synapse workloads and decommission legacy components.
Stay current with modern analytics engineering practices (dbt, metric stores, semantic layers, data quality/observability) and assess applicability to the organization.
Partner with Analytics, Finance/Operations, Product, and Application teams to deliver trusted datasets, KPI definitions, and Power BI semantic models aligned to business outcomes.
Required Qualifications
15+ years of professional data/software engineering experience, with significant time focused on data engineering, data management, and data platform work. The ideal candidate will have a proven track record in transformation and delivery.
Deep expertise in analytics data modeling (dimensional, wide tables, SCD patterns) and modern analytical platform patterns (warehouse/lakehouse, curated marts, semantic layers).
Strong proficiency in SQL and one or more programming languages commonly used for data engineering (e.g., Python, Scala, Java).
Demonstrated experience migrating legacy data platforms (e.g., Azure Synapse) to modern equivalents (e.g., Microsoft Fabric) and decommissioning the legacy systems, including workload inventory, remediation of gaps, and controlled cutover plans.
Strong understanding of Microsoft Fabric (OneLake, Lakehouse/Warehouse, Data Pipelines, Dataflows Gen2, Notebooks, SQL endpoints) plus CI/CD and DataOps practices for analytics assets.
Experience enabling governed self‑service analytics with Power BI (semantic modeling, performance tuning, and DAX fundamentals) and integrating or reconciling Finance reporting outputs (e.g., Hyperion) when applicable.
Proven ability to influence data architecture and technical direction across multiple teams, including governance, standards, and shared datasets.
Experience designing analytics data solutions for large‑scale enterprise domains (finance, operations, customer, product) with multiple upstream sources.
Experience with metric governance and analytics enablement capabilities (semantic layers/metric definitions, Power BI datasets, catalogs/lineage, data quality frameworks, and role‑based access control).
Track record of leading cross‑organizational analytics initiatives, aligning stakeholders on common definitions/metrics, and delivering reusable, well‑documented datasets that drive adoption.
