WatchGuard Technologies

Senior Data Engineer

Seattle, Washington
Posted 1 week ago
Full-time, remote

Job Description

We are looking for a Senior Data Engineer to join our growing data platform team. You will own the design, build, and reliability of our cloud-native data lakehouse — from raw ingestion through to analytics-ready Gold tables. You will work closely with data analysts, analytics engineers, and product stakeholders to deliver trusted data at speed, while championing data quality and observability as first-class concerns.

This role sits at the intersection of data engineering and platform engineering — you will be expected to think in architectures, not just pipelines.


What You Will Do

Data Platform & Pipeline Engineering

▸ Design, build, and maintain scalable ETL/ELT pipelines using Azure Data Factory (ADF) and Apache Airflow, processing structured and semi-structured data across the Medallion architecture (Bronze → Silver → Gold).

▸ Implement incremental load patterns, change data capture (CDC), and event-driven ingestion to ensure data freshness across the platform.

▸ Build and optimise Snowflake data warehouse objects — tables, views, dynamic tables, streams, tasks, and stored procedures — for performance and cost efficiency.

▸ Develop modular, tested dbt models aligned to each Medallion layer, enforcing consistent naming conventions, documentation, and lineage across all transformations.


Data Quality & Observability

▸ Embed automated data validation at every Medallion layer using Elementary (a dbt-native observability tool), ensuring anomaly detection, freshness checks, and schema drift alerts are in place before data reaches consumers.

▸ Define and enforce data contracts between producers and consumers — row count checks, null rate thresholds, referential integrity, and value domain validation.

▸ Build and maintain data quality dashboards to give engineering and business stakeholders real-time confidence in platform health.


Azure Cloud Infrastructure

▸ Manage and optimise Azure Data Lake Storage Gen2 (ADLS) — folder structures, lifecycle policies, access tiers, and partition strategies.

▸ Build and maintain Azure Functions and Azure Logic Apps for lightweight event-driven processing, orchestration triggers, and operational automation.

▸ Manage secrets, credentials, and environment-specific configuration securely using Azure Key Vault — no hardcoded credentials in pipelines or code.

▸ Contribute to infrastructure-as-code practices for provisioning Azure data services (Terraform or Bicep preferred).


Collaboration & Delivery

▸ Translate ambiguous business requirements into well-defined data models and pipeline designs, working with analysts and stakeholders to validate assumptions before build.

▸ Participate in code reviews, enforce standards, and mentor junior engineers on data engineering best practices.

▸ Support CI/CD adoption for pipeline and dbt model deployment across Dev / Test / Prod environments.


What We Are Looking For

Must-Have

▸ Snowflake

– Advanced SQL — window functions, CTEs, recursive queries, query profiling

– Snowflake-native features: streams, tasks, Snowpipe, dynamic tables, row-level security

– Virtual warehouse tuning and credit cost optimisation

▸ dbt + Elementary

– Writing, testing, and documenting production dbt models

– Elementary integration for data observability and anomaly detection

– dbt incremental strategies, snapshots, and semantic layer

▸ Azure Cloud

– Azure Data Factory — pipeline authoring, triggers, parameterisation, linked services

– ADLS Gen2 — zone/folder design, lifecycle management, Parquet/Delta partitioning

– Azure Key Vault — secret management, managed identities

– Azure Functions / Logic Apps — event-driven triggers and lightweight automation

▸ Airflow

– DAG authoring, task dependencies, XCom, sensors, and connection management

– Airflow deployment and monitoring in cloud-hosted environments

▸ Python

– Data pipeline scripting, PySpark basics, REST API integration

– Unit testing pipeline logic and transformation functions

▸ Data Quality & Medallion Architecture

– Hands-on experience implementing Bronze / Silver / Gold Medallion architecture

– Data validation checks at each layer — not just at the final Gold layer

– Schema evolution handling and SCD Type 2 dimension management

▸ 4+ years of professional data engineering experience with at least 2 years on Azure cloud data platforms.


Nice-to-Have

▸ Exposure to Snowflake Cortex, dbt Semantic Layer, or Boomi Data Hub for AI-assisted data enrichment within pipeline layers.

▸ Experience integrating LLM-based quality checks or AI-assisted anomaly detection into data workflows.

▸ Familiarity with Microsoft Fabric and OneLake as a complementary or future-state platform.

▸ Knowledge of data mesh or data product thinking and how it maps to Medallion layer ownership.

▸ Experience with Terraform or Bicep for Azure infrastructure provisioning.
