Job Description
Purpose
We are looking for a Principal Data Engineer to sit at the intersection of data engineering and applied data science. You will own the design, development, and operation of the platforms and pipelines that power our data science capabilities, ensuring data flows reliably from source systems through to analysis and business consumption.
This role is roughly 75% data engineering and 25% data science and is ideal for someone who builds with engineering rigor but thinks with a data science mindset; someone who is energized by building platforms that make AI real in an organization. The right candidate is curious by nature — you explore out-of-the-box ideas and stay current with the fast-moving AI/Machine Learning (ML) landscape.
You’ll work directly with business analysts, product owners, business end-users, engineering and application teams, and our own data/platform engineering teams. A consultative communication style is critical, because shared outcomes across technology and business are expected.
Responsibilities
Platform Architecture & Strategy
Define the long-term technical direction for the data science platform and integration with existing ELT pipelines
Ensure platforms are scalable, reliable, secure, and cost-efficient at enterprise scale
Evaluate and adopt emerging tools in the modern data and ML stack
Data Engineering Development
Design, develop, and optimize ETL pipelines and outbound data feeds
Develop and follow templates and engineering patterns that reduce the time to deploy new data assets or changes to existing data models and analytics solutions
Partner with key business teams to understand their data needs and help them build data solutions that meet those needs
Data Science Development
Design, build, and optimize end-to-end data science pipelines — from raw data ingestion through feature engineering, model training, and inference serving
Contribute to MLOps practices including model versioning and monitoring, supporting the transition of data science work into production
Technical Leadership & Mentorship
Provide technical guidance to data engineers
Conduct code reviews and champion engineering best practices across workstreams
Lead without direct authority, influencing cross-functional teams across data engineering, analytics, and product ownership
Data Governance & Quality
Establish best practices for data quality, lineage, privacy, and security across data engineering and science pipelines
Ensure model inputs and outputs are auditable, reproducible, and compliant with data governance standards
Stakeholder Management
Partner with data engineering, product owners, and software engineers to align platform capabilities with organizational AI/ML goals
Translate complex technical concepts into clear, actionable insights for non-technical stakeholders
About You
Bachelor’s degree in computer science, engineering, mathematics, or a related field, OR 7+ years of equivalent verifiable experience, skillset, and record of accomplishment
Experience in a Principal or Senior Data Engineer role with direct involvement in ML platform or Data Science work
Proficiency in an analytics/BI tool such as Power BI
Data Engineering experience:
Modern data stack technologies — Databricks (strongly preferred), Snowflake, Spark
Inbound and outbound data transfer via APIs and FTP
MPP databases such as Databricks, Snowflake, BigQuery, Teradata, or Azure Synapse
Cloud platforms — AWS, Azure, or GCP
Python and SQL
ML & Data Science experience:
Building and deploying ML models (classification, regression, forecasting, NLP, or similar)
Familiarity with ML frameworks such as scikit-learn, XGBoost, PyTorch, or TensorFlow
MLflow or similar tools for experiment tracking, model registry, and deployment
Understanding of feature engineering, model evaluation, and common ML failure modes
Architecture experience:
Strong understanding of data modelling techniques (Kimball, Data Vault) and distributed systems
Familiarity with feature stores, training pipelines, and batch/real-time inference architectures
