Senior Data Scientist – Manufacturing Intelligence, Machine Learning & AI
Job Description
We are looking for a Senior Data Scientist to help build advanced analytics, machine learning, and AI solutions for manufacturing operations.
This role will focus on using factory data to detect anomalies, improve quality, reduce downtime, optimize throughput, and support reusable data models that connect fragmented manufacturing systems into a common intelligence layer.
The ideal candidate has strong applied machine learning skills, practical experience working with complex operational data, and the ability to partner with manufacturing, data engineering, platform, and software teams to move analytical solutions toward production.
This is not a pure research role. We are looking for someone who can carry work from problem framing through data understanding, model development, validation, and stakeholder alignment to production support. The candidate should be able to learn unfamiliar domains quickly, challenge assumptions constructively, and push back when requirements, data quality, or model expectations are not realistic.
Manufacturing experience is strongly preferred, but we are also open to candidates from adjacent industrial, operations, quality, aerospace, semiconductor, supply chain, or equipment-heavy environments who can learn the manufacturing domain quickly.
Summary of Data Science Work in a Manufacturing Environment
A Data Scientist in manufacturing works at the intersection of factory operations, engineering, quality, maintenance, data platforms, and machine learning.
The work is not only about building models. It includes understanding how the plant operates, identifying where data is generated, defining what “normal” and “abnormal” look like, creating reliable features from machine and process signals, validating model outputs against real-world outcomes, and delivering insights that plant teams can act on.
Typical manufacturing data science work includes detecting process drift, identifying abnormal machine behavior, predicting quality issues, improving equipment health visibility, supporting root cause analysis, and helping teams move from reactive firefighting to proactive detection, triage, and prevention.
Success requires technical depth, curiosity about manufacturing, practical judgment, and the ability to build solutions that work with messy, incomplete, noisy, and high-frequency industrial data.
Key Responsibilities
Applied Machine Learning & Analytics
Develop machine learning and statistical models to support manufacturing use cases such as anomaly detection, quality prediction, equipment health, process monitoring, throughput improvement, and decision support.
Apply supervised, unsupervised, and semi-supervised learning methods, including classification, regression, clustering, and anomaly detection, alongside time-series analysis, statistical process control, and model explainability techniques.
Build anomaly detection solutions using methods such as control limits, isolation forests, clustering, Mahalanobis distance, autoencoders, time-series models, and supervised classification where labeled defects are available.
Develop models for specific plant problems such as stamping split detection, weld quality, paint defects, assembly issues, predictive maintenance, bottleneck detection, and process optimization.
Evaluate model performance using appropriate metrics, ground truth definitions, validation strategies, false positive and false negative analysis, and business impact measures.
Identify when data is insufficient, labels are unreliable, ground truth is weak, or a machine learning approach is not appropriate, and communicate those limitations clearly.
Manufacturing Data & Feature Engineering
Analyze real-time and historical factory data from sources such as PLCs, sensors, machines, MES, SCADA, historians, quality systems, maintenance systems, production logs, and enterprise platforms.
Create features from manufacturing signals such as cycle time, pressure, temperature, torque, vibration, current, force, cushion pressure, line speed, JPH, FTT, FRC, scrap, rework, downtime, and fault codes.
Work with noisy, incomplete, high-frequency, or fragmented industrial data to create reliable analytical datasets.
Build features that reflect manufacturing context, including asset hierarchy, station behavior, part flow, process sequence, shift patterns, tool usage, maintenance history, supplier variation, and quality outcomes.
Partner with plant teams and domain experts to understand process behavior, validate assumptions, and determine whether model outputs reflect real operating conditions.
Cloud, Data Pipelines & MLOps
Use cloud data platforms, preferably GCP, to support scalable analytics and machine learning workflows.
Develop data pipelines, in partnership with Data Engineering, that ingest, transform, and prepare manufacturing data for analysis, modeling, monitoring, and reporting.
Work with tools such as BigQuery, Cloud Storage, Pub/Sub, Dataflow, Vertex AI, Cloud Run, Cloud Functions, Looker, or similar cloud services.
Support real-time and near-real-time analytics use cases by working with streaming data delivered through MQTT, Kafka, Pub/Sub, Dataflow, or similar event-driven technologies.
Partner with platform and software engineering teams to move models and analytical workflows from prototype to production-ready solutions.
Follow MLOps practices such as experiment tracking, model versioning, model deployment, model monitoring, drift detection, retraining workflows, and production documentation.
Monitor model performance after deployment, including false positives, false negatives, data drift, model drift, latency, uptime, pipeline failures, and changing manufacturing conditions.