Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Principal MLOps Platform Engineer in the United States.
This role sits at the center of building and operating a next-generation AI and MLOps platform designed to support production-grade machine learning and agentic systems at scale. You will be responsible for designing the infrastructure backbone that enables model deployment, observability, orchestration, and cost-efficient runtime operations across cloud environments. The position combines deep cloud engineering, platform architecture, and MLOps expertise, with a strong focus on reliability and automation. You will define how models and LLM-powered services are deployed, monitored, and governed in production. Working across engineering, data, and AI teams, you will ensure seamless integration of ML workflows into scalable, secure, and observable systems. This is a high-impact role where your work directly shapes platform performance, developer experience, and operational efficiency. You will also help establish best practices for cost control, environment management, and production readiness of AI systems.
