Job Description
This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Staff Production Operations Engineer based in the United States.
This role sits at the intersection of reliability engineering, automation, and operational excellence, supporting large-scale distributed systems that process high volumes of real-time data.
You will be responsible for improving system reliability, streamlining incident management workflows, and building automation that reduces operational overhead across engineering teams.
The position plays a key role in shaping how incidents are detected, managed, and learned from, ensuring faster resolution and continuous improvement across production environments.
You will collaborate closely with engineering, product, and customer-facing teams to maintain high availability and performance standards across global systems.
The role places strong emphasis on using AI-driven tooling to automate repetitive operational tasks and speed up incident response.
This is a high-impact role ideal for someone who thrives in fast-paced infrastructure environments and enjoys combining SRE discipline with automation and tooling innovation.
