Job Description
At Outreach, we build the technology that powers the world’s leading sales execution platform — and over the last year, we have been moving fast to bring AI to the center of how revenue teams work. We have shipped agents that research accounts, personalize outreach, run meetings, and drive revenue workflows end-to-end. With Ask Outreach, we have built a fully agentic, conversational platform on LangGraph that lets users interact with their Outreach data and workflows in entirely new ways.
As we scale the depth and breadth of our AI platform, quality is not an afterthought; it is foundational. We are looking for a Staff AI Test Engineer who is first and foremost an exceptional quality engineer, and who brings genuine curiosity about, and a working understanding of, how AI and LLM-based systems behave, fail, and improve. If you are passionate about building rigorous test strategies for complex, probabilistic systems at scale, we want to talk to you.
We are seeking a Staff-level engineer to own quality for our GenAI platform and agent ecosystem. This is a high-impact, strategic role where you will define and lead testing practices across a rapidly evolving agentic platform — including the agents themselves, the tools they call, the LangGraph orchestration layer, and the underlying ML pipelines and data flows.
This role requires someone who understands the unique challenges of testing AI systems: outputs are not always deterministic, correctness is often contextual, and traditional pass/fail assertions are insufficient on their own. You will design and implement evaluation frameworks that combine deterministic validation with LLM-based grading, establish quality standards for agent behavior, and partner closely with Data Science, Engineering, and Product teams to make quality a shared discipline.
You will be a senior voice in how we build, ship, and continuously improve AI products at Outreach.