Job Description
- Leverage SQL and Python to embed self-service frameworks and automated evaluation systems into developer pipelines, enabling product teams to run standard evaluations autonomously.
- Act as the operational executor for complex, high-risk, and bespoke strategic evaluations, bridging the gap between defining safety quality and enforcing it.
- Partner with cross-functional stakeholders—including engineering teams, policy experts, and launch leadership—to develop and own intake triage and handoff governance protocols.
- Develop, maintain, and execute automated quality rubrics across testing services to ensure actionable results. Drive initiatives to significantly increase the use of automated evaluations and optimize operational resource allocation.
- Work autonomously to identify and solve problems, and collaborate effectively within a team to develop comprehensive solutions.
- This role works with sensitive content or situations and may be exposed to graphic, controversial, and/or upsetting topics or content.
