Job Description
- Interpret our collection of automated and human quality metrics as indicators of overall product health, identifying high-impact headroom opportunities — for example, combining autorater scores and user telemetry to pinpoint where the Gemini agent needs improvement.
- Advocate for a culture of metric-informed decision-making, experimentation, and high-quality data modeling.
- Act as a go-to expert within the team on specific data science methodologies related to AI evaluation.
- Build and prototype analyses and business cases iteratively to provide insights at scale. Develop comprehensive knowledge of Google data structures and metrics, advocating for changes where needed for product development.
- Stay abreast of the latest advancements in AI evaluation, data science, and agentic AI, and apply them to improve our team's practices.
