Senior Staff Research Engineer, DeepMind
Mountain View, CA, USA
Posted 2 weeks ago
onsite
Job Description
- Construct quantitative benchmarks and automated evaluation frameworks (including LLM-as-a-judge) to measure agent capabilities in reasoning, planning, and tool use.
- Create and optimize data mixes drawn from user feedback to train and fine-tune agents for better performance on specific tool-use tasks.
- Analyze agent behavior to identify failure modes, edge cases, and performance bottlenecks, turning these insights into actionable improvements.