Dahl Consulting

Staff Software Engineer, Gemini Evals, GenAI, DeepMind

Posted 2 weeks ago

Job Description

  • Design and optimize distributed evaluation execution engines capable of orchestrating large volumes of inference steps across TPU and Google compute unit (GCU) pools with high throughput and low latency.
  • Build foundational abstractions to evaluate complex LLM agent loops, tool use, and automated LLM-as-a-judge rating systems.
  • Design error classification, automated retry policies, and observability dashboards to maintain strict service level objectives (SLOs) for evaluation pipeline success rates.
  • Partner closely with GDM research scientists and Data Science teams to anticipate frontier model evaluation requirements and translate them into elegant infrastructure solutions.
  • Mentor fellow engineers, set high standards for code quality (Python in Google3), and advocate for sound testing and system design practices.
