Google

Software Engineer, TPU Software Systems, Cloud

Posted Yesterday

Job Description

  • Design and maintain TPU supercomputer software across multiple layers of the stack, from daemons on host machines to network routing rules embedded directly in the TPUs.
  • Develop and manage control software on specialized machines and distributed infrastructure to support the operation of massive collections of networked hardware.
  • Implement robust systems to monitor, deploy, qualify, and service supercomputing systems, ensuring they remain reliable and performant at scale.
  • Engineer software solutions for the reliable scale-out and scale-up of accelerators, tailored to massive-scale machine learning applications.
  • Architect and build software to optimally interconnect TPUs, enabling efficient execution of the collective operations used in data-parallel training, such as ring all-reduce.
