Dahl Consulting

Software Engineer, TPU Software Systems, Cloud

Posted 2 days ago

Job Description

  • Design and maintain TPU supercomputer software across multiple stack layers, ranging from daemons on host machines to network routing rules embedded directly into the TPUs.
  • Develop and manage control software on specialized machines and distributed infrastructure to support the operation of massive collections of networked hardware.
  • Implement robust systems to monitor, deploy, qualify, and service supercomputing systems, ensuring they remain reliable and performant at scale.
  • Engineer software solutions for the reliable scale-out and scale-up of accelerators, specifically tailored to meet the needs of massive-scale machine learning applications.
  • Architect and build software to optimally interconnect TPUs, enabling efficient execution of the collective operations used in data parallelism, such as ring all-reduce.
