Google

Technical Program Manager III, Capacity Management, Cloud

Kirkland, WA, USA
Posted 6 days ago
onsite

Job Description

  • Lead cross-functional programs related to ML Fleet capacity management, including the design, update, and maintenance of ML Fleet's cluster-level allocation plan of record.
  • Drive the development, implementation, and ongoing maintenance of fleet-wide accelerator and auxiliary resource usage metrics, policies, and governance frameworks.
  • Identify gaps and drive initiatives to improve existing tooling and processes, enhancing the efficiency, agility, and responsiveness of ML capacity allocation and management.
  • Partner with key stakeholders including ML Strategy and Allocation (MLSA), Product Area Resource Management teams (PARMs), capital engineering, supply teams, tooling engineering (e.g., OneFleet, Tpulse, GQM Dev), and system infrastructure SREs (e.g., Spatial Flex, PIE).
  • Manage communications and escalations related to ML resource allocation, performance, and strategic shifts for product areas and partners.
