Site Reliability Manager, Data Center Networking, SRE
Posted Yesterday
Job Description
- Build a cohesive, mission-first culture across locations. Scale your leadership by empowering trusted Tech Leads and domain experts, while actively prioritizing work to sustain high performance and on-call health.
- Act as the ultimate execution owner for the team's strategic efforts. Focus on substantially reducing Mean Time to Detect (MTTD) and Mean Time to Mitigate (MTTM) for incidents through advanced signaling and tooling, and by integrating new signals into auto-mitigation systems.
- Set the bar for developer excellence across the SDN ecosystem. Influence the design and rollout of new network products (NPIs) to ensure they are introduced safely, deliver high reliability to GCP customers, and preserve system simplicity.
- Partner closely with PLANET and sibling SRE shards to define and monitor Network Service Level Objectives (SLOs), co-own blameless postmortems, and establish end-to-end repair coverage for network infrastructure.