Systems Engineer, Site Reliability Engineering, Customer Fabric Networks
Posted Yesterday
Job Description
- Improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
- Guide team members on managing availability and performance of mission-critical services, building automation to prevent problem recurrence, and developing automated responses for non-exceptional service conditions.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health, while leading sustainable incident response and blameless postmortems.
- Scale systems sustainably through mechanisms like automation and evolve systems by driving changes that improve reliability and velocity.