
Site Reliability Engineer (Global) - TikTok Server Arch
Job Description
This position is with TikTok's Stability Assurance Team. The team is responsible for ensuring that the services provided by TikTok are highly reliable and low-latency. Reliability assurance is complex and systematic for any massive application system, and the team focuses on optimizing the application architecture from end to end, driven by data analysis and supported by automatic, intelligent failure recovery.
Job Responsibilities:
- Ensure the online stability of TikTok and improve product SLA through systematic disaster recovery abilities, standardized emergency mechanisms, and intelligent analysis.
- Identify system risks and promote governance through comprehensive and multi-perspective quality data.
- Establish TikTok's unified standards and specifications, design and develop a one-stop operation platform, and enhance efficiency across multiple fields.
- Collaborate closely with developers to implement best practices in SRE.
Minimum Qualifications:
- Bachelor's degree or above in a computer-related field.
- Solid foundational knowledge of computer software; understanding of Linux operating systems, storage, network IO, and related principles.
- Ability to solve problems systematically, strong communication skills, and a sense of ownership.
Preferred Qualifications:
- 3+ years of relevant work experience at a large-scale internet business.