
Cloud Reliability & Recovery Engineer
Job Description
This position is listed on behalf of a partner company, who manages all applications and next steps. Our partner is looking for a Cloud Reliability & Recovery Engineer based in India.
This role sits at the core of large-scale cloud resilience engineering, focused on ensuring critical systems remain highly available, fault-tolerant, and recoverable under any disruption. You will design and operate advanced AWS-based disaster recovery and business continuity architectures across multi-region environments. The position requires deep hands-on engineering expertise in cloud infrastructure, automation, and reliability practices, with a strong emphasis on Kubernetes, Infrastructure as Code, and CI/CD-driven operations. You will work closely with security, infrastructure, and application teams to define and enforce recovery strategies aligned with strict RTO/RPO objectives. This is a highly technical role where you will build automated DR systems, validate resiliency through chaos engineering, and continuously improve platform stability. The environment is fast-paced, engineering-driven, and focused on measurable reliability outcomes at enterprise scale.
This position is listed on behalf of a partner company, who manages all applications and next steps. Our partner is looking for a Cloud Reliability & Recovery Engineer based in India.
This role sits at the core of large-scale cloud resilience engineering, focused on ensuring critical systems remain highly available, fault-tolerant, and recoverable under any disruption. You will design and operate advanced AWS-based disaster recovery and business continuity architectures across multi-region environments. The position requires deep hands-on engineering expertise in cloud infrastructure, automation, and reliability practices, with a strong emphasis on Kubernetes, Infrastructure as Code, and CI/CD-driven operations. You will work closely with security, infrastructure, and application teams to define and enforce recovery strategies aligned with strict RTO/RPO objectives. This is a highly technical role where you will build automated DR systems, validate resiliency through chaos engineering, and continuously improve platform stability. The environment is fast-paced, engineering-driven, and focused on measurable reliability outcomes at enterprise scale.