
Cloud Reliability & Recovery Engineer
Job Description
This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Cloud Reliability & Recovery Engineer based in India.
This is a senior, hands-on cloud engineering role focused on building and maintaining highly resilient, always-available AWS environments. You will design and operate large-scale disaster recovery (DR) and business continuity planning (BCP) frameworks that keep critical systems operational even during major disruptions. The role sits at the intersection of SRE, infrastructure engineering, and incident response, with a strong emphasis on automation, fault tolerance, and cloud-native architecture. You will work extensively with Kubernetes, Terraform, and AWS-native resilience services to engineer multi-region failover and recovery strategies. The environment is fast-paced, security-conscious, and highly collaborative, involving close partnership with infrastructure, security, and application teams. Your work will directly reduce downtime risk and strengthen global service reliability across mission-critical systems.