
Director - Public Cloud Operations - US
Job Description
Joining Amex Tech means discovering and shaping your contribution to something big. Here, you can work alongside talented tech teams and build a unique career with the Powerful Backing of American Express. With a range of opportunities to work with the latest technologies, and a commitment to back the broader engineering community through open source, our mission is to power your success. Because Amex Tech is powered by our technology, our culture, and our colleagues.
The Technology organization enables and accelerates the company’s growth strategies, delivering global capabilities and services in support of Amex’s customers and colleagues, while maintaining 24/7 servicing and availability to ensure an uninterrupted, high-quality customer experience. Technology provides the foundation for everything we do in the company while driving differentiation through building and leveraging innovative technology and data insights.
The American Express Enterprise Cloud team is seeking an experienced infrastructure leader to help build and operate world‑class cloud platforms and infrastructure, supported by integrated CI/CD, observability, and security capabilities.
The Director Infrastructure Engineering - Head of Public Cloud US Operations (AWS, Azure and GCP) is responsible for leading the strategy, execution, and continuous improvement of cloud operations across Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). This role ensures secure, reliable, scalable, and cost-effective cloud environments that support enterprise applications and digital transformation initiatives.
The ideal candidate brings deep technical expertise in cloud infrastructure, a strong operational mindset, and proven leadership experience across financial governance (FinOps), automation, and large‑scale platform operations.
Cloud Operations Leadership
- Lead and manage U.S. public cloud operations across production and non‑production environments for AWS, Azure, and GCP.
- Drive incident, problem, and change management processes to ensure operational stability.
- Ensure high availability, performance, resilience, and operational excellence across cloud platforms.
- Partner closely with cloud service providers on operational performance and resiliency initiatives.
- Collaborate with engineering, product, architecture, and security teams.
- Support enterprise cloud migration and modernization programs.
Cloud Infrastructure & Reliability
- Oversee infrastructure design, deployment, monitoring, and optimization.
- Implement and govern Infrastructure as Code (IaC) using tools such as Terraform.
- Drive Site Reliability Engineering (SRE) principles, including automation and reliability engineering.
- Lead disaster recovery and business continuity strategies.
Security & Compliance
- Partner with cybersecurity teams on vulnerability management and risk mitigation.
- Implement best practices for identity and access management (IAM), encryption, logging, and monitoring.
- Participate in internal and external audits, as required.
Automation & DevOps Enablement
- Champion automation‑first operational models.
- Integrate CI/CD pipelines with cloud infrastructure.
- Reduce manual operational overhead through scripting and tooling.
Financials and Reporting
- Drive cloud cost‑efficiency initiatives across workloads.
- Lead cloud inventory management and usage reporting.
Team Leadership & Development
- Build, mentor, and retain high‑performing cloud operations teams.
- Establish performance metrics and career development plans.
- Foster a culture of accountability, innovation, and continuous improvement.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field (Master’s preferred).
- 10+ years of experience in Platform Engineering & Operations, API Support, or Site Reliability Engineering (SRE), with a proven track record of leading teams that manage large-scale cloud infrastructure with a focus on reliability and resilience.
- Deep, hands‑on experience with AWS, Azure, and/or GCP (multi‑cloud experience preferred).
- Strong experience with:
  - Infrastructure as Code (e.g., Terraform, CloudFormation)
  - Container platforms (e.g., EKS, GKE)
  - Monitoring tools (e.g., Datadog, Prometheus, CloudWatch, Azure Monitor)
  - CI/CD pipelines (e.g., Jenkins, GitHub Actions, Azure DevOps)
- Strong understanding of cloud networking, security, and architecture.
- Experience managing large-scale and mission-critical production environments.
- Proven experience in financial management and cloud cost optimization.
- Relevant certifications preferred:
  - AWS Solutions Architect – Professional
  - Azure Solutions Architect Expert
  - GCP Professional Cloud Architect
- Experience with observability tools such as Prometheus, ELK, and Dynatrace.
- Strong analytical and problem-solving skills, with the ability to troubleshoot complex issues and drive resolution in a fast-paced environment.
- Excellent communication and leadership skills, with the ability to effectively collaborate with cross-functional teams and influence decision-making at all levels of the organization.
Employment eligibility to work with American Express in the United States is required as the company will not pursue visa sponsorship for these positions.
At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. From delivering differentiated products to providing world-class customer service, we operate with a strong risk mindset, ensuring we continue to uphold our brand promise of trust, security, and service.
As part of Team Amex, you’ll experience our powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.