
Principal SDE - Cloud, Platform & Agentic AI
Job Description
Enterprise Architecture & Platform Leadership
Define and evolve reference architectures for distributed, low‑latency, high‑TPS systems deployed on AWS and Kubernetes (EKS).
Serve as a final technical escalation point for complex architectural and system design decisions across multiple teams.
Lead architectural trade‑off decisions spanning compute, networking, data, resiliency, observability, and cost, each anchored to explicit SLAs/SLOs.
Own and evangelize non‑functional requirements (latency, throughput, availability, scalability, operability, security, cost efficiency).
Drive architectural consistency while enabling team‑level autonomy through clear patterns and guardrails.
Deep Hands‑On Engineering & Validation
Design and review critical execution paths in Java‑based microservices (Spring Boot, gRPC, REST).
Build sophisticated POCs and spike solutions to de‑risk architecture decisions and emerging technologies before broad adoption.
Perform advanced production debugging across application, JVM, Kubernetes, and AWS layers.
Model “prototype‑first” engineering by validating ideas experimentally before standardizing them.
Remain hands‑on enough that designs are grounded in real operational constraints.
Cloud & Kubernetes Expertise (AWS / EKS)
Architect and guide large‑scale AWS environments leveraging:
EKS, EC2, ALB/NLB, VPC, IAM, Auto Scaling
Logging, metrics, distributed tracing, and alerting
Define Kubernetes platform patterns:
Pod and node design
HPA and autoscaling strategies
Multi‑AZ and resilience models
Safe rollout, rollback, and canary strategies
Influence security‑first, cost‑aware, and operationally efficient cloud designs at scale.
Agentic AI & AI‑Augmented Engineering
Architect and prototype Agentic AI–driven engineering systems embedded into the SDLC.
Drive adoption and safe usage of GenAI tools such as GitHub Copilot and custom AI agents to:
Accelerate development and refactoring
Improve code quality, reviews, and test coverage
Reduce cognitive load for engineers
Experiment with AI‑enabled workflows across:
Design → Build → Test → Deploy → Operate
Partner with governance and security stakeholders to ensure responsible, compliant AI usage.
Engineering Excellence, Standards & Mentorship
Define and evolve engineering standards, reference implementations, and best practices across the organization.
Perform deep architectural and code reviews for high‑impact systems.
Mentor senior and staff engineers; coach teams on performance, cloud‑native design, and system thinking.
Influence technical direction without formal authority, through credibility, clarity, and results.
Work Experience
12+ years of hands‑on software engineering experience
Expert‑level Java experience with large‑scale microservices platforms
Proven success designing and operating high‑throughput, low‑latency systems
Advanced knowledge of:
Distributed systems
Concurrency, performance tuning, and JVM internals
Fault tolerance, resiliency patterns, and graceful degradation
Cloud & Platform Engineering
Extensive production experience with Kubernetes (EKS) at scale
Deep expertise in AWS networking, security, and scaling patterns
Experience designing cloud‑native, service‑based, or event‑driven platforms
Strong operational mindset (observability, incident response, reliability)
AI & Modern Engineering Practices
Hands‑on experience applying Generative AI / Agentic AI in real engineering workflows
Active usage of GitHub Copilot or equivalent AI tooling
Experience integrating AI into developer platforms or SDLC automation (preferred)
Technical Leadership
Recognized ability to influence architecture across teams and platforms
Experience serving as a go‑to expert for the most complex technical problems
Strong communication skills, with the ability to articulate trade‑offs clearly to engineers and leadership alike
Preferred Qualifications
Experience in payments, fintech, or large‑scale transaction processing systems
Experience with gRPC, async messaging, or streaming platforms (e.g., Kafka)
Strong grounding in observability and SRE‑aligned practices
Proven track record of modernizing legacy systems to cloud‑native architectures