Senior Engineering Manager, AI Inference Platform, Distributed Cloud
Posted Today
Job Description
- Lead, mentor, and grow a high-performing team of systems and ML engineers. Drive a culture of excellence, psychological safety, and continuous learning while guiding career paths and OKRs.
- Define the technical vision and strategy for enhancing the LLM serving stack, focusing on performance, scalability, and resource efficiency.
- Oversee the infrastructure and tooling for in-depth performance analysis, profiling, and benchmarking of LLMs on GPU accelerators.
- Partner closely with Research, SRE, Product, and core library teams to optimize and deploy LLMs globally.
- Drive the design, implementation, and optimization of advanced serving architectures, including disaggregated serving. Collaborate with core library and kernel partners to eliminate low-level performance bottlenecks, maximize resource utilization, and minimize latency.