Dahl Consulting

Senior Engineering Manager AI Inference Platform, Distributed Cloud

Posted Today

Job Description

  • Lead, mentor, and grow a high-performing team of systems and ML engineers. Drive a culture of excellence, psychological safety, and continuous learning while guiding career paths and OKRs.
  • Define the technical vision and strategy for enhancing the LLM serving stack, focusing on performance, scalability, and resource efficiency.
  • Oversee the infrastructure and tooling for in-depth performance analysis, profiling, and benchmarking of LLMs on GPU accelerators.
  • Partner closely with Research, SRE, Product, and core library teams to optimize and deploy LLMs globally.
  • Drive the design, implementation, and optimization of advanced serving architectures, including disaggregated serving. Collaborate with core library and kernel partners to eliminate low-level performance bottlenecks, maximize resource utilization, and minimize latency.
