Job Description
- Develop and enhance the Kueue open-source project, focusing on core scheduling algorithms, queueing mechanisms, and overall performance.
- Pioneer the use of Agentic AI to assist in Kueue code development and to diagnose complex scheduling issues.
- Implement support for accelerators, ensuring efficient and high-performance scheduling for hardware like TPU7X, TPU8, and Nvidia GB200/GB300, as well as large-scale CPU workloads.
- Innovate on advanced scheduling concepts, such as topology-aware scheduling to optimize network locality and elastic workload support to enable dynamic scaling.
- Integrate Kueue with popular AI/ML frameworks like Pathways and Ray.
- Engage actively with the Kubernetes and CNCF open-source communities to drive the direction of AI workload scheduling.
