Dahl Consulting

Software Engineer III, ML Networking

Posted 1 week ago

Job Description

  • Analyze the networking issues associated with the next generations of GPU hardware, and design, build, and deploy whatever is needed to make them work optimally in our data centers.
  • Achieve optimal workload performance with the NVIDIA Collective Communications Library (NCCL) on Graphics Processing Units (GPUs).
  • Compile a comprehensive analysis of performance across different GPU and network generations.
  • Determine how customers' ML models will evolve once we have 72-node NVLink domains.
  • Execute full-stack optimization for ML networking performance on Google's infrastructure, spanning a wide range from kernel optimization to user-space communication libraries.
