Member of Technical Staff, Kernels

San Mateo, USA · Posted 3 months ago

Full-time · Remote

Job Description

The Role

We're looking for engineers and scientists to design, optimize, and maintain the compute foundations that power large-scale language model training and inference. You will develop high-performance ML kernels, enable efficient low-precision arithmetic, and improve the distributed compute stack that makes training and serving large models possible.

Key Responsibilities

Design and implement custom ML kernels (CUDA, CuTe, Triton) for core dLLM operations such as attention, matrix multiplication, gating, and normalization, optimized for modern GPU architectures.
Design compute primitives to reduce memory bandwidth bottlenecks and improve kernel efficiency.
Contribute to infrastructure stability and scalability, ensuring reproducibility, consistency across precision formats, and high utilization of compute resources.

Qualifications

BS/MS/PhD in Computer Science, Engineering, or a related field (or equivalent experience).
Proficiency in CUDA, CuTe, Triton, or other GPU programming frameworks.
Understanding of ML frameworks (PyTorch, TensorFlow) from a systems perspective.
Background in performance optimization and profiling of ML systems.
Experience implementing low-precision formats (FP8, INT8, block floating point) or contributing to related compiler stacks (XLA, TVM).
Familiarity with distributed training techniques (data parallel, model parallel, pipeline parallel).
Proficiency in Python and at least one systems programming language (C++/Rust/Go).
Experience with containerization (Docker), orchestration (Kubernetes), and CI/CD pipelines.

Preferred Skills

Experience building and maintaining large-scale language models with tens of billions of parameters or more.
Experience with distributed systems and cloud computing platforms (AWS/GCP/Azure).
Familiarity with distributed frameworks such as PyTorch/XLA, DeepSpeed, and Megatron-LM.
Prior contributions to open-source deep learning infrastructure such as PyTorch, DeepSpeed, or XLA.
