Member of Technical Staff, Pre/Mid-Training

San Mateo, USA
Posted 3 months ago
Full-time, remote

Job Description

The Role
We seek experienced scientists and engineers with deep expertise in pre-training and mid-training large language models. You will advance our diffusion-based LLMs, developing novel training techniques and pushing the boundaries of parallel token generation.

Key Responsibilities
  • Design, develop, and optimize architectures for diffusion-based language models.
  • Implement innovative training objectives and loss functions for discrete diffusion LLMs.
  • Research and implement techniques for controlled text generation and constraint satisfaction.
  • Develop methods for multi-modal integration within the diffusion framework.
  • Improve model efficiency, reduce training time, and optimize inference throughput.

Qualifications
  • BS/MS/PhD in Computer Science or a related field (or equivalent experience).
  • At least 2 years of experience working on ML projects in PyTorch (or equivalent), preferably in a research lab or engineering role.
  • Strong familiarity with transformers and core LLM concepts (autoregressive pretraining, instruction tuning, in-context learning, KV caching).
  • Familiarity with training and inference in diffusion models.
  • Experience training deep learning models at scale in distributed computing environments.

Preferred Skills
  • Extensive experience training transformer-based language models from scratch.
  • Knowledge of advanced training techniques (mixed precision, gradient accumulation, etc.).
  • Experience with multi-modal learning and cross-modal architectures.
  • Background in optimization theory and neural network architecture design.
  • Experience with LLM serving frameworks like vLLM, SGLang, or TensorRT.

Member of Technical Staff, Pre/Mid-Training at Inception | Renata