Senior Research Scientist, ML Efficiency, Google Research
Posted 1 week ago
Job Description
- Advance algorithms, sampling techniques, and optimization methods to make serving and inference of generative AI models more efficient and flexible. This includes model compression, knowledge distillation, and quantization strategies.
- Innovate algorithms and large language model architectures that improve the computational efficiency of training and the generalization of learned models.
- Improve the model deployment pipeline, including entirely new formulations of pretraining, instruction tuning, reinforcement learning, thinking, and reasoning.
- Collaborate with Hardware and Software teams to optimize kernels and inference engines across different hardware and model architectures.
- Optimize latency, memory bandwidth, and workloads.