
AI Research Engineer (Model Compression & Quantization)
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Research Engineer (Model Compression & Quantization) in India.
This role sits at the forefront of efficient AI systems research, focusing on making large-scale multimodal models practical for real-world deployment. You will work on advancing state-of-the-art techniques in model compression, enabling LLMs and vision-language models to run efficiently on resource-constrained devices such as mobile and edge hardware. The position combines deep research with hands-on engineering, requiring you to design and optimize pipelines that reduce memory usage, latency, and compute cost without sacrificing model performance. You will explore and implement techniques such as quantization, pruning, and knowledge distillation, contributing directly to scalable AI infrastructure. Operating in a highly research-driven and experimental environment, you will collaborate with AI engineers and researchers to push the boundaries of efficient multimodal intelligence. This is a high-impact role for someone passionate about both cutting-edge AI research and real-world deployment constraints.