
Software Engineer - ModelZoo
Job Description
Experience: 2-3 years
We're looking for a skilled and motivated Machine Learning Software Engineer to join our team. The ideal candidate will have a solid foundation in deep learning and a strong interest in optimizing and deploying ML models on specialized hardware. The role involves implementing model optimizations, with a particular focus on quantization, to improve the performance of machine learning inference on target platforms.
Key Responsibilities
- Model Porting & Deployment: Port and deploy deep learning models from frameworks like PyTorch and TensorFlow to proprietary or commercial ML accelerator hardware platforms.
- Performance Optimization: Analyze and improve the performance of ML models for target hardware, focusing on latency and throughput.
- Quantization: Contribute to model quantization efforts (e.g., INT8) to reduce model size and accelerate inference while maintaining model accuracy.
- Profiling & Debugging: Use profiling tools to identify and fix performance bottlenecks in the ML inference pipeline on the accelerator.
Required Qualifications
Technical Skills:
- Proficiency in deep learning frameworks such as PyTorch and TensorFlow.
- Hands-on experience with deploying and optimizing models on GPUs or other specialized accelerators.
- Some experience with model quantization (e.g., post-training quantization).
- Strong proficiency in C++ and Python.
- Experience with GPU programming (e.g., CUDA) and libraries such as cuDNN is a plus.
- Familiarity with ML inference engines and runtimes (e.g., TensorRT, OpenVINO, TensorFlow Lite).
- Foundational understanding of computer architecture principles.
- Version Control: Proficient with Git and collaborative development workflows.
- Education: Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.
Preferred Qualifications
- Knowledge of hardware-aware model design.
- Familiarity with compiler technologies for deep learning.
- Experience with real-time or embedded systems.
- Knowledge of cloud platforms (AWS, GCP, Azure).
- Experience with CI/CD pipelines for ML models.