Job Description
- Troubleshoot and resolve highly technical issues across the Google Cloud AI/ML portfolio, focusing on customer-reported deployment failures, model performance degradation, and infrastructure-related problems.
- Work directly with customers on their ML deployments (including Generative AI models) to ensure production readiness and high availability.
- Use coding and scripting skills (primarily Python) to read, debug, and reproduce customer issues within their ML models (TensorFlow, PyTorch) or deployment environments (Kubernetes, Compute Engine).
- Manage customer problems through effective diagnosis, clear documentation, and the development and implementation of new investigation tools to increase diagnostic speed.
- Develop an in-depth understanding of Google Cloud's AI/ML solutions and share this knowledge to upskill the wider global support organization.
- Participate in an on-call rotation, which may include working non-standard hours, nights, or weekends as part of our global 24/7 support model.
