Ops / SRE Support Engineer – VLabs - Manager
Job Description
Company Description
Vialto Partners is a market leader in Global Mobility Services. Our purpose is to “Connect the World.” We are the only stand-alone global mobility business, which presents a rare opportunity for our clients, stakeholders and colleagues.
Our teams help companies streamline and effectively manage their global mobility programs in a cost-efficient and compliant manner. Our services focus on providing cross-border compliance and risk assessment for tax, immigration, business travel, rewards and compensation, and remote work.
Working at Vialto Partners is about getting the chance to be part of a global and dynamic team. Vialto Partners has over 8,000 staff worldwide and continues to grow. You will work with clients from a range of industries and geographical locations. We believe in connecting the world and in supporting our colleagues to do the same in their careers by undertaking assignments and opportunities globally that broaden their skills and ultimately benefit our clients.
About Vialto Labs (VLabs)
Vialto Labs (VLabs) is responsible for redesigning how work is delivered in the tax and immigration service lines, as well as driving operational efficiency across Vialto’s functional areas using AI. The team builds and deploys novel AI-enabled solutions that directly improve productivity and increase delivery quality for our clients. VLabs is accountable for rapidly turning innovative experiments into production-ready deliverables at scale and embedding them into day-to-day operations. This team focuses on the highest-impact workflows, creating standardized, repeatable capabilities that can be deployed globally. Operating with a mandate for speed and measurable outcomes, VLabs works alongside service line, product, and platform leaders.
About the Role
We are seeking an experienced AI Architect to lead the design, development, and governance of enterprise-grade AI and AIOps solutions across cloud-native environments. This role will define the architecture for Generative AI, agentic systems, and intelligent operations platforms on Microsoft Azure, ensuring scalability, reliability, security, and responsible AI adoption.
The ideal candidate combines deep expertise in GenAI, Agentic Harness, Agentic Engineering, AI/ML systems, distributed architectures, observability, and cloud engineering with hands-on experience building production-grade GenAI and AIOps solutions. You will play a critical role in shaping AI strategy, establishing architectural standards, and enabling teams to deliver high-impact enterprise AI capabilities. You will think at the system and platform level, not just the application level, and translate complex AI capabilities into scalable enterprise solutions while balancing innovation with reliability, security, and governance.
Key Responsibilities
AI & Solution Architecture
Define end-to-end architecture for Generative AI, agentic systems, and AIOps platforms
Design scalable, secure, and resilient AI systems across cloud and hybrid environments
Establish reference architectures, design patterns, and best practices for AI systems
Agentic & GenAI Systems
Architect multi-agent systems, orchestration workflows, and tool integration frameworks
Define patterns for RAG, retrieval pipelines, and knowledge integration
Design evaluation frameworks, guardrails, and Responsible AI controls
Guide implementation of agent frameworks and workflow orchestration
Design patterns for implementation with Agentic Harness
AIOps & Observability
Architect intelligent operations platforms leveraging logs, metrics, traces, and events
Define strategies for anomaly detection, alert correlation, and automated remediation
Establish observability standards using OpenTelemetry (OTel) and modern monitoring tools such as Tempo, Loki, Prometheus, and Grafana
Cloud & Platform Engineering
Lead architecture for AI solutions on Microsoft Azure
Design cloud-native systems using containers (Docker, Kubernetes)
Define data architecture across PostgreSQL, Cosmos DB, SQL Server, and telemetry pipelines
Engineering & Delivery Enablement
Guide teams on backend architecture using Python and FastAPI
Define CI/CD, DevOps, and automation strategies using Azure DevOps (ADO) and GitHub
Establish standards for testing, evaluation, and performance optimization of AI systems
Security, Governance & Responsible AI
Define enterprise standards for AI security, privacy, and compliance
Implement Responsible AI practices including:
Prompt safety
Data protection
Access control
Model governance
Ensure auditability and reliability of AI-driven systems
Operational Excellence
Drive improvements in MTTR, MTTD, reliability, and operational efficiency
Define SLIs, SLOs, and error budgets for AI-powered services
Architect automation for incident response and root-cause analysis
Leadership & Collaboration
Partner with SRE, platform, data, and application teams
Mentor engineers and guide architectural decision-making
Influence AI strategy and roadmap across the organization
Apply strong communication and stakeholder management skills
Required Qualifications / Background
10+ years of experience in software engineering, cloud architecture, or platform engineering
3+ years designing and delivering AI/ML or GenAI systems in production
Strong expertise in Python and backend system design
Experience architecting solutions on Microsoft Azure
Deep understanding of:
Distributed systems
Cloud-native architecture
Observability and SRE principles
Experience with Docker, Kubernetes, and microservices architecture
Strong experience with APIs, service design, and system integration
Experience designing agentic AI systems and orchestration frameworks
Experience with evaluation frameworks and AI system quality measurement
Strong understanding of AIOps, telemetry pipelines, and operational intelligence
Experience with logs, metrics, traces, and event correlation
Expertise in secure AI system design and governance
Experience with CI/CD, DevOps, and automation practices
Preferred Qualifications / Background
Experience with:
Azure AI services, Azure Monitor, and observability platforms
OpenTelemetry instrumentation, Tempo, and Grafana
AI evaluations (evals) with Arize Phoenix
RAG, vector databases, and retrieval systems
Multi-agent systems and advanced orchestration
Familiarity with:
MCP, tool calling, plugins, and reusable agent capabilities
OCR, document intelligence, and unstructured data processing
Experience with KQL, SQL, or analytics query languages
Exposure to GenAI-assisted development workflows
Additional Information:
Location: Bangalore
We are an equal opportunity employer that does not discriminate on the basis of any legally protected status.
Please note that AI is used as part of the application process.