Job Description
About Singtel Digital InfraCo – RE:AI
Singtel Digital InfraCo’s RE:AI division is building Asia’s most advanced and sustainable AI infrastructure ecosystem. RE:AI enables enterprises, research institutions, and digital-native businesses to accelerate innovation through responsible, high-performance AI compute and connectivity solutions.
In this role, you will work on Singtel’s state‑of‑the‑art GPU‑as‑a‑Service (GPUaaS) data centre networking infrastructure and support customers running AI and high‑performance computing (HPC) workloads on large‑scale distributed GPU clusters. The role offers opportunities to build and deepen expertise in AI data‑centre networking and HPC cloud platforms within a highly collaborative and performance‑driven environment.
Make an Impact By
- Provide operational support for GPU infrastructure services to meet customer expectations within defined SLAs, including preventive measures to minimise downtime.
- Provide technical network support and guidance to users of GPU‑accelerated systems.
- Monitor, troubleshoot, and enhance network performance in GPUaaS platforms, including introducing automation to detect and resolve network issues.
- Manage relationships and expectations with internal and external stakeholders.
- Design, develop, and operate high‑performance networking solutions for AI and GPU‑as‑a‑Service (GPUaaS) platforms.
- Collaborate on network security initiatives and implement cybersecurity best practices aligned with industry standards and certifications.
- Coordinate with cross‑functional teams to deliver network solutions efficiently and within agreed timelines.
- Prepare and present technical reports, including network optimisation plans, outage reports, and SLA reports for management and customers.
- Contribute to the design of tailored high‑performance network solutions for AI and GPU cloud computing environments.
- Conduct high‑performance network benchmarking and stay informed on advancements in networking and GPU technologies.
- Participate in scheduled or on‑call support outside standard working hours, including nights, weekends, or public holidays, as required.
Skills for Success
- Diploma/Degree or equivalent qualification in Computer Science, Information Technology, Network Engineering, or a related discipline; relevant networking and Linux certifications are advantageous.
- Strong understanding of networking fundamentals including TCP/IP, VLANs, and subnetting.
- Experience deploying and troubleshooting advanced routing protocols and network architectures such as VXLAN, EVPN, and BGP.
- Experience with network operating systems and platforms such as Cumulus, Arista, Ubuntu, Proxmox, and Red Hat.
- Hands‑on experience with network monitoring, automation, and configuration tools (e.g. Zabbix, NetBox, Ansible, Python scripting, CI/CD pipelines).
- Customer‑focused mindset with strong collaboration and stakeholder management skills.
- Effective verbal, written, and presentation skills in English.
- Strong analytical and problem‑solving skills, and the ability to work independently while contributing to team objectives.
Desirable Qualifications
- Experience with Linux systems, hypervisors, storage technologies (e.g. NFS, object storage), and infrastructure‑as‑code practices.
- Understanding of how AI and HPC workloads interact with high‑performance networking infrastructure.
- System‑level experience with GPU‑accelerated networking environments.
- Understanding of cloud architectures (IaaS, PaaS), GPU architecture, and NVIDIA GPU platforms.
Your Career Growth Starts Here. Apply Now!