
Senior DevOps Platform Engineer
Job Description
Role Summary
Lead the design, automation, and operation of infrastructure powering clinical and medical-imaging workloads across hybrid cloud and on-premises customer-site deployments. You will contribute to multi-account AWS (Amazon Web Services) architecture, Kubernetes platforms, CI/CD (Continuous Integration / Continuous Delivery) systems, and end-to-end product deployment automation. This is a senior individual-contributor role that works alongside other senior platform engineers on DevOps and CI/CD activities.
Key Responsibilities
Cloud Infrastructure & Architecture
• Architect and operate multi-account AWS environments (Organizations, Control Tower, Service Control Policies (SCPs)) with strict workload and data isolation between sensitive and non-sensitive data tiers.
• Lead Terraform module design, state management, and drift remediation across production, staging, and validation environments — covering both AWS provisioning and, where applicable, on-premises infrastructure automation.
Kubernetes & Platform Engineering
• Operate production EKS (Elastic Kubernetes Service) clusters: zero-downtime upgrades, autoscaling, CNI (Container Network Interface), and ingress.
• Build and evolve the internal developer platform (IDP): paved-road templates, self-service namespaces, golden Helm charts, and GitOps with ArgoCD / Flux.
• Design tenancy, network policies, and Pod Security Standards appropriate for regulated workloads.
On-Premises & Customer-Site Deployment
• Design and automate K3s (lightweight Kubernetes) deployments on customer hardware, including NFS (Network File System) backed storage, cross-subnet networking, and restricted or air-gapped environments.
• Build repeatable, scripted site installation and upgrade workflows for clinical customer sites, using Terraform for infrastructure and configuration management tooling (Ansible / PowerShell) for system state.
• Deploy and operate identity and API gateway components (Keycloak, Kong) within customer environments.
• Partner with field engineering and customer support on site cutover, validation, and rollback procedures.
Product Deployment Automation
• Containerized web-based clinical viewer applications: automate Helm chart packaging, versioning, customer-tenant rollouts, and DICOM (Digital Imaging and Communications in Medicine) / PACS (Picture Archiving and Communication System) connectivity validation as part of the release pipeline.
• Windows-based clinical / advanced-visualization applications: automate installer pipelines using MSI (Microsoft Installer) / WiX (Windows Installer XML) / Inno Setup; configuration with PowerShell DSC (Desired State Configuration) or Ansible for Windows; IIS (Internet Information Services) setup; Windows Server golden-image management; and Active Directory (AD) / Group Policy (GPO) integration for hospital domains where required.
• Build repeatable, auditable deployment runs with rollback paths suitable for clinical sites.
CI/CD, GitOps & Release Engineering
• Design pipelines (GitHub Actions / GitLab CI / Jenkins) with mandatory security gates: SAST (Static Application Security Testing), SCA (Software Composition Analysis), container scanning, and IaC (Infrastructure as Code) scanning.
• Implement progressive delivery: canary, blue/green, feature flags; automated rollback on SLO (Service Level Objective) breach.
Work Experience
Required Qualifications
• 5+ years operating production cloud infrastructure.
• Solid AWS expertise across compute, networking, IAM (Identity and Access Management), KMS (Key Management Service), VPC (Virtual Private Cloud) peering / Transit Gateway, and PrivateLink.
• Strong Terraform skills (module authoring, state management, environment promotion).
• Production Kubernetes operations (EKS or equivalent): upgrades, networking, RBAC (Role-Based Access Control), and admission control.
• Experience deploying or operating workloads in customer-site or on-premises environments.
• Strong scripting and automation in Python or Go; comfortable in Bash and Git.
• Experience leading or participating in incident response for production systems with patient or regulatory impact.
Preferred Qualifications
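• Familiarity with the HIPAA (Health Insurance Portability and Accountability Act) Security Rule, BAAs (Business Associate Agreements), and PHI (Protected Health Information) handling. Helpful but not required; we will train.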
• Experience with SOC 2 (System and Organization Controls 2) or similar control environments.
• Windows deployment automation: PowerShell / Ansible for Windows, MSI / WiX packaging, IIS configuration, Active Directory / Group Policy.
• K3s / lightweight Kubernetes for customer-site or edge deployments; NFS storage operations at scale.
• Multi-region active/active or pilot-light DR (disaster recovery) for clinical systems.
• Service mesh (Istio / Linkerd), or IDP tooling such as Crossplane or Backstage.
• Cluster autoscaling experience (Karpenter / Cluster Autoscaler).
• Certifications: CKA (Certified Kubernetes Administrator) / CKS (Certified Kubernetes Security Specialist), AWS Solutions Architect Professional, AWS Security Specialty, HCISPP (HealthCare Information Security and Privacy Practitioner), CISSP (Certified Information Systems Security Professional).