Job Description
Site Reliability Engineer (SRE)
On-site Requirement: Minimum two days per week in Chantilly, VA
Employment Type: Full-time, Salaried
Overview
We're looking for a Site Reliability Engineer (SRE) to help maintain and improve the reliability, performance, and security of our software products and platforms. This role combines operational excellence with engineering rigor, with a strong emphasis on automation and Infrastructure-as-Code. You will be a core part of a team that ensures uptime and system health across our services on both ILS and client networks.
Not your usual government contracting role
❌ This is not...
- a billable role
- "butt in seat" contracting
- a fill-the-timecard exercise (also: no timecards!)
- a random checklist of oddly specific things you must know to get an "upgrade" on contract
✅ This is...
- a full-time staff role at ILS, providing our clients with awesome commercial capabilities in their private environments
- a high-impact role where success enables missions and unlocks commercial incentives
- an opportunity to build using transferable, state-of-the-industry engineering patterns and skills
Benefits and Perks
- Company 401(k) contribution: 10% of salary. Immediate vest, no strings.
- $1,800 in your HSA each year.
- Fully paid health, dental, and vision insurance premiums for you and your family.
- Generous time off, 10 company holidays, and additional company down days announced each year.
- Cash incentive plans based on role-related metrics and project objectives.
Work Schedule & Flexibility
- SRE Shifts: We provide 24/7 production services to our clients. SREs rotate through duty shifts (3 per week on average) during business hours at client sites. After-hours coverage is shared across the whole team and is handled on call (not on-site).
- Flexible Work: When not on SRE duty, you'll work remotely with a flexible schedule. Team meetings, whether virtual or in-person, are held during standard business hours and scheduled in advance.
Key Responsibilities
- Maintain uptime, performance, and security of ILS products across internal and client environments
- Leverage automation and Infrastructure-as-Code to manage and scale infrastructure
- Respond to and resolve support requests during assigned shifts, escalating as needed to meet SLAs
- Contribute to:
  - Development of SRE Playbooks to automate avoidance, detection, or remediation of operational issues
  - Code review for quality and security purposes
  - Accreditation and compliance initiatives
  - Platform capability enhancements and other technical development tasks
Key Skills
Our technical stack relies on tools like Kubernetes, Terraform, and AWS, with custom code generally written in Go or Python.
Engineers with a solid foundation in automation, DevOps, distributed systems, and cloud environments will be well positioned.
Security Requirements
This role requires an active US Government security clearance. Candidates must comply with all related obligations and procedural requirements.