
Infrastructure Reliability Engineer
Job Description
THE COMPANY:
STACK INFRASTRUCTURE (STACK) provides digital infrastructure to scale the world’s most innovative companies. We are an award-winning industry leader in building, owning, and operating highly efficient, cost-effective wholesale, colocation, and cloud data centers. Each of our national facilities meets or exceeds the highest industry standards in all operational categories of availability, security, connectivity, and physical resilience.
STACK offers the scale and geographic reach that rapidly growing hyperscale and enterprise companies need. The world runs on data. Data runs on STACK.
THE POSITION:
STACK is looking for an Infrastructure Reliability Engineer with subject matter expertise in electrical systems who will act as a key member of STACK’s Critical Operations team. This position will play a vital role in ensuring the ongoing performance, resiliency, and evolution of infrastructure systems across STACK’s portfolio. This role requires deep technical fluency in data center power and cooling systems, a forensic mindset for failure analysis, and a proactive approach to risk reduction. Responsibilities include but are not limited to:
Lead deep-dive investigations and RCAs for electrical infrastructure failures, including UPS systems, switchgear, breakers, relays, generators, grounding systems, STS behavior, VFD interactions, controls, and power quality disturbances
Evaluate electrical system performance under abnormal, fault, or degraded conditions (e.g., grounding faults, harmonics, transient events, protective coordination, voltage distortion, transfer events) to identify systemic vulnerabilities
Engage OEMs and vendors to challenge technical assumptions and advocate for long-term improvements
Support the evolution of maintenance standards and asset strategy for high-risk or complex systems (e.g., power distribution, cooling)
Collaborate with Workforce Development to enhance technical training for site teams based on lessons from event investigations
Contribute to availability reporting, event response improvement, and risk trend monitoring to ensure SLA commitments are met
Inform and influence the design review and turnover process by identifying gaps in infrastructure handoffs, system limitations, or commissioning practices
Develop system-level failure mode mitigation strategies that improve uptime performance and reduce repeat incidents
Partner with Operations, Engineering, and Construction to review electrical design assumptions, protective schemes, equipment compatibility, and commissioning practices to identify long-term reliability risks prior to or following operational events
THE DETAILS:
Location: Manassas or Sterling, VA; Portland, OR; Chicago (CHI); or Dallas-Fort Worth (DFW)
Benefits: Healthcare, Dental Care, Vision Insurance, Life Insurance, Paid Time Off, Paid Leave Programs
Travel: 25% domestically
Must be eligible to work in the United States
Must pass a comprehensive background screening
MUST-HAVE QUALIFICATIONS:
5–8 years of experience in critical infrastructure environments (e.g., data centers, substations, power generation, or utility systems)
Strong technical fluency in mission-critical electrical systems, including power distribution architecture, UPS systems, generators, grounding methodologies, protective relays, switchgear, controls integration, and power quality analysis
Experience analyzing electrical failures through waveform data, event logs, relay coordination, commissioning findings, or forensic troubleshooting
Working knowledge of electrical system design intent versus operational field realities, including maintainability, equipment compatibility, and fault response
Hands-on experience with root cause analysis and reliability methodologies (e.g., FMEA, RCM)
Demonstrated ability to work across disciplines (Ops, Eng, Vendors, Construction) to resolve complex technical issues
Expertise with commissioning (Cx) and infrastructure design review processes
Ability to analyze performance data and translate findings into practical improvements
Bachelor's degree in Engineering or equivalent experience with high technical competency
THIS MIGHT BE RIGHT FOR YOU IF:
You are a strong communicator. You are persuasive and clear, blending analytics with experience in decision-making.
You do not get flustered easily. You can juggle multiple priorities while balancing urgent requests with shifting timelines and deliverables.
You are a team builder. You take the time to understand and develop the strengths of your resources while formulating long-term plans for the growth and success of the team.
You are naturally curious and driven toward continual improvement. While you celebrate your successes, you take time to review and analyze results for future learning.
WHY STACK?
We offer a competitive compensation package with strong benefits, including medical, dental, and vision insurance, a 401(k) program, flexible spending accounts – even a cell phone subsidy.
We foster a culture of appreciation, including peer-to-peer recognition and rewards programs.
Fun is part of our DNA, with events, game nights, happy hours, and barbecues.
We’re growing – this is a great time to join and make an impact!
STACK is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity and expression, age, national origin, mental or physical disability, genetic information, veteran status, or any other status protected by federal, state, or local law.
Note to external agencies: We are not accepting any blind submissions or resumes/CVs from recruitment agencies. Any candidates sent to STACK Infrastructure, Inc. will not be accepted or considered as a submission without a signed agreement in place. Fees will not be paid in the event a candidate submitted by a recruiter without an agreement in place is hired; such resumes will be deemed the sole property of STACK Infrastructure, Inc.