Job Description
About Enterpret
At Enterpret, we are building the customer intelligence platform that helps companies like Notion, Canva, Perplexity, and Wispr Flow deeply understand their customers and make better product decisions.
We process and analyze massive amounts of unstructured customer feedback across support tickets, surveys, conversations, reviews, and product signals. Behind this experience is a cloud platform that needs to be reliable, secure, scalable, and developer-friendly.
We are looking for a Site Reliability Engineer who is excited not only about infrastructure and reliability, but also about Developer Experience (DevEx), platform engineering, and the future of AI-enabled software development.
This role sits at the intersection of cloud infrastructure, platform tooling, developer productivity, security, cost optimization, and AI-assisted engineering workflows. You will help build the internal platform that enables engineers to move faster while maintaining reliability, governance, and operational excellence.
What You'll Do
- Manage and improve AWS-based infrastructure and services across our platform.
- Operate and improve containerized workloads running on ECS and EKS.
- Build and maintain CI/CD workflows using GitHub Actions and modern deployment practices.
- Improve cloud security, governance, access controls, and infrastructure compliance practices.
- Design and implement cost-efficient cloud architecture and optimization strategies.
- Build internal tooling, self-service workflows, and platform capabilities that improve developer productivity.
- Improve developer experience through better automation, observability, deployment workflows, and operational tooling.
- Partner closely with engineering teams to reduce friction in how software is developed, tested, deployed, and operated.
- Evaluate and adopt AI-powered engineering tools that improve coding, debugging, documentation, operational workflows, and engineering efficiency.
- Automate repetitive operational tasks using infrastructure tooling, scripting, and AI-assisted workflows.
- Monitor infrastructure health, troubleshoot production issues, and improve overall system reliability.
- Participate in incident response, root cause analysis, and reliability improvements.
- Help standardize infrastructure and platform practices across engineering teams.
- Explore emerging technologies across cloud infrastructure, platform engineering, DevEx, and AI-enabled engineering workflows.
What We're Looking For
- 3-4 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or Infrastructure Engineering.
- Strong hands-on experience with AWS services.
- Experience working with ECS and/or EKS.
- Familiarity with Docker and containerized application deployments.
- Experience building CI/CD pipelines using GitHub Actions or similar tooling.
- Experience with Infrastructure as Code tools such as Terraform or CloudFormation.
- Good understanding of networking fundamentals, IAM, security groups, load balancers, and cloud security best practices.
- Comfortable with scripting and automation using Python, Go, or similar languages.
- Strong debugging, ownership, and problem-solving skills.
- Strong collaboration and communication skills.
Developer Experience & AI Platform Mindset
We are especially interested in engineers who think beyond infrastructure management and care deeply about how engineering teams operate.
You should ideally have:
- Interest in Developer Experience (DevEx) and platform engineering.
- Experience improving engineering productivity through tooling, automation, or internal platform capabilities.
- Comfort using AI and LLM-powered tools for coding, debugging, documentation, automation, and operational workflows.
- Curiosity about how AI-assisted engineering workflows can improve developer velocity, reliability, and operational efficiency.
- A mindset of continuously reducing manual work and improving engineering systems through automation.
Nice To Have
- Exposure to ClickHouse OSS or ClickHouse Cloud.
- Experience with observability platforms such as CloudWatch, Grafana, Prometheus, or similar tools.
- Familiarity with AWS cost optimization practices.
- Experience building internal developer platforms, self-service infrastructure tooling, or engineering enablement systems.
- Exposure to AI-powered developer tools and operational automation workflows.
Who You Are
You enjoy solving infrastructure problems, automating repetitive work, and building systems that make engineers more effective.
You think beyond traditional DevOps and are excited about Developer Experience, platform engineering, and the next generation of AI-enabled software development. You are curious about how AI can improve engineering workflows, incident response, operational efficiency, and software delivery.
You are hands-on, pragmatic, and thrive in fast-moving environments where ownership, learning, and execution matter.
If building reliable systems and enabling great engineers to move faster excites you, we'd love to talk.
Why Enterpret?
- Real problems to solve: The team needs you. This isn't a maintenance role; it's a turnaround.
- AI-native product: Work on a product with LLMs at its core (OpenAI, Anthropic), serving customers who care deeply about what we build.
- Modern stack: Go for performance-critical services, ClickHouse for analytics, AWS (Fargate, DynamoDB, RDS, S3), and coding agents to automate engineering workflows.
- Meaningful ownership: Significant equity in a venture-backed company with real traction.