Senior Site Reliability Engineer

São Paulo, Remote
Posted Today

Job Description

At Braze, we have found our people. We’re a genuinely approachable, exceptionally kind, and intensely passionate crew.

We seek to ignite that passion by setting high standards, championing teamwork, and creating work-life harmony as we collectively navigate rapid growth on a global scale while striving for greater equity and opportunity – inside and outside our organization.

To flourish here, you must be prepared to set a high bar for yourself and those around you. There is always a way to contribute: Acting with autonomy, having accountability and being open to new perspectives are essential to our continued success.

Our deep curiosity to learn and our eagerness to share diverse passions with others give us balance and inject a one-of-a-kind vibrancy into our culture.

If you are driven to solve exhilarating challenges and have a bias toward action in the face of change, you will be empowered to make a real impact here, with a sharp and passionate team at your back. If Braze sounds like a place where you can thrive, we can’t wait to meet you.

WHAT YOU'LL DO

Braze runs one of the largest MongoDB deployments in the world – powering real-time customer engagement for thousands of the world’s leading brands. We process hundreds of billions of data points each month across more than 3.3 billion monthly active users, with MongoDB at the core of how we store, query, and serve that data at scale.

As a Senior SRE on the MongoDB Platform team, your primary mission is to make MongoDB better for Braze – and to do so with the rigor, automation-first mindset, and engineering discipline of a world-class SRE. You won’t just keep the lights on; you’ll architect a more reliable, scalable, and observable MongoDB platform that the entire engineering organization depends on.

Main responsibilities:

Own MongoDB Reliability at Scale

  • Design and operate Braze’s MongoDB infrastructure to meet strict enterprise-grade SLAs, with deep ownership of availability, durability, and query performance
  • Build proactive monitoring and alerting that fires on symptoms – before customers feel impact – with rich MongoDB-specific observability (oplog lag, replication health, lock contention, index hit rates, etc.)
  • Lead capacity planning and sharding strategy as data volumes and query patterns evolve
  • Drive root-cause analysis on MongoDB incidents and translate findings into permanent system improvements

Improve the MongoDB Developer Experience

  • Partner with product engineering teams to review schema designs, index strategies, and aggregation pipelines – catching scalability anti-patterns before they reach production
  • Build self-service tooling, automation, and runbooks that let engineers interact with MongoDB safely and efficiently without needing to page the platform team
  • Define and enforce connection pool sizing, write-concern defaults, and read-preference standards across the fleet

Build and Automate Infrastructure

  • Manage MongoDB cluster lifecycle (provisioning, upgrades, failovers, decommissions) on Kubernetes using the MongoDB Enterprise Kubernetes Operator, with infrastructure defined as code via Terraform and Ansible
  • Develop and maintain automated backup, restore, and point-in-time recovery workflows – tested regularly against real workloads
  • Contribute to internal platform tooling in Ruby and/or Go that reduces operational toil across the SRE organization

Incident Response & On-Call

  • Participate in a PagerDuty on-call rotation with a clear charter: use every quiet shift to eliminate the next page
  • Lead incident retrospectives with a bias toward systemic fixes, automation, and documentation – not blame
  • Maintain and improve runbooks so that any engineer on the team can respond effectively to MongoDB incidents

WHO YOU ARE

Required:

  • 5+ years of experience as a Software Engineer, DevOps Engineer, or Site Reliability Engineer in a production environment
  • Hands-on MongoDB expertise: replica sets, sharding, index design, aggregation pipelines, explain plans, and performance tuning under real load
  • Strong Linux fundamentals and comfort operating at the OS level (disk I/O, memory, networking, process management)
  • Strong programming skills in one or more of: Python, Go, Ruby, or JavaScript – you write automation, not just scripts (JavaScript/Python experience is a plus for MongoDB shell scripting and aggregation pipeline work)
  • Experience with IaC tools: Terraform, Ansible, or equivalent
  • Experience with container orchestration: Docker and Kubernetes
  • A systems thinker who reasons about interfaces, failure modes, edge cases, and cascading effects across the stack
  • Bias toward documentation and asynchronous collaboration across global remote teams

Nice to Have:

  • Experience running MongoDB at multi-terabyte scale or in a sharded topology
  • Familiarity with MongoDB Atlas, Ops Manager, or Cloud Manager
  • Experience with complementary data technologies in Braze’s stack: Redis, Kafka, Postgres
  • Prior work on database platform engineering or database reliability engineering (DBRE) teams

WHAT WE OFFER

Braze benefits vary by location, and we encourage you to review the specific benefits offered in each country. More details on benefits plans will be provided if you receive an offer of employment.

From offering comprehensive benefits to fostering hybrid ways of working, we’ve got you covered so you can prioritize work-life harmony. Braze offers benefits such as:

  • Competitive compensation that may include equity
  • Retirement and Employee Stock Purchase Plans
  • Flexible paid time off
  • Comprehensive benefit plans covering medical, dental, vision, life, and disability
  • Family services that include fertility benefits and equal paid parental leave
  • Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
  • A curated in-office employee experience, designed to foster community, team connections, and innovation
  • Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching 
  • Employee Resource Groups that provide supportive communities within Braze
  • Collaborative, transparent, and fun culture recognized as a Great Place to Work®

ABOUT BRAZE

Braze is the leading customer engagement platform that empowers brands to Be Absolutely Engaging™. Braze helps brands deliver great customer experiences that drive value both for consumers and for their businesses. Built on a foundation of composable intelligence, BrazeAI™ allows marketers to combine and activate AI agents, models, and features at every touchpoint throughout the Braze Customer Engagement Platform for smarter, faster, and more meaningful customer engagement. From cross-channel messaging and journey orchestration to AI-powered decisioning and optimization, Braze enables companies to turn action into interaction through autonomous, 1:1 personalized experiences.

The company has been consistently recognized as a Leader in marketing technology by industry analysts, and was named a G2 “Best of Marketing and Digital Advertising Software Product” in 2026. Braze was also named a 2026 Best Places to Work by Built In, a 2025 America’s Greenest Companies by Newsweek, and a 2025 Fortune Best Workplace in Technology™ by Great Place To Work®. Braze is also proudly certified as a Great Place to Work® in the U.S., the UK, Australia, and Singapore. 

The company is headquartered in New York with offices in Austin, Berlin, Bucharest, Chicago, Dubai, Jakarta, London, Paris, San Francisco, São Paulo, Singapore, Seoul, Sydney and Tokyo.

BRAZE IS AN EQUAL OPPORTUNITY EMPLOYER

At Braze, we strive to create equitable growth and opportunities inside and outside the organization.

Building meaningful connections is at the heart of everything we do, and that includes our recruiting practices. We're committed to offering all candidates a fair, accessible, and inclusive experience – regardless of age, color, disability, gender identity, marital status, maternity, national origin, pregnancy, race, religion, sex, sexual orientation, or status as a protected veteran. When applying and interviewing with Braze, we want you to feel comfortable showcasing what makes you you.

We know that sometimes different circumstances can lead talented people to hesitate to apply for a role unless they meet 100% of the criteria. If this sounds familiar, we encourage you to apply, as we’d love to meet you.

Please see our Candidate Privacy Policy for more information on how Braze processes your personal information during the recruitment process and, if applicable based on your location, how you can exercise any privacy rights.