WHAT YOU'LL DO
Platform Infrastructure Engineers (PIEs) specializing in MongoDB are responsible for managing, maintaining, and evolving the infrastructure required to run MongoDB at scale on Kubernetes. PIEs ensure that MongoDB operates as a reliable, performant, and developer-friendly service, enabling seamless integration with applications and supporting Braze’s massive data operations.
They apply sound engineering principles, operational discipline, and advanced automation to build and maintain the core infrastructure for MongoDB, ensuring it supports Braze’s 3.3 billion monthly active users and the hundreds of billions of data points processed every month. PIEs collaborate with other engineering teams to make MongoDB scalable, self-serviceable, and aligned with Braze’s broader platform goals.
Responsibilities:
Design and Manage MongoDB Infrastructure
- Build, optimize, and manage MongoDB clusters on Kubernetes, ensuring they meet scalability, availability, and performance requirements
- Develop automation frameworks for provisioning, upgrading, scaling, and maintaining MongoDB clusters
- Design architectures that support seamless MongoDB operations, including multi-region deployments, sharded clusters, and replica sets
- Create and optimize storage configurations, resource allocation, and networking to maximize MongoDB performance
Ensure MongoDB Reliability & Performance
- Implement high-availability strategies for MongoDB, including automated failovers, backups, and disaster recovery
- Collaborate with database engineers, Platform Software Engineers, and product teams to define and achieve Service Level Objectives (SLOs) for MongoDB performance and reliability
- Continuously monitor MongoDB systems using tools like Prometheus, Grafana, and database-specific observability solutions to identify and address performance bottlenecks proactively
Incident Response & Resilience
- Be part of a PagerDuty rotation to respond to MongoDB-related incidents, minimizing downtime and impact on the business
- Conduct root cause analyses for MongoDB failures and implement preventive measures to improve resilience
- Develop and maintain playbooks for incident response and recovery, ensuring the team is equipped to handle any MongoDB-related challenges
Collaboration & Knowledge Sharing
- Partner with other teams to integrate MongoDB as a self-service platform, reducing the need for manual intervention
- Share expertise through documentation, training, and mentoring to empower the broader engineering organization
- Contribute to a culture of operational excellence by creating robust standards and best practices for running MongoDB on Kubernetes
Innovate & Automate
- Continuously evaluate emerging tools and technologies for managing MongoDB and Kubernetes, integrating them where appropriate
- Develop and implement self-healing mechanisms to minimize operational overhead and improve MongoDB uptime
- Automate manual processes, including scaling, maintenance, upgrades, and cluster health checks, to improve efficiency and reduce human error
WHO YOU ARE
- 5+ years managing database platforms in production environments, with 2+ years specifically focused on MongoDB
- Hands-on expertise with Kubernetes, including deploying and managing stateful workloads
- Proven experience automating database operations using Infrastructure as Code (IaC) tools such as Terraform, Ansible, or Pulumi
- Deep understanding of MongoDB internals, including sharding, replica sets, and query optimization
- Proficiency in Kubernetes concepts like StatefulSets, Operators, and Persistent Volume Claims
- Strong knowledge of cloud services (AWS/GCP/Azure) and their storage integrations for MongoDB workloads
- Familiarity with monitoring and logging tools tailored to databases (e.g., Prometheus, MongoDB Atlas monitoring)
- Dedicated to building robust, scalable, and self-serviceable MongoDB systems that empower developers and reduce operational complexity
- Committed to collaboration, documentation, and knowledge-sharing across remote, global teams
- Proactive in seeking out ways to improve MongoDB performance, reliability, and automation
- Focused on delivering value quickly to internal stakeholders while maintaining operational excellence
For candidates based in the United States, the pay range for this position at the start of employment is expected to be between $154,800 and $275,400/year with an expected On Target Earnings (OTE) between $172,000 and $306,000/year (including bonus or commission). Your exact offer may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. In addition to cash compensation, Braze offers full- and part-time employees a comprehensive Total Rewards package that includes equity grants of restricted stock (RSUs) so that all Braze employees own a piece of our company.