Description

WHAT YOU'LL DO

Platform Infrastructure Engineers (PIEs) specializing in MongoDB are responsible for managing, maintaining, and evolving the infrastructure required to run MongoDB at scale on Kubernetes. PIEs ensure that MongoDB operates as a reliable, performant, and developer-friendly service, enabling seamless integration with applications and supporting Braze’s massive data operations.

They apply sound engineering principles, operational discipline, and advanced automation to build and maintain the core infrastructure for MongoDB, ensuring it supports Braze’s 3.3 billion monthly active users and the hundreds of billions of data points processed every month. PIEs collaborate with other engineering teams to make MongoDB scalable, self-serviceable, and aligned with Braze’s broader platform goals.

Responsibilities:

Design and Manage MongoDB Infrastructure

Build, optimize, and manage MongoDB clusters on Kubernetes, ensuring they meet scalability, availability, and performance requirements
Develop automation frameworks for provisioning, upgrading, scaling, and maintaining MongoDB clusters
Design architectures that support seamless MongoDB operations, including multi-region deployments, sharded clusters, and replica sets
Create and optimize storage configurations, resource allocation, and networking to maximize MongoDB performance

Ensure MongoDB Reliability & Performance

Implement high-availability strategies for MongoDB, including automated failovers, backups, and disaster recovery
Collaborate with database engineers, Platform Software Engineers, and product teams to define and achieve Service Level Objectives (SLOs) for MongoDB performance and reliability
Continuously monitor MongoDB systems using tools like Prometheus, Grafana, and database-specific observability solutions to identify and address performance bottlenecks proactively

Incident Response & Resilience

Be part of a PagerDuty rotation to respond to MongoDB-related incidents, minimizing downtime and impact on the business
Conduct root cause analyses for MongoDB failures and implement preventive measures to improve resilience
Develop and maintain playbooks for incident response and recovery, ensuring the team is equipped to handle any MongoDB-related challenges

Collaboration & Knowledge Sharing

Partner with other teams to integrate MongoDB as a self-service platform, reducing the need for manual intervention
Share expertise through documentation, training, and mentoring to empower the broader engineering organization
Contribute to a culture of operational excellence by creating robust standards and best practices for running MongoDB on Kubernetes

Innovate & Automate

Continuously evaluate emerging tools and technologies for managing MongoDB and Kubernetes, integrating them where appropriate
Develop and implement self-healing mechanisms to minimize operational overhead and improve MongoDB uptime
Automate manual processes, including scaling, maintenance, upgrades, and cluster health checks, to improve efficiency and reduce human error

WHO YOU ARE

5+ years managing database platforms in production environments, with 2+ years specifically focused on MongoDB
Hands-on expertise with Kubernetes, including deploying and managing stateful workloads
Proven experience automating database operations using Infrastructure as Code (IaC) tools such as Terraform, Ansible, or Pulumi
Deep understanding of MongoDB internals, including sharding, replica sets, and query optimization
Proficiency in Kubernetes concepts like StatefulSets, Operators, and Persistent Volume Claims
Strong knowledge of cloud services (AWS/GCP/Azure) and their storage integrations for MongoDB workloads
Familiarity with monitoring and logging tools tailored to databases (e.g., Prometheus, MongoDB Atlas monitoring)
Dedicated to building robust, scalable, and self-serviceable MongoDB systems that empower developers and reduce operational complexity
Committed to collaboration, documentation, and knowledge-sharing across remote, global teams
Proactive in seeking out ways to improve MongoDB performance, reliability, and automation
Focused on delivering value quickly to internal stakeholders while maintaining operational excellence

For candidates based in the United States, the pay range for this position at the start of employment is expected to be between $154,800 and $275,400/year, with expected On Target Earnings (OTE) between $172,000 and $306,000/year (including bonus or commission). Your exact offer may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. In addition to cash compensation, Braze offers full- and part-time employees a comprehensive Total Rewards package that includes equity grants of restricted stock (RSUs) so that all Braze employees own a piece of our company.
