Description
Mission of the Role:
Architect and lead the delivery of high-quality and reliable solutions through creative problem-solving and technical expertise to address our business problems on a frequent and regular cadence. Enable engineers on your team to improve the quality and impact of their work and delivery. Evangelize reliability-as-a-feature through monitoring, service-level objectives, automation, everything-as-code, and testing.
- Provide technical leadership and guidance to the SRE team, driving best practices in reliability engineering, automation, and service management.
- Set the direction for SRE projects, aligning them with organizational goals, and ensuring successful execution from concept to delivery.
- Help define and instrument Service-Level Objectives to ensure an excellent customer experience.
- Lead initiatives to improve system resilience and scalability.
- Host postmortems to share learnings, discover gaps, embrace transparency, and improve reliability across our services.
- Lead projects from inception to completion.
- Participate in an on-call rotation to help resolve incidents.
- 7+ years of experience building infrastructure solutions in AWS using Infrastructure-as-Code technologies such as Terraform or CloudFormation.
- 7+ years of experience working with Docker containers and related orchestration technologies (such as Kubernetes or ECS).
- 7+ years of experience building and deploying CI/CD pipelines.
- Experience with AWS, Docker, Kubernetes, Terraform, Python, PHP, and Laravel.
- Experience with architectural patterns of large, high-scale applications, such as well-designed APIs and database schemas.
- Experience leading projects and initiatives that are wide in scale and complex in nature.
- Experience working collaboratively in cross-functional teams with engineers in product and data groups.
- Deep technical expertise: writes, debugs, and refactors code while being mindful of tradeoffs, scalability, architecture, and code cleanliness.
- Demonstrates mastery of their craft to solve problems in automation, infrastructure, and/or developer tooling.
- Reliability & Quality: experience leveraging observability tooling and practices such as SLOs to help engineering teams own the reliability and quality of the software they build.
- Leadership: define and deliver large, complex projects that may include coordination with non-technical stakeholders. Help define the SRE function and be a champion for it throughout the organization.
- Competitive salary and equity packages
- Company Performance Incentive Plan
- Comprehensive benefits: medical, dental, and vision insurance for employees; flexible spending account; 401(k); mental health & wellness programs
- $75 WFH stipend (remote employees)
- Home office setup stipend (remote employees)
- Minimum Time Off policy (unlimited PTO, with at least 3 weeks off) for exempt employees
- 11 company-observed holidays
- Additional holidays: Curology days off (1 per quarter), 1 annual floating holiday (employee’s choice), and Gratitude Week (employees take the full week of Thanksgiving off; business-critical teams observe different days)
- Paid parental leave
- Employee donation matching program
- Company-sponsored events
- Free subscription to Curology or Agency
$133,000 - $205,000 a year