About the Role
As a Platform Engineer at Lucidworks, you will be part of the team responsible for leading our efforts to ensure the reliability, availability, security, and performance of our critical systems and services. You will collaborate with cross-functional teams to design, build, and maintain robust, scalable, and efficient infrastructure, and to empower our engineers by building tools that enable them to take full ownership of their deliverables. Your expertise will be instrumental in driving our commitment to excellence in operations and customer experience.
Responsibilities
- Collaborate with multiple product development teams to define SLOs and monitor platform reliability and performance against these objectives.
- Design, implement, and maintain monitoring and alerting solutions to detect and respond to incidents promptly.
- Conduct post-incident reviews (PIRs) to identify root causes and recommend preventative measures.
- Work with multiple teams to plan for capacity needs, including forecasting resource requirements and scaling infrastructure as necessary.
- Identify areas for improvement and drive initiatives to enhance the reliability and efficiency of our SaaS platform.
- Act as a bridge between multiple teams, sharing knowledge and best practices as an advocate and enabler of site reliability engineering throughout our organization.
- Continually strive to optimize our engineers’ efficiency by developing tools and processes that make things scalable and repeatable.
- Strive for complete automation, both in your own work and by leading and influencing developers to build scalable production solutions.
- Perform additional duties as assigned.
Technologies You Will Work With
- Infrastructure as code (Terraform)
- Scalable global cloud infrastructure (GCP, Kubernetes, Istio)
- Build and deployment tools (GitHub, Jenkins, ArgoCD)
- Automation (Go)
- Service catalog (Backstage)
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent real-world experience
- 5+ years of professional experience as a Platform Engineer or in a similar role in a production environment
- Knowledge of infrastructure provisioning tools like Terraform and a commitment to the infrastructure-as-code (IaC) approach
- Strong problem-solving skills and the ability to think critically in high-pressure situations
- An enthusiastic, go-for-it attitude; when you see something broken, you can’t help but fix it
- A drive to deliver quickly and effectively and to iterate rapidly
- Resourceful: willing to jump in, be agile/flexible, leverage existing resources to accomplish goals, and able to work independently
- Team player: confident in collaborating with a diverse community of people and personalities across geographies, backgrounds, and professional abilities
- Strong verbal and written communication skills
- Empathy and care for all stakeholders of Lucidworks, including employees, customers, partners, and guests
- Ability to handle confidential information