The Plaid Machine Learning Infrastructure (ML Infra) team is responsible for creating and maintaining the foundational systems, tools, and processes that enable efficient, scalable, reliable, responsible, and secure machine learning workflows for Plaid ML practitioners. The team's core components and responsibilities include, but are not limited to, the following areas:
1. Feature Store Development: Developing a robust and efficient Feature Store platform customized for Plaid use cases to streamline feature engineering for both batch and real-time streaming features.
2. Model Serving and Deployment: Developing and maintaining infrastructure and tools for deploying models into production with observability and low latency.
3. Data Exploration and Experimentation: Using both vendor-based solutions, like SageMaker Notebook, and in-house/open-source solutions, like MLflow and Trino, to support early model experimentation, model version monitoring, and tracking.
4. ML Data Cost: Developing a creative framework to forecast, monitor, attribute, and track costs across the end-to-end ML development life cycle.
You will lead the ML Infra team in designing and developing the Feature Store platform for Plaid. You will support the team in setting the multi-year technical strategy and roadmap. You will maintain the current infrastructure for the ML developer environment, ML model training, ML feature creation and serving, ML model hosting and serving, ML model management, ML model service monitoring, vendor services (Tecton, OpenAI, SageMaker, etc.), and the CI/CD pipelines and infrastructure that deploy all assets.
You will also help pioneer the early foundation of the LLM AI platform for Plaid. You will work with MLEs and product engineers to support different ML-based product lines, and with other Data Platform engineers to continuously improve the overall Plaid data ecosystem.
- Passionate about ML infrastructure and how it can solve real-world problems, especially in the fintech world.
- Lead and contribute hands-on to building the Feature Store for all of Plaid.
- Shape the future of ML at Plaid.
- Technical leadership in engineering excellence and mentorship.
- 8+ years of software engineering experience.
- Extensive hands-on software engineering experience, with a strong track record of delivering successful projects within the ML Infrastructure or Platform domain at similar or larger companies.
- Deep understanding of high-quality ML Infrastructure systems, including Feature Stores, Training Infrastructure, Serving Infrastructure, and Model Monitoring.
- Experience developing a Feature Store system.
- Strong cross-functional collaboration, communication, and project management skills, with proven ability to coordinate effectively.
- Demonstrated leadership abilities, including experience mentoring and guiding junior engineers.
- [Nice to have] Experience with AWS SageMaker, Tecton, or LLM training/serving infrastructure and platforms.