Description
Overview
The Data Engineer participates in the design and build of modern data products comprising raw data stores (data lakes) and cleansed data repositories, populated by batch or streaming data pipelines. The Data Engineer works with a team to create a robust, sustainable and flexible design and leads the technical delivery using Agile delivery frameworks such as Scrum or Kanban.
Responsibilities
- Work closely with stakeholders across departments to architect, build and deploy data acquisition initiatives across multiple tenants
- Participate in all stages of data pipeline development - from early brainstorming to coding and bug fixing
- Design, develop, deploy and maintain data services and/or pipelines on AWS
- Develop best practices and approaches to support continuous process automation for data ingestion and data pipeline workflows
- Perform multiple tasks simultaneously under changing requirements and deadlines
- Prepare and present proofs of concept, solution evaluations and recommendations to stakeholders, including executives
Qualifications
Minimum Qualifications:
- 3+ years of relevant professional work experience
- Ability to design and develop SQL or Python data pipelines that power our data lake and data warehouse
- Ability to design and develop big data pipelines with both structured and unstructured data
- Comfortable with modern data transformation and orchestration tools such as dbt, AWS Glue, Apache NiFi and Airflow
- Ability to design and develop strategies to acquire data as a product
- Experience with test-driven code development practices
- Experience with GitLab code development practices
- Comfortable developing infrastructure as code with tools such as CloudFormation or Terraform
Preferred Qualifications:
- Advocate for adopting industry tools and practices at the right time.
- Appreciate the importance of schema design and can evolve an analytics schema on top of unstructured data.
- Excited to try out new technologies and produce proofs of concept that balance technical advancement and user experience.
- Work empathetically with stakeholders: listen to them, ask the right questions, and collaboratively come up with the best solutions for their needs.
- Champion data privacy and integrity, and always act in the best interest of consumers.
- Understanding of DevOps Research and Assessment (DORA) and the capabilities within the DORA capability catalog is encouraged.