Description
Job Highlights
· Location: Remote, must be based in the United States
· Salary Range: $115,000-$165,000, plus benefits
· Position Type: Grant funded, limited-term opportunity
· Position End Date: June 30, 2025
- The Data Engineer will play a crucial role in advancing the CDC Foundation's mission by designing, building, and maintaining data infrastructure for a public health organization. Working within the Alaska Division of Public Health, Section of Public Health Nursing, the Data Engineer will be responsible for transforming and loading data into the new Electronic Health Record (EHR) system. This position goes beyond mere database merging; it involves creating a unified, patient-centered database that will replace the current location-based system and serve as the foundational EHR. The Data Engineer will handle complex ETL tasks such as data mapping to establish relationships between disparate data sets, data transformation to standardize and convert data formats to the new SQL database structure, data cleansing to ensure data integrity, and data preparation so that data can be imported efficiently with minimal downtime and service disruption. This role requires collaboration with data content experts, analysts, end-users, IT staff, and other organizational personnel to design and implement solutions that meet the needs of the Division.
- The Data Engineer will be hired by the CDC Foundation and assigned to the Alaska Division of Public Health, Section of Public Health Nursing. This position is eligible for a fully remote work arrangement for U.S.-based candidates.
- Establish relationships between disparate databases to ensure accurate and effective data consolidation.
- Create and manage the systems and pipelines that enable efficient and reliable flow of data, including ingestion, processing, and storage.
- Collect data from various sources, transforming and cleaning it to ensure accuracy and consistency. Load data into storage systems or data warehouses.
- Optimize data pipelines, infrastructure, and workflows for performance and scalability.
- Monitor data pipelines and systems for performance issues, errors, and anomalies, and implement solutions to address them.
- Implement security measures to protect sensitive information, ensuring data handling practices comply with relevant regulations and standards, particularly HIPAA.
- Collaborate with data scientists, analysts, and other partners to understand their data needs and requirements, and to ensure that the data infrastructure supports the organization's goals and objectives.
- Collaborate with cross-functional teams to understand data requirements and design scalable solutions that meet business needs.
- Implement and maintain ETL processes to ensure the accuracy, completeness, and consistency of data.
- Design and manage data storage systems, including relational databases, NoSQL databases, and data warehouses.
- Create system architecture diagrams, documentation, and guidelines to communicate design decisions, compliance measures, and best practices.
- Stay knowledgeable about industry trends, best practices, and emerging technologies in data engineering, and incorporate relevant trends into the organization's data infrastructure.
- Provide technical guidance to other staff.
- Communicate effectively with partners at all levels of the organization to gather requirements, provide updates, and present findings.
- Bachelor's degree in computer science, information technology, data science, or a related field.
- Strong experience in SQL, Python, C#, Java, data warehousing, and building scalable ETL pipelines. Candidates should be able to implement data automations within existing frameworks as opposed to writing one-off scripts.
- Strong understanding of database systems, including relational databases (e.g., SQL Server and Oracle).
- Experience regarding engineering best practices such as source control, automated testing, continuous integration and deployment, and peer review.
- Familiarity with Data Transformation Services (DTS) for moving data to SQL Server systems.
- Knowledge of data deduplication techniques to ensure data quality and reduce redundancy.
- Experience with data modeling and moving from a location-based database to a person-centric database model, preferably in the healthcare field.
- Preferred skills include experience with MUMPS DB/VA FileMan data extraction and an understanding of its unique features, such as its hierarchical database structure and built-in string manipulation functions.
- Knowledge of data warehousing concepts and tools.
- Experience with cloud computing platforms.
- Expertise in data modeling, ETL (Extract, Transform, Load) processes, and data integration techniques.
- Familiarity with agile development methodologies, software design patterns, and best practices.
- Understanding of data security and compliance standards, especially HIPAA.
- Strong analytical thinking and problem-solving abilities.
- Excellent verbal and written communication skills, including the ability to convey technical concepts to non-technical partners effectively.
- Flexibility to adapt to evolving project requirements and priorities.
- Outstanding interpersonal and teamwork skills, and the ability to develop productive working relationships with colleagues and partners.
- Experience working in a virtual environment with remote partners and teams.
- Proficiency in Microsoft Office.
- This role is involved in a dynamic public health program. As such, roles and responsibilities are subject to change as situations evolve. Roles and responsibilities listed above may be expanded upon or updated to match priorities and needs, once written approval is received by the CDC Foundation in order to best support the public health programming.