
Data Engineer
The Company works in the Alzheimer’s space. We are building AI models that seek to uncover clues for early detection and to predict the course of the disease from patient signatures.
We are seeking individuals who can work with our AI Engineers to harmonize very large databases for use in AI models. The Data Engineer will work closely with the AI team to harmonize longitudinal study databases, ensuring that each database is free of errors and consistent with the other databases used in the AI models. For the summer, we will give you one or, at most, two databases to harmonize.
Responsibilities:
- Extract variables of interest from multiple databases.
- Harmonize variables across databases.
- Ensure correctness of data ranges based on data dictionaries.
- Process and encode databases for AI algorithms.
Requirements:
- Knowledge of data handling with Python and the libraries commonly used in data science.
- Fluency in pandas is required, along with knowledge of other libraries and/or techniques for exploratory analysis and data profiling.
- A statistical foundation sufficient to understand and report on the variables within a dataset.
- Experience working on projects with high-dimensional data and/or clinical data is a plus.
Qualifications:
- Pursuing a BA / BS / MA in any STEM field, Systems Engineering, Social Science, Psychology, or another field.
Soft Skills:
- Independence (the position is remote)
- Teamwork / Proactivity
- Attention to detail