Job Description
Day-to-day tasks may include, but are not limited to:
- Assist the team in creating the data pipelines needed to store and process data from emergent sensors and platforms. This includes creating database tables and schemas, establishing relevant data health alerts, and setting up data pipelines to support extract, transform, and load (ETL) of all data related to the emergent sensors.
- Develop and document workflows to enable API-based movement of data related to the sensors between data platforms as directed by the Government.
- Ensure all sensor data adheres to the Government's directed data schemas, formats, and governance requirements.
- Assess potential differences in metadata, data format, and data structure characteristics with regard to changes to databases, schemas, APIs, and other ETL-related processes for the ingestion and movement of data when integrating into the existing data operations pipeline.
- Determine how to pre-process and standardize the data to match existing data standards or to be transformed into a usable state for labeling and model testing purposes. This may involve converting between file formats, tiling full-size images into specific sizes or geospatial bounds (chipping), orthorectification, or other types of data standardization, transformation, or manipulation.
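
For candidates unfamiliar with the term, the chipping task in the last bullet can be illustrated with a minimal Python sketch. It only computes the pixel windows that tile a full-size image into fixed-size chips; reading and writing the imagery is left out. The function name `chip_windows`, the window layout, and the 512-pixel chip size are assumptions made for illustration and do not describe the program's actual tooling or standards.

```python
from typing import Iterator, Tuple

# Hypothetical window layout: (column offset, row offset, width, height) in pixels.
Window = Tuple[int, int, int, int]


def chip_windows(width: int, height: int, chip: int) -> Iterator[Window]:
    """Yield pixel windows that tile a width x height image into chips.

    Chips along the right and bottom edges are clipped to the image
    bounds, so every pixel is covered exactly once.
    """
    if width <= 0 or height <= 0 or chip <= 0:
        raise ValueError("width, height, and chip must be positive")
    for row_off in range(0, height, chip):
        for col_off in range(0, width, chip):
            yield (
                col_off,
                row_off,
                min(chip, width - col_off),   # clip at the right edge
                min(chip, height - row_off),  # clip at the bottom edge
            )


# Example: a 1000 x 600 image cut into chips of at most 512 x 512 pixels.
windows = list(chip_windows(1000, 600, 512))
for w in windows:
    print(w)
```

In practice, each window would be passed to an imagery library to read the pixels and, for georeferenced data, to derive the geospatial bounds of the chip. Whether edge chips are clipped, padded, or dropped is a design choice that depends on the labeling and model testing requirements.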
