Vacancy title:
DATA SCIENTISTS
Jobs at:
African Population and Health Research Center (APHRC)
Deadline of this Job:
08 June 2022
Summary
Date Posted: Friday, May 27, 2022, Base Salary: Not Disclosed
JOB DETAILS:
BACKGROUND
The PEACH data project is hosted within the Implementation Network for Sharing Population Information from Research Entities (INSPIRE) (http://aphrc.org/inspire/). INSPIRE is building a generic model for health data from longitudinal population studies (LPS) using the OMOP (Observational Medical Outcomes Partnership) database. The INSPIRE PEACH project proposes to develop the key elements of a coordinated Pan-African COVID-19 data ecosystem. We will build a robust suite of data standards and technologies and diverse data integration methodologies, using the power of Artificial Intelligence and Data Science for analysis and oversight through a trusted governance and policy environment.
Development and Training
On-the-job training will be provided by personnel within INSPIRE, both within the employing institution and from other institutions that are affiliated with INSPIRE. The role holder will attend meetings and workshops held by the designated studies they are working on. There may be opportunities for further studies in Data Science commissioned by the Network as resources allow, with potential for a PhD subject to satisfactory performance and funding.
Relationships
The post holder will report to the Project Lead or Project Members within INSPIRE in their institution, that is, the APHRC in Kenya. S/he will work very closely, and on a day-to-day basis, with their counterpart(s) based at Malawi Epidemiology and Intervention Research Unit (MEIRU) in Malawi (https://meiru.lshtm.ac.uk/). S/he will have routine interactions with other INSPIRE partners affiliated with the African NCD Longitudinal Data Alliance (ANDLA), the Analyzing Longitudinal Population-based HIV/AIDS data on Africa (ALPHA) network based at the London School of Hygiene & Tropical Medicine (LSHTM) in the United Kingdom, South African Population Research Infrastructure Network (SAPRIN) in South Africa, and Committee on Data of the International Science Council (CODATA) in France.
Duties/Responsibilities
The Data Scientist will be primarily responsible for defining data specifications needed for COVID-19 data and metadata, including alternative data sources. S/he will be guided by population health knowledge gaps identified in the INSPIRE PEACH knowledge translation hub. Informed by cohort definitions, s/he will work with the Data Trackers to ensure the data they find is prepared to these specifications and will develop AI search programs for finding data that can populate these data specifications. S/he will extract, transform and load (ETL) the data and associated metadata, to the data specifications defined by the INSPIRE Network, into the INSPIRE common data model (CDM). The Data Scientist would work with INSPIRE staff (including the Data Trackers) to build “on-ramps” which transfer COVID-19 related data from agreed data specifications to the INSPIRE Common Data Model. The Data Scientist would be expected to learn about the CDM using OMOP data under the direction of INSPIRE partners within the first six months of their employment.
Additionally, the Data Scientist, again working with the INSPIRE PEACH knowledge translation hub team, will develop cohort off-ramps for AI-infused population health research on top of the OMOP CDM. AI initiatives might include the construction of cohorts based on synthetic data that has been trained with real data as well as the conduct of simulated trials with population health “treatments” using synthetic and/or real data. These “treatments” would be in line with policy initiatives currently under review by Ministries of Health (MOHs). In this way, the Data Scientist would be responsible for developing, conducting and eventually analyzing the results of more or less “natural experiments”.
The Data Scientist will:
Data collection
• To prepare the final list of the data specifications used by the INSPIRE network for COVID-19 data.
• To perform data extraction work from source databases.
• To perform data profiling and quality assessments for the gathered data.
Data processing and storage in database systems
• To transform collected COVID-19 data into the INSPIRE PEACH data exchange protocols.
• To ensure the quality of the data transformations and of the resulting data provided by Data Trackers for the data specifications.
• Develop on-ramps for putting data into OMOP CDM in consultation with INSPIRE partners.
• Implement the on-ramps using the data from Data Trackers in Malawi and Kenya.
• Develop cohort off-ramps from the OMOP CDM suitable for the conduct of natural experiments with “treatments” that take the form of previous and future public health interventions.
• Write up the results of these experiments and place them in the INSPIRE PEACH knowledge base for vetting by the knowledge translation hub team and future publications.
Data Cataloguing and sharing
• Develop minimum metadata requirements to accompany the source data.
• Manage and document the Common Data Model to ensure provenance of the data in the CDM.
• Support the preparation of off-ramp data products, including the metadata required for sharing data.
Overarching
• Ensure data standards are aligned with program and project priorities.
• Take part in training and workshops organized by INSPIRE, both physically and virtually.
• Under the direction of the INSPIRE team, engage with the training and mentoring of data staff of INSPIRE network members to ensure continuity of data and data provenance.
• Prepare monthly progress reports on their work.
• Inform and take directions from their line managers in INSPIRE to ensure continuity of data operations.
• Liaise with the team managing the CDM, including INSPIRE Network Partners, to ensure their work fits within the scope of the INSPIRE CDM.
• Attend meetings and workshops organized by INSPIRE, as required; the workshops may be around data management, upload, analysis, writing up and planning.
• Provide administrative support across work-streams; handle meeting invitations, bookings, training venues and training materials, and support the organization of periodic meetings for INSPIRE.
• Internalize the project work plan and anticipate administrative needs to support implementation and project work-streams. This will include working with partners to gather project requirements, maintaining a system for monitoring project activities, milestones and deliverables on a monthly basis, as well as maintaining the INSPIRE learning platform and providing support to partners using the platform as needed.
• Prepare quarterly, intermediate and annual program status reports required for management and donors. These reports will reflect achievements made, challenges and solutions.
• Establish and maintain technical contacts with other stakeholders and partners, lead on communication with INSPIRE members and respond to queries as needed, provide information to concerned parties on progress, problems and required changes, and document actions on the project’s implementation for the consideration of the team.
• Provide administrative support for proposal development for continued funding of the INSPIRE activities.
• Assist in completion of administrative forms and requests.
Qualifications, Skills, and Experience
The ideal candidate would have worked with health data (preferably longitudinal health data), have experience with health and demographic surveillance systems (HDSS) and be familiar with the data procedures from the INDEPTH network (http://www.indepth-network.org/). S/he would have excellent skills in data management and programming (relational databases). The expectation is for the team of Data Scientists (based in both Kenya and Malawi) to write programmes for data extraction, transformation and loading (ETL) in a variety of languages. The ideal post holder will have the following:
• Master's degree in Statistics, Data Science, M&E, Econometrics, Software Engineering, Demographic Research, Information Systems or equivalent in a relevant area.
• At least 3-5 years’ post first-degree experience with data management of longitudinal, medical research studies and in handling large datasets.
• Knowledge of a programming language such as Python, Perl, R, Java, or equivalent, and of ETL transfers.
• Experience with database servers, e.g., PostgreSQL, MySQL, SQL Server, Oracle, or equivalent.
• Experience querying databases using SQL.
• Experience conducting and/or managing health/research projects.
• Experience in conduct and analysis of quantitative research.
• Excellent communication (written and spoken) and interpersonal skills.
• Strong organizational and program management skills.
• Ability to take initiative and work both independently and in teams.
• Fluent in English.
This position is classified under Nationally Recruited Positions (NRP), Grade V in our scales. The appointment will be for a one-year period, renewable subject to satisfactory performance and funding.
Job Experience: No Requirements
Work Hours: 8
Level of Education: Bachelor Degree
Job application procedure
Interested candidates are encouraged to apply. Please click HERE.
• Only shortlisted candidates will be contacted.
• Shortlisted candidates will be required to have a Police Clearance Certificate.
Cover letters should be addressed to: