We follow a minimum set of rules to ensure high-quality reproducible research. ECONS maintains a suite of technical tools, training, and support staff so that researchers can maintain compliance with these research protocols. Please email [email protected] if you have any questions on our research protocol.
Steps to ensure data quality
• Create a survey plan before launch: The survey plan is an operational plan that covers timelines, staffing needs, logistics, and procurement for your survey, for all stages including questionnaire development, training, piloting, tracking, interviews, and quality assurance. Delegate responsibilities with a well-defined Gantt chart.
• Create a data quality assurance plan and materials before launch: The data quality assurance plan lays out in detail the requirements for backchecks, high-frequency checks, accompaniments, spot checks, and major analytical outputs to assure quality. The scope of the plan should include not only technical products (e.g., customizing the high-frequency check template, GIS analysis), but also data flow, roles and responsibilities, reporting schedules, actionable items based on output, and incentive programs for the field team. It also includes your staffing needs, which may change throughout the survey (e.g., accompaniments become less frequent later in the survey, while audio auditing of randomly selected recordings becomes more frequent toward the end of the survey).
• Pilot survey: Every survey must be piloted before finalizing the questionnaire. If the questionnaire is exploratory, we advise doing the first pilot using a paper survey. The pilot should take place in communities outside your study sample. If resources permit, a second pilot is advisable after the bench test and before starting the training. Your second pilot should look as close to actual surveying as possible; you may even decide not to tell your field team it is a pilot. Ideally, every question that is included in the final survey should be piloted before launch. For surveys using DDC, a pilot should include field testing of both the survey program and the devices. Remember to leave time to correct the errors you identify during piloting.
• Bench test survey (at least one week before starting the training): Bench testing means testing your survey in the office with a minimum of 2-3 different testers. You will save time and money by making sure your survey works well before launching field data collection. Bench testing is an iterative process wherein testers run the survey in different scenarios and provide feedback, while the programmer(s) make changes; note that even small changes to a survey must go through the bench testing process again, as it is easy to make mistakes that affect other parts of the survey. This process works best if the "paper" survey is considered mostly complete and has already been reviewed by central decision-makers on the project.
• Accompany surveyors in the first week of the survey: Field supervisors must accompany a subset of field officers' interviews to monitor field officer performance and to check for survey issues. All field officers must be personally accompanied at least once during the first week of the survey. Accompaniments can be scaled down as the survey progresses, especially by leveraging digital supplements like audio recordings and metadata.
• Implement and act on high-frequency checks: High-frequency checks (HFCs) provide insight into ongoing field team and data quality concerns before they become entrenched or it is too late to manage them. By running HFCs, you can regularly analyze (comparative) field officer performance, compliance with ethics requirements, response frequencies and outliers, duplicates, and other project-specific data quality issues. HFCs are meant to provide the evidence needed to successfully guide and manage a field team daily, and thus must be accompanied by strict guidance on roles and responsibilities, reporting schedules, and triggered actions (e.g., which outliers would trigger re-interviewing a household). GIS analysis to track enumerator movement is an important part of HFCs. Subject to IRB approval and the respondent's consent, randomly recording parts of the survey and auditing those recordings are also important parts of HFCs.
• Implement and act on backchecks: A backcheck (also known as a field audit or re-interview) refers to when a highly qualified field officer (also known as a backchecker) visits a respondent a second time to re-administer a selection of questions from the original questionnaire. Those backcheck responses are then compared to the original responses. The bcstats program in Stata helps you to identify discrepancies between answers, and thus to identify problems with your questionnaire, your field team, or both. Your quality assurance plan should include a backcheck randomization plan, as well as an action plan for what to do when you encounter discrepancies.
• Double enter & reconcile paper surveys: Although paper surveying is now uncommon, there are strict protocols for data entry from paper surveys. Each survey must be entered by two separate data entry operators who cannot compare responses. When there are discrepancies between their entries, they must be reconciled by a third data entry operator who checks them closely against the original survey. In-house data entry can be replaced by online firms, which also provide double entry and allow you to review discrepancies against the original survey responses.
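Backchecks and double entry both come down to the same comparison: two sets of answers for the same respondent, checked field by field. The sketch below is only an illustration of that comparison in Python, not a replacement for bcstats or for your data entry firm's tooling. The function names, the field names (hh_size, water_source), and the normalization rule (ignore case and extra whitespace) are all assumptions made for the example; your own plan should define which differences count as discrepancies.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case so trivial typing differences do not count."""
    return " ".join(str(value).split()).casefold()


def find_discrepancies(
    first: Dict[str, Record],
    second: Dict[str, Record],
    fields: List[str],
) -> List[Tuple[str, str, str, str]]:
    """Return (uid, field, first_value, second_value) for every mismatch.

    Only respondents present in both sets are compared.
    """
    discrepancies = []
    for uid in sorted(first.keys() & second.keys()):
        for field in fields:
            a = first[uid].get(field, "")
            b = second[uid].get(field, "")
            if normalize(a) != normalize(b):
                discrepancies.append((uid, field, a, b))
    return discrepancies


def unmatched_ids(first: Dict[str, Record], second: Dict[str, Record]) -> List[str]:
    """Respondent IDs that appear in only one of the two sets."""
    return sorted(first.keys() ^ second.keys())


if __name__ == "__main__":
    # Hypothetical records: entry_a and entry_b could be two data entry
    # operators, or the original interview and the backcheck.
    entry_a = {
        "H001": {"hh_size": "5", "water_source": "Borehole"},
        "H002": {"hh_size": "3", "water_source": "River"},
    }
    entry_b = {
        "H001": {"hh_size": "6", "water_source": " borehole "},
        "H002": {"hh_size": "3", "water_source": "River"},
        "H003": {"hh_size": "4", "water_source": "Tap"},
    }
    for row in find_discrepancies(entry_a, entry_b, ["hh_size", "water_source"]):
        print(row)
    print("Unmatched:", unmatched_ids(entry_a, entry_b))
```

Each flagged row is a candidate for reconciliation against the original survey (double entry) or for the follow-up actions in your quality assurance plan (backchecks). Unmatched IDs are reported separately because they usually point to a missing entry or an ID typo rather than a disagreement about an answer.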
Data Security & Research Ethics
• Maintain IRB approval throughout the project lifecycle (e.g. submissions, renewals, amendments, human subjects certificates): Any study conducting human subjects research must have the approval of at least one Institutional Review Board (IRB); note that each project is different, so you should consult with your PIs and IRB Coordinator about how best to get IRB coverage for your project. A typical lifecycle includes approval of the initial research protocol, annual renewals, and amendments when critical items change, such as the questionnaire, staffing, research protocol, or risk level. All project staff, partners, and investigators who can see encrypted personally-identifying information (PII) must have up-to-date human subjects certificates. Any deviation from the protocol, or any unexpected risk to respondents, must be reported as unexpected events to the IRB.
• Create data security plan and set up encryption before launch: Respondents' confidential data should be encrypted at all stages, starting at the moment of data collection. This includes while it is on the data collection device, during wireless transmission, while on an external server (e.g., SurveyCTO), when it is on a cloud storage system (e.g., Box or Dropbox), and while on laptops and removable media (hard drives, flash drives). Any time the data is stored on a server that is not controlled by ECONS, it must be separately encrypted so that the company that controls the server cannot access the data. You must plan beforehand how you will ensure encryption at each of these steps, and how it will be maintained after your project has been officially closed if you are retaining any PII. If any un-encrypted data is uploaded to the cloud or e-mailed, you must file an unexpected event report to your IRB(s) and comply with any ruling they make.
• Maintain data security plan (especially encryption) throughout project lifecycle: At every stage of the project lifecycle, data should be properly protected. Among other things, this means personally-identifying information (PII) should remain encrypted during storage and transmission, and passwords should be restricted to the critical members of your IRB-approved research staff.
• Use new UID in the deidentified dataset: When you share or publish un-encrypted data, it must be de-identified, i.e. there must be no identifying information in the dataset, such as name or address, or a combination of variables that can be used to identify a respondent. You should also replace your original unique identifier (UID) with a new unique identifier. You should do this at the end stage of your project when you have finished matching across waves or different data collection activities.
• Retire your project with all IRBs once the project is complete: Once your study is complete, you should retire or otherwise officially close out your protocol with all reviewing IRBs.
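The UID replacement described in the de-identification step above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not an ECONS tool: the function names, the column names (hhid, name, address, phone), and the 8-character random ID format are all hypothetical. It removes direct identifiers only; it does not check whether a combination of the remaining variables could identify a respondent, which still needs a manual review.

```python
import secrets
from typing import Dict, Iterable, List, Set


def assign_new_uids(original_uids: Iterable[str]) -> Dict[str, str]:
    """Map each original UID to a random new UID with no link to the old one."""
    mapping: Dict[str, str] = {}
    used: Set[str] = set()
    for uid in original_uids:
        if uid in mapping:
            continue  # same respondent seen again (e.g., a later wave)
        new_uid = secrets.token_hex(4)  # 8 random hex characters
        while new_uid in used:  # re-draw on the rare collision
            new_uid = secrets.token_hex(4)
        used.add(new_uid)
        mapping[uid] = new_uid
    return mapping


def deidentify(
    rows: List[Dict[str, str]],
    uid_field: str,
    pii_fields: Set[str],
    mapping: Dict[str, str],
) -> List[Dict[str, str]]:
    """Drop the PII columns and the original UID, and attach the new UID."""
    cleaned = []
    for row in rows:
        kept = {
            key: value
            for key, value in row.items()
            if key not in pii_fields and key != uid_field
        }
        kept["new_uid"] = mapping[row[uid_field]]
        cleaned.append(kept)
    return cleaned


if __name__ == "__main__":
    # Hypothetical two-wave dataset; column names are assumptions.
    rows = [
        {"hhid": "H001", "name": "Respondent A", "phone": "000", "wave": "1", "hh_size": "5"},
        {"hhid": "H001", "name": "Respondent A", "phone": "000", "wave": "2", "hh_size": "6"},
        {"hhid": "H002", "name": "Respondent B", "phone": "111", "wave": "1", "hh_size": "3"},
    ]
    mapping = assign_new_uids(row["hhid"] for row in rows)
    for row in deidentify(rows, "hhid", {"name", "address", "phone"}, mapping):
        print(row)
```

The mapping from original to new UIDs is itself identifying information, so it should be stored encrypted alongside the PII and never shared with the de-identified dataset. Because the new IDs are random, the mapping has to be generated once, after all matching across waves is finished, and reused for every file you release.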
Knowledge Management & Transparency
• Back up data in at least two locations: You must have at least two copies of the data available at all times. During data collection, this will likely mean keeping the data on a SurveyCTO server, as well as on a laptop that is synced to Box; do not delete server data until it has fully synced to Box, as laptop theft is somewhat common. After data collection, this could mean backing up your data on an external hard drive, to guard against the extremely rare chance that a major cloud service like Box or Dropbox fails.
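One way to confirm that a second copy is complete before deleting anything from a server is to compare file checksums between the two locations. The Python sketch below is an illustration only, with hypothetical function names and folder paths; it assumes both copies are reachable as local folders (for example, a synced Box folder and an external drive) and that the files being compared are the same encrypted files in both places.

```python
import hashlib
from pathlib import Path
from typing import List, Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_backups(primary_dir: str, backup_dir: str) -> Tuple[List[str], List[str]]:
    """Return (missing, mismatched) file paths, relative to primary_dir.

    missing: files in the primary copy that are absent from the backup.
    mismatched: files present in both whose contents differ.
    """
    primary = Path(primary_dir)
    backup = Path(backup_dir)
    missing: List[str] = []
    mismatched: List[str] = []
    for source in sorted(p for p in primary.rglob("*") if p.is_file()):
        relative = source.relative_to(primary)
        target = backup / relative
        if not target.is_file():
            missing.append(relative.as_posix())
        elif sha256_of(source) != sha256_of(target):
            mismatched.append(relative.as_posix())
    return missing, mismatched


if __name__ == "__main__":
    # Hypothetical paths; replace with your own two backup locations.
    missing, mismatched = compare_backups("data/raw_encrypted", "/mnt/backup/raw_encrypted")
    print("Missing from backup:", missing)
    print("Contents differ:", mismatched)
```

Both lists being empty means every file in the primary copy has an identical counterpart in the backup. The check runs in one direction only, so files that exist only in the backup are not reported.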