Understanding the data curation choices behind the indicator: SDGs and LSMS-ISA measures of progress

Friday, October 11th 12:00pm – 2:00pm

    • Leigh Anderson is the Marc Lindenberg Professor for Humanitarian Action, International Development, and Global Citizenship in the Evans School of Public Policy and Governance at the University of Washington. She founded and directs the Evans School Policy Analysis and Research Group (EPAR) and has extensive experience in data curation, with a focus on how cleaning and variable construction decisions can affect analytical results. The EPAR website contains information on agricultural development data curation, which motivates our proposed workshop. Leigh also teaches a master’s level course on development economics, with a partial focus on the Sustainable Development Goals (SDGs).
    • Ayala Wineman is a Research Associate in the Evans School of Public Policy and Governance at the University of Washington. She earned a Ph.D. in Agricultural, Food, and Resource Economics from Michigan State University, and her research relates broadly to rural development in Sub-Saharan Africa. Within EPAR, she has led several research projects on the topic of measurement, exploring the implications of variable construction decisions related to crop yield, the delineation of urban populations, and the reporting of seed type in household surveys.
  • Our goal is for workshop participants to become more critical consumers and producers of common development indicators. Broadly, we hope to build and strengthen a “community of practice” around data curation in the realm of sustainability and development.
  • Introduction – The workshop will begin with an overview of data curation, which includes, inter alia, cleaning, variable construction, merging, aggregating, and managing.

  • SDG indicators – The first half of the workshop (1.25 hours) will focus on the SDGs and how SDG indicators are tracked in various dashboards. We will use a combination of lecture, discussion, and group activities to examine several specific SDG indicators. For example, to understand the calculation of national poverty levels (SDG indicator 1.1.1), small groups will work through a decision tree whose decision nodes produce a multitude of branches and a wide set of final values for two countries (derived using real data). In the process, participants will follow the sequence of decisions behind the reported indicator and will observe that the ranking of countries sometimes changes with different indicator construction choices. To understand the calculation of unemployment rates (SDG indicator 8.5.2), participants will work in small groups to make a realistic policy decision about funding for unemployment initiatives. After receiving additional information on how unemployment rates are constructed, participants will see how indicator construction decisions carry implications for policy. Other SDG indicators related to conference themes can be explored similarly.

  • Data curation decisions using LSMS-ISA – Following an intermission, the second half of the workshop (1.25 hours) will delve into a set of narrower topics related to data cleaning and variable construction using LSMS-ISA data, with a focus on the production of agricultural indicators. We will consider variable definitions, such as the delineation of the urban population, and will discuss how inconsistent variable definitions affect meta-analyses intended to synthesize findings across multiple studies. We will also discuss decisions within variable construction, including how to deal with nonstandard units of measure and how to approach self-reported information that may include systematic biases (e.g., farmers’ reports of seed type). Participants will explore the implications of making different decisions in variable construction through a short activity showing that time trends in land productivity in Tanzania differ markedly depending on whether the denominator is generated from fields that are partially cropped or from land area directly under crops. We will also discuss approaches to identifying and addressing outliers.
  • Participants should emerge with an understanding of data curation and a shared language around the topic and tools. 
  • Among the workshop outputs, we hope to collectively generate a list of questions that should be posed before a reported indicator is applied in one’s work. 
  • Finally, participants should build a keener awareness that data curation decisions have implications for analysis. Associated Stata .do files will be made available to workshop participants.
  • Participants should gain an understanding of the framework of data curation. As this framework is not set in stone, we anticipate that it will be refined in the course of workshop discussion; this refinement is the basis for building a “community of practice” around this topic.
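
To make the land productivity activity described in the outline more concrete, the following is a minimal sketch of how the choice of denominator can change a time trend. It is written in Python for illustration only (the workshop's own materials are Stata .do files), and every name and number in it is hypothetical: the plot records are invented and are not LSMS-ISA values, and the functions `land_productivity` and `trend` are assumptions made for this sketch rather than the workshop's actual method.

```python
# Hypothetical plot records; these are invented numbers, NOT LSMS-ISA data.
# Each tuple: (survey_year, harvest_kg, plot_area_ha, share_of_plot_cropped)
PLOTS = [
    (2009, 900.0, 1.0, 0.9),
    (2009, 600.0, 0.8, 1.0),
    (2013, 850.0, 1.0, 0.7),
    (2013, 560.0, 0.8, 0.8),
]


def land_productivity(plots, year, denominator):
    """Return kg per hectare for one survey year.

    denominator="plot_area"    divides by the full area of every field,
                               including fields that are only partially cropped.
    denominator="cropped_area" divides by the area directly under crops.
    """
    rows = [p for p in plots if p[0] == year]
    harvest = sum(p[1] for p in rows)
    if denominator == "plot_area":
        area = sum(p[2] for p in rows)
    elif denominator == "cropped_area":
        area = sum(p[2] * p[3] for p in rows)
    else:
        raise ValueError(f"unknown denominator: {denominator}")
    return harvest / area


def trend(plots, start_year, end_year, denominator):
    """Return the proportional change in land productivity between two years."""
    start = land_productivity(plots, start_year, denominator)
    end = land_productivity(plots, end_year, denominator)
    return end / start - 1.0


if __name__ == "__main__":
    for choice in ("plot_area", "cropped_area"):
        change = trend(PLOTS, 2009, 2013, choice)
        print(f"{choice}: {change:+.1%} change in kg/ha, 2009 to 2013")
```

With these invented records, total harvest falls while the share of each field under crops also falls, so productivity declines when measured against full field area but rises when measured against cropped area. The same raw data therefore support opposite trend statements, which is the kind of result the activity is meant to surface.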