Module 8 Measurement

To estimate effects and test hypotheses, we often use an outcome of interest measured with quantitative data from surveys, behavioral games, or administrative records. For causal questions, we typically use data on immediate and final outcomes and core mechanisms. We use baseline data to identify relevant subgroups, adjust our estimates, or help block-randomize our treatment. Measurements should be valid and reliable. Be aware that data can be noisy (random error) and/or biased (systematic error).
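The difference between random and systematic error can be illustrated with a small simulation. This is a minimal sketch under assumed values, not part of the module's materials: the function name `simulate_measurements`, the true-score distribution, the noise level, and the size of the bias are all hypothetical.

```python
import random
import statistics


def simulate_measurements(n=10_000, noise_sd=10.0, bias=5.0, seed=1):
    """Return (truth, noisy, biased) score lists for n hypothetical units."""
    rng = random.Random(seed)
    # Hypothetical true scores: mean 50, standard deviation 5 (assumed values).
    truth = [rng.gauss(50.0, 5.0) for _ in range(n)]
    # Random error: mean-zero noise drawn independently for each unit.
    noisy = [t + rng.gauss(0.0, noise_sd) for t in truth]
    # Systematic error: every unit is shifted by the same amount.
    biased = [t + bias for t in truth]
    return truth, noisy, biased


truth, noisy, biased = simulate_measurements()

# Gap between the average measured score and the average true score.
noisy_gap = statistics.fmean(noisy) - statistics.fmean(truth)
biased_gap = statistics.fmean(biased) - statistics.fmean(truth)

print(f"Noisy measure:  mean gap = {noisy_gap:.2f}, sd = {statistics.stdev(noisy):.2f}")
print(f"Biased measure: mean gap = {biased_gap:.2f}, sd = {statistics.stdev(biased):.2f}")
```

In this sketch the noisy measure has an average close to the truth but a much wider spread, while the biased measure has the same spread as the truth but an average that is off by the full amount of the bias, no matter how many units we measure.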

This module discusses what to measure and how to measure. It shows how good measurement is closely linked to your research design and statistical power.

8.1 Core Content

When we represent some attribute of a unit by some number, letter, word, or symbol in some systematic way (perhaps in a cell in a simple dataset), we are measuring.
A valid measure of a concept or phenomenon of interest should clearly represent that underlying and often abstract entity.
A reliable measure of a concept would give the same score for a given unit of measurement (for example, a person or a village) each time it is applied, as long as conditions have not changed.
We can assess our theories of measurement using multiple approaches to measuring outcomes, covariates, or differences between units implied by different accounts of causal mechanisms.
Invalid measurement can make it hard for your research design to effectively distinguish between alternative explanations for the relationship between treatment and outcome.
Unreliable measurement can diminish statistical power.
Difficult measurement may call for a pilot study focused on measurement itself.
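The link between unreliable measurement and statistical power can be sketched with a simulation. The example below is an illustration under stated assumptions rather than a recommended analysis: the function name `simulate_power`, the sample size, the effect size, and the noise levels are hypothetical, and the test is a simple large-sample z-test on the difference in means.

```python
import random
import statistics
from statistics import NormalDist


def simulate_power(noise_sd, n_per_arm=100, effect=0.3, sims=500, alpha=0.05, seed=42):
    """Share of simulated two-arm experiments that reject the null of no effect.

    The true outcome has standard deviation 1 in each arm; measurement adds
    independent mean-zero noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(sims):
        # Measured outcome = true outcome + random measurement error.
        control = [rng.gauss(0.0, 1.0) + rng.gauss(0.0, noise_sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, 1.0) + rng.gauss(0.0, noise_sd) for _ in range(n_per_arm)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        se = (statistics.variance(treated) / n_per_arm
              + statistics.variance(control) / n_per_arm) ** 0.5
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / sims


power_reliable = simulate_power(noise_sd=0.0)  # outcome measured without error
power_noisy = simulate_power(noise_sd=2.0)     # outcome measured with heavy noise

print(f"Estimated power, reliable measure: {power_reliable:.2f}")
print(f"Estimated power, noisy measure:    {power_noisy:.2f}")
```

The true effect is identical in both runs. Only the amount of random measurement error differs, and the noisier measure detects the effect less often. Rerunning `simulate_power` with a larger `n_per_arm` gives a rough sense of how much extra sample a noisy measure would require.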

8.2 Slides

Below are slides with the core content that we cover in our lecture on measurement. You can use these slides directly or make your own local copy and edit it.

8.3 Resources

8.3.1 EGAP Methods Guides

EGAP Methods Guide 10 Things to Know about Measurement in Experiments
EGAP Methods Guide 10 Things to Know about Survey Design
EGAP Methods Guide 10 Things to Know about Survey Implementation

8.3.2 Books, Chapters, and Articles

Robert Adcock and David Collier, “Measurement Validity: A Shared Standard for Qualitative and Quantitative Research,” American Political Science Review 95, no. 3 (2001): 529–546.
Alexandra Scacco and Shana S. Warren, “Can Social Contact Reduce Prejudice and Discrimination? Evidence from a Field Experiment in Nigeria,” American Political Science Review 112, no. 3 (2018): 654–677.
William R. Shadish, Thomas D. Cook, and Donald T. Campbell, Experimental and Quasi-Experimental Designs for Generalized Causal Inference (Boston: Houghton Mifflin, 2002).
Pedro C. Vicente, “Is Vote Buying Effective? Evidence from a Field Experiment in West Africa,” Economic Journal 124, no. 574 (2014): F356–87.

8.3.3 EGAP Policy Briefs

Using survey data at multiple levels

EGAP Policy Brief 58: Does Bottom-Up Accountability Work?

Using text messages

Using administrative data

References

Adcock, Robert, and David Collier. “Measurement Validity: A Shared Standard for Qualitative and Quantitative Research.” American Political Science Review 95, no. 3 (2001): 529–546.

Scacco, Alexandra, and Shana S. Warren. “Can Social Contact Reduce Prejudice and Discrimination? Evidence from a Field Experiment in Nigeria.” American Political Science Review 112, no. 3 (2018): 654–677.

Shadish, William R., Thomas D. Cook, and Donald T. Campbell. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin, 2002.

Vicente, Pedro C. “Is Vote Buying Effective? Evidence from a Field Experiment in West Africa.” Economic Journal 124, no. 574 (2014): F356–87.