Threats to the internal validity of randomized experiments

Fill In Your Name

29 August, 2021

Key points for this lecture

  • Attrition (missing data on outcomes)

  • Non-compliance

  • Spillovers

  • Hawthorne effects

  • Differential treatment of treatment arms

Core assumptions

Review of core assumptions

  1. Excludability

  2. Non-interference

  3. Random assignment of subjects treatment

Attrition

Attrition (missing data on outcomes)

  • Some units may have missing data on outcomes (= units attrit) when:

    • some respondents can’t be found or refuse to participate in endline data collection.

    • some records are lost.

  • This is a problem when treatment affects missingness.

    • For example, units in control may be less willing to answer survey questions.
    • For example, treatment may have caused units to migrate and cannot be reached
  • If we analyze the data by dropping units with missing outcomes, then we are no longer comparing similar treatment and control groups.

What can we do?

  • Check whether attrition rates are similar in treatment and control groups.

  • Check whether treatment and control groups have similar covariate profiles.

  • Do not drop observations that are missing outcome data from your analysis.

  • When outcome data are missing we can sometimes bound our estimates of treatment effects.

What can we do?

  • But the best approach is to try to anticipate and prevent attrition.

    • Blind people to their treatment status.

    • Promise to deliver the treatment to the control group after the research is completed.

    • Plan ex ante to reach all subjects at endline.

    • Budget for intensive follow-up with a random sample of attriters.

Missing data on covariates is not as problematic

  • Missing background covariates (i.e.,variables for which values do not change as a result of treatment) for some observations is less problematic.

    • We can still learn about the causal effect of an experiment without those covariates, as we saw in the Hypothesis Testing and the Estimation modules.

    • We can also use the background covariate as planned by imputing for the missing values.

    • We can also condition on that missingness directly.

Non-compliance

Non-compliance

  • Sometimes units assigned to treatment don’t take it. They don’t comply with their assignment.

    • If all units assigned to control do not take the treatment, but only some units assigned to treatment take the treatment, we have one-sided non-compliance.
  • The effect of treatment assignment is not the same as the effect of receiving the treatment.

  • The effect of receiving the treatment is often called the “Local Average Treatment Effect” (LATE) or “Complier Average Causal Effect” (CACE).

    • “Local” refers to the idea that the effect only occurs on the people who take the treatment when assigned to treatment (the kinds of people).

LATE/CACE

  • We need two assumptions to hold in order to estimate LATE/CACE from a randomized experiment.
  1. The exclusion restriction is that the assignment to treatment only has an effect on the outcome through the receipt of treatment and through no other channels.

  2. The monotonicity assumption is that we do not have any defiers – units that would take the treatment if assigned to control, but not take the treatment if assigned to treatment.

Spillovers

Spillover and interference between units

  • Sometimes we suspect that treatment assigned to one unit affects other units’ outcomes.

  • If the treatment status of a unit affects another unit’s outcome, we have a violation of one of the core assumptions for causal inference.

  • You might sample units that are far from each other to prevent the “transmission” of treatment across units.

Studying spillover effects

  • This is not a problem if you design a study to investigate spillovers in which a unit’s outcomes may depend on other units’ treatment status.

  • To investigate spillovers:

    • You might collect outcome data for units that were never eligible for random assignment to treatment but for which nearby units from which spillovers might occur were eligible for treatment.

    • You might use a two-stage randomization design.

    • You might assign collections of units (e.g., districts) to different levels of intensity of treatment (e.g., different proportions of villages in districts assigned to treatment).

Hawthorne effect

Hawthorne effect

  • The Hawthorne effect: knowing that one is being observed/studied can lead subjects to behave differently.

  • This could create systematic measurement error in both treatment and control groups.

  • It could also result from greater attention given to the treatment group, effectively undoing the creation of equivalent treatment and control groups through randomization.

Good practices

  • Collect data as unobtrusively as possible.

  • As much as possible, no one other than the subject him/herself should know his/her treatment status.

  • Blind enumerators/researchers to subjects’ treatment status.

  • Don’t take extra measurements of the treatment group.

Non-excludability

Differences between treatment and control groups other than the treatment

  • Handling the treatment and control groups differently means that observed differences in the outcomes between these groups may be due to the treatment and/or the different handling.

  • Examples include using different instruments for data collection or additional rounds of data collection for the treatment group in an effort to obtain data on mechanisms.

  • Remember to design your study and your data instruments to treat all treatment arms the same, other than the treatment itself.