3  Randomization 1

4.1 Some common types of randomization

  1. Simple

  2. Complete

  3. Block

  4. Cluster

  5. Block-Cluster

  6. Multi-arm

4.2 1. Simple randomization (coin-flipping)

  • For each unit, flip a coin to decide whether it will be treated. Outcomes are then measured at the same level at which the coin was flipped (the unit).
  • The coins don’t have to be fair (50-50), but you have to know the probability of treatment assignment.

4.4 1. Simple randomization (coin-flipping)

  • Advantage: Simple randomization can handle not knowing the total size of your sample in advance.
  • Disadvantage: You can’t guarantee a specific number of treated units and control units.
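library(randomizr)  # provides simple_ra() and the other *_ra() functions used below
simple_ra(3)        # 3 units, each treated with probability 0.5
[1] 0 1 1
simple_ra(3)        # the number of treated units varies from draw to draw
[1] 0 0 0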
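
The same idea can be sketched in base R without any package. This is only an illustration: the sample size, the treatment probability of 0.3, and the variable names are assumptions, not values from the slides. Each unit gets its own independent draw from an unfair coin with a known probability, so the number of treated units is itself random.

```r
# Simple randomization in base R: one independent coin flip per unit.
set.seed(42)   # arbitrary seed, only for reproducibility
N <- 10        # hypothetical number of units
p <- 0.3       # known (not necessarily fair) probability of treatment

Z1 <- rbinom(N, size = 1, prob = p)
Z2 <- rbinom(N, size = 1, prob = p)

# The realized number treated is random; only its expectation (N * p) is fixed.
c(draw1 = sum(Z1), draw2 = sum(Z2), expected = N * p)
```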

4.5 2. Complete randomization (drawing from an urn)

  • A fixed number \(m\) out of \(N\) units are assigned to treatment.
  • The probability a unit is assigned to treatment is \(m/N\).
  • This is like having an urn or bowl with \(N\) balls, of which \(m\) are marked as treatment and \(N-m\) are marked as control. Public lotteries use this method.

4.7 2. Complete randomization (drawing from an urn)

  • Advantage: The number of treated units is fixed in advance. This is particularly useful with a small number of units, where simple randomization could put every unit in the same condition.
  • Disadvantage: You need to know the total number of units in advance.
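complete_ra(4)  # 4 units; by default exactly 2 are treated
[1] 0 0 1 1
complete_ra(4)  # which units are treated can change, the number treated cannot
[1] 0 0 1 1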
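
The urn analogy translates directly into base R. The sketch below is illustrative only: `complete_draw()` is a hypothetical helper, and `N = 6` and `m = 2` are assumed values. It draws `m` units without replacement and then repeats the draw many times to check that each unit is treated with probability close to \(m/N\).

```r
# Complete randomization in base R: exactly m of N units are treated.
set.seed(42)   # arbitrary seed, only for reproducibility
N <- 6         # hypothetical number of units
m <- 2         # hypothetical number of treated units

complete_draw <- function(N, m) {
  Z <- rep(0L, N)
  Z[sample.int(N, m)] <- 1L   # draw m "treatment balls" without replacement
  Z
}

Z <- complete_draw(N, m)
sum(Z)   # always equal to m

# Repeat the draw to approximate each unit's assignment probability.
draws <- replicate(5000, complete_draw(N, m))   # N x 5000 matrix
round(rowMeans(draws), 2)                       # each entry is close to m / N
```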

4.8 3. Block (or stratified) randomization

  • We create groups of units (blocks) and randomize separately within each block.
  • We are doing mini-experiments in each block so we have both treated and control units in each block.

4.9 3. Block (or stratified) randomization

  • Example: block = region, units = municipalities. We randomize treatment at the municipality level within region and also measure outcomes at the municipality level.

4.10 3. Block (or stratified) randomization

  • Blocks can be of different sizes.

4.11 3. Block (or stratified) randomization

  • Blocks can have different probabilities of treatment assignment.

4.12 3. Block (or stratified) randomization

How should you define your blocks?

  1. By subgroups for which you want to learn the ATE. The average causal effect for a particular subgroup is known as a Conditional Average Treatment Effect (CATE). Blocking on these subgroups lets you estimate each CATE and compare the CATE of one group with that of another.

4.13 3. Block (or stratified) randomization

How should you define your blocks?

  2. By variables that predict the outcome. This will increase the precision of your estimates.
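
The precision gain can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis from the slides: the covariate `x`, the outcome model, and the sample size are all made up, and the true treatment effect is set to zero. It compares how much the difference in means varies across repeated randomizations with and without blocking on `x`.

```r
# Sampling variability of the difference in means under complete vs. block
# randomization when the blocking variable predicts the outcome.
set.seed(42)                       # arbitrary seed, only for reproducibility
N  <- 40
x  <- rep(c(0, 1), each = N / 2)   # hypothetical binary blocking covariate
y0 <- 5 * x + rnorm(N)             # outcome strongly predicted by x
# The true effect is zero, so observed outcomes equal y0 under either condition.

diff_means <- function(Z) mean(y0[Z == 1]) - mean(y0[Z == 0])

complete_Z <- function() sample(rep(c(0, 1), N / 2))   # N/2 treated overall
block_Z <- function() {
  Z <- numeric(N)
  for (b in unique(x)) {
    idx <- which(x == b)
    Z[idx] <- sample(rep(c(0, 1), length(idx) / 2))    # half treated per block
  }
  Z
}

sd_complete <- sd(replicate(2000, diff_means(complete_Z())))
sd_block    <- sd(replicate(2000, diff_means(block_Z())))
c(complete = sd_complete, block = sd_block)   # blocking gives the smaller spread
```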

4.14 3. Block (or stratified) randomization

  • Advantage: You avoid unlucky randomizations in which the treatment and control groups differ on the variables used to create the blocks. Blocking is generally very helpful.
  • Blocking is especially useful for rare subgroups, because it guarantees that each one contains both treated and control units.
  • Disadvantage: You need data on the blocking variables before you randomize.

4.15 3. Block (or stratified) randomization

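blocks <- c(1,1,1,1,2,2,2,2)  # block membership of each of the 8 units
block_ra(blocks = blocks)     # randomizes separately within each block
[1] 0 0 1 1 0 1 1 0
block_ra(blocks = blocks)     # each block always has 2 treated and 2 control units
[1] 1 0 1 0 1 1 0 0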
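
Blocks of different sizes and with different treatment probabilities can be sketched in base R as a complete randomization run separately inside each block. The block labels, block sizes, and numbers treated below are hypothetical values chosen for illustration.

```r
# Block randomization in base R: complete randomization within each block.
set.seed(42)                               # arbitrary seed, only for reproducibility
blocks  <- c(rep("A", 4), rep("B", 6))     # hypothetical blocks of unequal size
m_block <- c(A = 2, B = 2)                 # treated per block: 2/4 in A, 2/6 in B

Z <- integer(length(blocks))               # start with everyone in control (0)
for (b in names(m_block)) {
  idx <- which(blocks == b)                                 # units in this block
  treated <- idx[sample.int(length(idx), m_block[[b]])]     # draw within the block
  Z[treated] <- 1L
}

table(block = blocks, Z = Z)   # every block has both treated and control units
```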

4.16 4. Cluster randomization

  • In a cluster-randomized study, all units in a group of units (the cluster) are assigned to the same treatment status.

4.17 4. Cluster randomization

  • Treatment is randomized at the cluster level.
  • Outcomes are measured at the unit level.

4.18 4. Cluster randomization

  • When should you use cluster randomization? When the intervention has to be delivered at the cluster level.
  • Avoid it if you can: cluster randomization generally reduces statistical power. How much depends on the intra-cluster correlation (ICC or \(\rho\)); the higher \(\rho\) is, the greater the loss of power.
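
One common way to quantify this loss is the design effect for equal-sized clusters, \(1 + (n_c - 1)\rho\), where \(n_c\) is the number of units per cluster. This formula is a standard approximation that is not stated in the slides, and the sample size, cluster size, and values of \(\rho\) below are hypothetical. The sketch shows how quickly the effective sample size shrinks as \(\rho\) grows.

```r
# Design effect for equal-sized clusters: the factor by which the variance of
# an estimate is inflated relative to randomizing individual units.
design_effect <- function(cluster_size, rho) 1 + (cluster_size - 1) * rho

n_units      <- 1000               # hypothetical total number of units
cluster_size <- 50                 # hypothetical units per cluster
rho          <- c(0, 0.05, 0.2)    # hypothetical intra-cluster correlations

deff <- design_effect(cluster_size, rho)
data.frame(rho = rho,
           design_effect = deff,
           effective_n = round(n_units / deff))
```

With \(\rho = 0\) the clustering costs nothing; as \(\rho\) rises, the 1,000 clustered units carry the information of far fewer independent units.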

4.19 4. Cluster randomization

  • If you must use cluster randomization, having more clusters helps.
  • With few clusters, we have less power to detect treatment effects, and \(p\)-values, confidence intervals, and even the estimates themselves may be misleading.

4.20 4. Cluster randomization

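my_clusters <- c(1,1,1,1,2,2,2,2)   # cluster membership of each of the 8 units
cluster_ra(clusters = my_clusters)  # whole clusters are assigned together
[1] 1 1 1 1 0 0 0 0
cluster_ra(clusters = my_clusters)  # with 2 clusters, one is treated and one is control
[1] 1 1 1 1 0 0 0 0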
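
The two levels involved (assignment at the cluster level, outcomes at the unit level) can be made explicit in a base-R sketch. The cluster labels and sizes are hypothetical: clusters are sampled first, and every unit then inherits the status of its cluster.

```r
# Cluster randomization in base R: randomize clusters, then map to units.
set.seed(42)                                                      # arbitrary seed
clusters <- rep(c("a", "b", "c", "d"), times = c(3, 2, 4, 3))     # unequal sizes

ids <- unique(clusters)
treated_clusters <- sample(ids, length(ids) / 2)   # assignment at the cluster level
Z <- as.integer(clusters %in% treated_clusters)    # units inherit the cluster's status

table(cluster = clusters, Z = Z)   # no cluster is split across conditions
```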

4.21 5. You can combine blocks and clusters

  • You can have clusters within blocks.
  • Can you have blocks within clusters?

4.22 5. You can combine blocks and clusters

my_blocks <- c(1,1,1,1,2,2,2,2)    # two blocks of four units
my_clusters <- c(1,1,2,2,3,3,4,4)  # two clusters of two units inside each block
block_and_cluster_ra(blocks = my_blocks, clusters = my_clusters)
[1] 1 1 0 0 0 0 1 1
block_and_cluster_ra(blocks = my_blocks, clusters = my_clusters)
[1] 1 1 0 0 0 0 1 1
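
A base-R sketch of the same design, reusing the block and cluster vectors above, makes the nesting explicit. Treating exactly one cluster per block is an assumption made for illustration. The code first checks that every cluster belongs to a single block, then randomizes clusters separately within each block.

```r
# Block-and-cluster randomization in base R: clusters are randomized within blocks.
set.seed(42)                               # arbitrary seed, only for reproducibility
my_blocks   <- c(1, 1, 1, 1, 2, 2, 2, 2)
my_clusters <- c(1, 1, 2, 2, 3, 3, 4, 4)

# Each cluster must sit entirely inside one block.
stopifnot(all(tapply(my_blocks, my_clusters, function(b) length(unique(b))) == 1))

Z <- integer(length(my_clusters))          # start with everyone in control (0)
for (b in unique(my_blocks)) {
  ids <- unique(my_clusters[my_blocks == b])      # clusters in this block
  treated <- ids[sample.int(length(ids), 1)]      # treat one cluster per block
  Z[my_clusters == treated] <- 1L
}

table(block = my_blocks, cluster = my_clusters, Z = Z)
```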

4.23 6. Multi-arm design

  • You can randomize units to more conditions (arms) than just one treatment arm and one control arm.
  • For example, we might have a cash transfer arm, a job training arm, and a control arm with no intervention.

4.24 6. Multi-arm design

  • An example with complete randomization

4.25 6. Multi-arm design

  • Advantages: We can compare each treatment arm with control and the treatment arms with one another.
  • Disadvantages: We can quickly end up with a very large number of hypothesis tests, which raises the risk of false positives unless we adjust for multiple comparisons.
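
How quickly the number of tests grows can be seen by counting all pairwise comparisons between arms. The sketch below also shows the per-test significance threshold under a Bonferroni correction. Bonferroni is just one common adjustment, chosen here for illustration, and the range of arms and the 0.05 level are assumed values.

```r
# Number of pairwise comparisons as the number of arms grows.
arms       <- 2:6                # number of arms, including control
n_pairwise <- choose(arms, 2)    # every arm compared with every other arm
alpha      <- 0.05               # conventional overall significance level

data.frame(arms = arms,
           pairwise_tests = n_pairwise,
           bonferroni_alpha = alpha / n_pairwise)
```

Going from two arms to six raises the number of pairwise tests from 1 to 15, before counting any subgroup or multiple-outcome analyses.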

4.26 6. Multi-arm design

complete_ra(12, num_arms = 3)               # three arms with 4 units each
 [1] T2 T1 T3 T2 T3 T1 T3 T1 T2 T1 T3 T2
Levels: T1 T2 T3
complete_ra(12, prob_each = c(.1, .2, .7))  # arms with unequal assignment probabilities
 [1] T3 T2 T3 T3 T2 T3 T3 T3 T1 T3 T3 T3
Levels: T1 T2 T3

4.27 Resource

EGAP Methods Guide on Randomization: https://egap.org/resource/10-things-to-know-about-randomization/