PPS Sampling

Below are some notes on PPS that I didn't get to in class.

Probability Proportionate to Size Sampling (PPS)

  • PPS: each cluster is given a chance of selection proportionate to its size

In a standard design:

  • Let’s say there are 1,000 city blocks in a city and I plan to select 100 blocks

o   So each block would have the following chance of being selected: 100/1,000 = 1/10

  • but this design assumes a relatively even distribution of the population over these city blocks
  • what if 50% of the city’s population lived in high rises that took up only 100 blocks, while the other 50% lived on the remaining 900 blocks? We would run a huge risk of missing a good proportion of the city’s population, wouldn’t we?
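
A rough sketch of the arithmetic behind this worry, in Python. The fixed take of 5 households per selected block is an assumption made only for illustration; it is not part of the example itself.

```python
from fractions import Fraction

# Figures from the example: 1,000 blocks, 100 to be selected.
total_blocks = 1000
blocks_to_select = 100
high_rise_blocks = 100   # hold 50% of the population
other_blocks = 900       # hold the other 50%

# In the standard design every block has the same chance of selection.
block_chance = Fraction(blocks_to_select, total_blocks)

# Expected number of selected blocks of each kind.
expected_high_rise = block_chance * high_rise_blocks
expected_other = block_chance * other_blocks

# Assumption for illustration: the same number of households is
# interviewed in every selected block.
households_per_block = 5
high_rise_interviews = expected_high_rise * households_per_block
all_interviews = (expected_high_rise + expected_other) * households_per_block
sample_share_high_rise = high_rise_interviews / all_interviews

print(block_chance)            # 1/10
print(expected_high_rise)      # 10
print(expected_other)          # 90
print(sample_share_high_rise)  # 1/10
```

Under that assumption, the high-rise residents (half the city) would be expected to supply only about a tenth of the interviews.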
In a PPS design, you still choose the same number of households from each area, but you give the area with lower density a lower probability of being selected.
  • In other words, what you want to do is give Block A (the denser block) a better chance of being selected BUT you also want to ensure that if you do choose Block B, each household there has the same overall probability of being selected as the households in Block A
  • You know that Block A has 100 households and Block B only has 10
    • so Block A is 10 times as dense as Block B
    • we should give Block A a 10 times better chance of being selected, or equivalently give Block B a chance 10 times smaller
  • Let’s use the example from above where we worked out a research design that means each household has a 1/100 chance of being selected.

o   We would then scale that chance by the ratio of the two blocks' sizes (here a factor of 10), so

    • Block A = 1/100 chance of being selected

    • Block B = 1/100 × 1/10 = 1/1000 chance of being selected

o   Now we've changed the probability of the blocks being selected: Block A has a 1/100 chance of being selected whereas Block B has a 1/1000 chance of being selected, so we're more likely to choose Block A.

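o   If we’re planning to choose 5 households from whichever block is selected, then, doing the same calculations as above, we find the following:

    • Household in Block A = 1/100 × 5/100 = 1/2000 chance of being selected

    • Household in Block B = 1/1000 × 5/10 = 1/2000 chance of being selected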

o   So each household still has an equal chance of being selected, but Block B itself has a lower chance of being selected; NEVERTHELESS, taking both stages together, a household in Block B has the same chance of ending up in the sample as any other household in the city.

o   Intuitively, this should make sense to you since Block B is less dense.
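
The two-stage reasoning above can be checked with a short Python sketch. The function name `household_chance` is hypothetical, and the sketch assumes exactly 5 households are drawn at random from whichever block is selected.

```python
from fractions import Fraction

def household_chance(block_chance, households_in_block, households_drawn):
    """Overall chance that one household ends up in the sample.

    Stage 1: the block itself is selected (block_chance).
    Stage 2: the household is among those drawn within the block.
    """
    within_block = Fraction(households_drawn, households_in_block)
    return block_chance * within_block

base_chance = Fraction(1, 100)    # Block A's chance of being selected
size_ratio = Fraction(10, 100)    # Block B has 10 households, Block A has 100

block_a = base_chance
block_b = base_chance * size_ratio    # 1/1000

print(household_chance(block_a, 100, 5))   # 1/2000
print(household_chance(block_b, 10, 5))    # 1/2000
```

Block B is ten times less likely to be picked, but a household inside it is ten times more likely to be drawn once it is picked, so the two effects cancel.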


Disproportionate Sampling and Weighting

  • Disproportionate sampling comes into play for different reasons

  • sometimes we deliberately oversample a group because we’re particularly interested in it

  • sometimes we just can’t get a large enough sample of a particular population

  • if not everyone in your sample had an equal chance of selection, then you have to assign weights to the cases in your sample (typically the inverse of each case's chance of selection)

  • Keep in mind that as long as you plan to look at the samples separately or comparatively, you don’t have to worry about weighting your sample, but if you want to combine your clusters and say that the result is representative of the whole city, then you have to take the unequal chances of selection into consideration.
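
To make the weighting point concrete, here is a minimal Python sketch. The groups, selection chances, and survey answers are invented for illustration, and weighting by the inverse of the chance of selection is one common choice rather than the only one.

```python
from fractions import Fraction

# Hypothetical data: (group, chance of selection, answer to a yes/no item).
# The "oversampled" group was sampled at 10 times the rate of the rest.
respondents = [
    ("oversampled", Fraction(1, 10), 1),
    ("oversampled", Fraction(1, 10), 1),
    ("oversampled", Fraction(1, 10), 0),
    ("oversampled", Fraction(1, 10), 1),
    ("rest", Fraction(1, 100), 0),
    ("rest", Fraction(1, 100), 0),
    ("rest", Fraction(1, 100), 1),
    ("rest", Fraction(1, 100), 0),
]

# Unweighted share of "yes" answers: treats every respondent alike.
yes_count = sum(answer for _, _, answer in respondents)
unweighted = Fraction(yes_count, len(respondents))

# Weight each respondent by 1 / (chance of selection), so that people who
# were less likely to be chosen stand in for more of the population.
weights = [1 / chance for _, chance, _ in respondents]
weighted_yes = sum(w * answer for w, (_, _, answer) in zip(weights, respondents))
weighted = weighted_yes / sum(weights)

print(unweighted)   # 1/2
print(weighted)     # 13/44
```

The weighted figure is the one to report for the combined sample. Within either group on its own the weights are all the same and cancel out, which is why separate or comparative analyses do not need them.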