Practical Random Sampling

January 28, 2019

[This is the third post in a series on sampling. Please see "How much is enough - sample size considerations" and "Setting your study up for success: Conducting a power analysis."]

It is no secret that a random sample stands apart from all other ways of sampling a population. The reason is that a random sample is far less likely to be biased. Formally, a random sample is one in which every individual in a group has an equal chance to be part of a study or intervention. If each individual truly has an equal chance to be in the study, then the sample you pull will be less likely to be swayed by a subgroup of that population.

There is a lot wrapped up in all of those statements. So let's start with how to draw a random sample.  It is not just a chaotic sampling process. In fact, it is very precise.

Drawing a random sample retrospectively

Let's say you want to take a random sample of 100 inpatients and ask them, after they were discharged, how the food at the hospital tasted. You would begin with a list of all of your inpatients from the period of time you care about (such as the last month). In Excel, SPSS or SAS, you would add a column to the list of patients and, using the software's random number generator, compute a random number in the column for each patient.

Finally, sort the file by the random number column, and choose the first 100 cases. Because every patient had an equal opportunity to be in the first 100 cases, the sample is random.
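If you would rather script this than use a spreadsheet, the same steps can be sketched in Python. This is only an illustration: the function name draw_random_sample, the made-up patient IDs, and the list size of 1,200 are assumptions, not part of any particular system.

```python
import random

def draw_random_sample(patient_ids, sample_size, seed=None):
    """Give each patient a random number, sort by it, keep the first sample_size."""
    rng = random.Random(seed)  # a seed makes the draw reproducible for auditing
    # One random number per patient, like the extra column in the spreadsheet.
    keyed = [(rng.random(), pid) for pid in patient_ids]
    keyed.sort()  # sort the "file" by the random number column
    return [pid for _, pid in keyed[:sample_size]]

# Hypothetical list of 1,200 inpatients discharged last month.
patients = [f"MRN{n:05d}" for n in range(1, 1201)]
sample = draw_random_sample(patients, 100, seed=2019)
print(len(sample), "patients sampled")
```

Python's built-in random.sample(patients, 100) gives an equivalent draw in one line; the longer version is shown only because it mirrors the sort-by-random-column steps.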

Drawing a random sample prospectively

Lots of times we want to sample patients who use a particular clinic. It can be very difficult to get a random sample in this way because of no-shows and other problems in the clinic workflow. But this method should help you be as random as possible:

Let's say you want 100 patients selected over a month from the emergency department, where you have no idea of who is coming in or when. Let's say that the ED gets 250 patients in an average week. To get to 100 patients in a month, you will want to sample 25 each week (25 per week for four weeks reaches 100). You need a system where each patient has a one-in-ten chance of being selected (25 out of 250):

(1) Compute the ratio of the patients you want to ask to participate (in this case, you will want about one out of every ten patients).

    • Total patients in a week / Total patients to recruit in a week = number of slips, or 250 / 25 = 10

(2) Add 10 slips of paper to a jar, nine with the number 0 on it and one with the number 1. 

(3) As each patient registers in the emergency department, draw out a slip of paper. If the slip has a 0 on it, the patient is not sampled. If the slip has the number 1 on it, invite the patient to participate in the study.

(4) If the patient does not agree to participate, he or she becomes part of your sample as a 'nonrespondent' and is not treated as a patient who drew a slip with a zero on it. Nonrespondents are part of your sample of 100.

(5) Whichever slip you drew, put it back in the jar, and repeat the process for each patient. Be sure to shake the jar between patients.

Using this method, you should end up with approximately 100 sampled patients over the course of a month.  Each patient had an equal chance to participate in the study, regardless of the time of day, the day of the week, or the week of the month.

You can also use a computer program to generate this for you, as is often done for marketing campaigns.
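As a rough sketch of what such a program might look like, the Python below mimics the jar: it works out the number of slips and then makes one draw per arriving patient. The function names and the 1,000 simulated arrivals (four weeks of 250) are assumptions made for illustration.

```python
import random

def slips_needed(patients_per_week, recruits_per_week):
    """Step 1: total patients in a week / patients to recruit in a week."""
    return round(patients_per_week / recruits_per_week)

def draw_slip(num_slips, rng):
    """Draw one slip with replacement; True means the single '1' slip came up."""
    return rng.randrange(num_slips) == 0

def simulate_month(arrivals, num_slips, seed=None):
    """Make one independent draw for each arriving patient, in arrival order."""
    rng = random.Random(seed)
    return [pid for pid in arrivals if draw_slip(num_slips, rng)]

num_slips = slips_needed(250, 25)      # 250 / 25 = 10 slips
arrivals = list(range(1, 1001))        # hypothetical: 4 weeks x 250 patients
selected = simulate_month(arrivals, num_slips, seed=7)
print(num_slips, "slips;", len(selected), "patients selected")
```

Because every draw is independent, the number selected will land near 100 rather than exactly on it, just as with the paper slips.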

Check the representativeness of your sample

Even a random sample can be biased. It is just less LIKELY to be biased than a non-random sample. As a result, it is standard practice to check to see if your sample is representative.  If it isn't, you may be able to manage it, or you might not.  But you need to know.

In every sample you take, there will be important biases you are trying to avoid. In healthcare, we often want the sample to be unbiased by age, race, severity of illness and insurance type.

In the case of the emergency department sample, simply compare the frequencies of these key characteristics for the folks chosen to be in your sample (including nonrespondents) with those of the whole population you were sampling from (all the patients who went to the emergency department during the month you were studying). Are they statistically significantly different? If not, your sample should be representative.
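One common way to make that comparison is a chi-square goodness-of-fit test, sketched below in plain Python. The insurance counts are invented for illustration, the function name is hypothetical, and the 0.05 critical values are the standard table values for 1 to 5 degrees of freedom. Your statistics package will report an exact p-value instead.

```python
# Standard chi-square critical values at the 0.05 level, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def chi_square_gof(sample_counts, population_counts):
    """Compare sample frequencies with those expected from population proportions."""
    n = sum(sample_counts.values())
    total = sum(population_counts.values())
    stat = 0.0
    for category, pop_count in population_counts.items():
        expected = n * pop_count / total          # count expected in the sample
        observed = sample_counts.get(category, 0)
        stat += (observed - expected) ** 2 / expected
    df = len(population_counts) - 1
    return stat, df

# Made-up insurance mix: all ED patients that month vs. the 100 sampled patients.
population = {"private": 400, "medicare": 300, "medicaid": 200, "uninsured": 100}
sampled = {"private": 42, "medicare": 28, "medicaid": 19, "uninsured": 11}

stat, df = chi_square_gof(sampled, population)
differs = stat > CRITICAL_05[df]
print(f"chi-square = {stat:.3f}, df = {df}, significantly different: {differs}")
```

In this invented example the statistic falls well below the critical value, so the sample's insurance mix would not be flagged as different from the population's. You would repeat the check for each characteristic you care about, such as age, race and severity of illness.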

Checking nonresponse bias

Remember that you have three groups of patients: those who were not sampled (for whom you pulled a zero from the jar), patients who were sampled and agreed to participate, and patients who were sampled but chose not to participate.

It is also standard practice to check that those who chose not to participate in a study are not systematically different from those who participated. For instance, if nonrespondents were sicker or less verbal than respondents, your respondents would not be representative of the whole sample.

You will want to run exactly the same comparison between respondents and nonrespondents as you did between your sample and the whole population. Again, if the frequencies are not different to a statistically significant degree, you should be OK.
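Because respondents and nonrespondents are two separate groups, one reasonable way to compare them is a chi-square test on a two-row table of counts. The sketch below uses invented age-group counts and a hypothetical function name, and it hard-codes the standard 0.05 critical value for 2 degrees of freedom.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if group and characteristic were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

# Made-up counts by age group: under 40, 40 to 64, 65 and over.
respondents = [30, 28, 12]
nonrespondents = [10, 12, 8]

stat, df = chi_square_independence([respondents, nonrespondents])
CRITICAL_05_DF2 = 5.991  # standard 0.05 critical value for 2 degrees of freedom
print(f"chi-square = {stat:.3f}, df = {df}, "
      f"significantly different: {stat > CRITICAL_05_DF2}")
```

With these invented counts the statistic is below the critical value, so age would not point to nonresponse bias. With small expected counts in any cell, an exact test from your statistics package is a safer choice.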

In the next post, we will talk over what to do if your samples are NOT representative of the population under study, even when you use random sampling methods.
