Nonresponse bias happens when those unwilling or unable to take part in a research study are different from those who do.
In other words, this bias occurs when respondents and nonrespondents differ systematically in ways that affect the research. As a result, the sample is no longer representative of the population as a whole.
Example: Nonresponse bias
Suppose you are researching workload among managers in a supermarket chain. You decide to collect your data via a survey. Due to constraints on their time, managers with the largest workload are less likely to answer your survey questions.
This may lead to a biased sample, as those most likely to answer are the managers with less busy schedules. Consequently, your results are likely to show that manager workload in the supermarket chain is not very high—something that may not, in fact, be true.
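The effect described in this example can be illustrated with a small simulation. This is only a sketch under stated assumptions: the distribution of weekly hours, the response model, and the function name `simulate_workload_survey` are all hypothetical and chosen purely for illustration.

```python
import random


def simulate_workload_survey(n_managers=10_000, seed=0):
    """Simulate a survey in which busier managers are less likely to respond."""
    rng = random.Random(seed)

    # Hypothetical weekly working hours for every manager in the chain.
    hours = [rng.gauss(45, 8) for _ in range(n_managers)]

    respondents = []
    for h in hours:
        # Assumed response model: the probability of answering falls as
        # hours rise, clamped to the range [0.05, 0.95].
        p_respond = min(0.95, max(0.05, 0.9 - 0.02 * (h - 30)))
        if rng.random() < p_respond:
            respondents.append(h)

    true_mean = sum(hours) / len(hours)
    observed_mean = sum(respondents) / len(respondents)
    return true_mean, observed_mean


true_mean, observed_mean = simulate_workload_survey()
print(f"All managers:     {true_mean:.1f} hours/week")
print(f"Respondents only: {observed_mean:.1f} hours/week")
```

Because the chance of responding depends on workload itself, the average among respondents comes out lower than the average across all managers, even though every manager was invited.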
Nonresponse bias can occur when individuals who refuse to take part in a study, or who drop out before the study is completed, are systematically different from those who participate fully. Nonresponse prevents the researcher from collecting data for all units in the sample. It is a common source of error, particularly in survey research.
Causes of nonresponse include:
Usually, a distinction is made between two types of nonresponse:
It is important to keep in mind that nonresponse bias is always associated with a specific variable (like manager workload in the previous example). Respondents and nonrespondents differ with respect to that variable (workload) specifically.
Because managers’ decision to participate in the survey is related to their workload, nonresponse is not random. Respondents and nonrespondents therefore differ in a way that matters for the research.
Nonresponse bias consists of two components:
The extent of bias depends on both the nonresponse rate and the extent to which nonrespondents differ from respondents on the variable(s) of interest. This means that a high level of nonresponse alone does not necessarily lead to research bias, as nonresponse can also be due to random error.
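For a simple mean, this relationship can be written out directly: the bias of the respondent mean equals the nonresponse rate multiplied by the difference between the respondent and nonrespondent means. The sketch below assumes the nonrespondent mean is known, which is rarely true in practice; the function name and the numbers are hypothetical.

```python
def nonresponse_bias(nonresponse_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent mean relative to the full-sample mean.

    full_mean = (1 - rate) * mean_respondents + rate * mean_nonrespondents
    bias      = mean_respondents - full_mean
              = rate * (mean_respondents - mean_nonrespondents)
    """
    if not 0 <= nonresponse_rate <= 1:
        raise ValueError("nonresponse_rate must be between 0 and 1")
    return nonresponse_rate * (mean_respondents - mean_nonrespondents)


# High nonresponse, but both groups have the same mean: no bias.
print(nonresponse_bias(0.6, 40.0, 40.0))

# Lower nonresponse, but the groups differ: the respondent mean is too low.
print(nonresponse_bias(0.3, 42.0, 50.0))
```

In the second call, the bias is 0.3 × (42 − 50) = −2.4, so the respondent mean understates the full-sample mean by 2.4 units. In the first call, a 60% nonresponse rate produces no bias at all because the two groups do not differ.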
Example: When does nonresponse lead to bias?
Suppose you are running a survey on information literacy. You notice that some people in your sample miss the email that contains the link to the survey. As a result, they never get the chance to answer your questions.
Does this mean that nonresponse bias is present in your research?
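It may, but only if nonrespondents share a characteristic that sets them apart from respondents, and that characteristic is relevant to your research.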
If nonrespondents missed your email due to poor computer skills, then this makes them a distinct group in terms of a unifying characteristic (i.e., poor computer skills). This skill is relevant to your research (information literacy).
If nonrespondents missed your email simply because it ended up in their spam folder, then this is due to random error. In this instance, nonrespondents don’t share any characteristics that set them apart from respondents.
The response rate, or the percentage of sampled units who filled in a survey, can indicate the amount of nonresponse present in your data. For example, a survey with a 70% response rate has a 30% nonresponse rate.
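As a minimal sketch, the calculation looks like this. It uses the simple definition given above (completed surveys divided by sampled units); the function name `response_rates` is hypothetical.

```python
def response_rates(n_sampled, n_responded):
    """Return (response rate, nonresponse rate) as percentages."""
    if n_sampled <= 0:
        raise ValueError("n_sampled must be positive")
    if not 0 <= n_responded <= n_sampled:
        raise ValueError("n_responded must be between 0 and n_sampled")
    response = 100 * n_responded / n_sampled
    return response, 100 - response


# 700 completed surveys out of 1,000 sampled units.
response, nonresponse = response_rates(1000, 700)
print(f"Response rate: {response:.0f}%, nonresponse rate: {nonresponse:.0f}%")
```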
The response rate is often used to estimate the magnitude of nonresponse bias. The assumption is that the higher the response rate, the lower the nonresponse bias.
However, keep in mind that a low response rate (or high nonresponse rate) only indicates the potential for nonresponse bias. Nonresponse bias may be low even when the response rate is low, provided that the nonresponse is random. In that case, the differences between respondents and nonrespondents on the variable of interest are minor.
Tip
As a rule of thumb, the lower the response rate, the greater the risk of nonresponse bias. Nonresponse bias is more likely to become an issue when the response rate falls below 70%.
Nonresponse bias can lead to several issues:
Note
Keep in mind that nonresponse bias is not the opposite of response bias. Response bias refers to a number of factors that may lead survey respondents to answer untruthfully.
Nonresponse bias is a common source of bias in research, especially in studies related to health.
Example: Nonresponse bias in health surveys
In a case-control study assessing the link between smoking and heart disease, the selected sample is invited to participate by filling in a survey sent via mail.
Unfortunately, nonresponse is higher among people with heart disease, leading to an underestimation of the association between smoking and heart disease. This is a common problem in health surveys.
Studies generally show that respondents report better health outcomes and more positive health-related behaviors than nonrespondents. They often report lower alcohol consumption, less risky sexual behavior, more physical activity, etc.
This suggests that people with poorer health tend to avoid participating in health surveys. As a result, nonresponse bias can affect the results.
It’s possible to minimize nonresponse by designing the survey in a way that obtains the highest possible response rate. There are several steps you can take that will help you in that direction:
To minimize nonresponse bias during data collection, first try to identify individuals in the sample that are less likely to participate in your survey. These could be individuals who are hard to reach or hard to motivate.
It’s a good idea to prepare strategies that may incentivize their cooperation. Some ideas could include:
During data analysis, the goal is to identify the magnitude of nonresponse bias. Luckily, the nonresponse rate is easy to estimate. However, identifying whether the difference between respondents and nonrespondents is due to a particular characteristic is not so easy.
There are a number of ways you can approach this problem, including: