Surveys and online experiments are powerful research tools for collecting data that can test your hypotheses (or generate new ones).
Unfortunately, poor-quality data can destroy your research. This article focuses on how to design your study to get quality results.
Why is high-quality data important?
Our resources are finite, and often precious.
If you are devoting your precious time and money to conducting a study, you want to maximize that investment. You can do this by preventing low-quality data and low-quality participants from affecting your results.
Getting high-quality data is important in providing useful, reliable results that allow for a deeper understanding of your research topic.
The bottom line is that the quality of your data directly impacts the quality of your study and that in turn affects what you can learn from it.
Where does poor quality data come from?
Low-quality data in your studies can come from a variety of places. It’s important to understand what can cause these unwanted responses so you can try to prevent their occurrence.
From the study design
Everything from the general structure down to the way questions are worded can influence the people taking the survey. This affects the overall credibility of the data.
Poorly designed questions
When questions aren’t clear or lead the respondent to answer a certain way, it can skew all of your results. Answers that are not authentic and participants that are influenced or compromised can make your data ineffective.
For example, “how much do you think prices will fall?” already assumes that prices will decrease. This question only works if it is preceded by a “yes” answer to “Do you think prices will drop?”
Quick tip 1: Ask good questions
Comprehension and language
If language is challenging for respondents to understand, or a study contains too much technical jargon, there’s a problem. Respondents who don’t understand what the questions are asking may just choose responses at random.
Responses selected at random, without careful consideration, can undermine your results because they are not representative of the participant population.
For example, asking “what was the state of cleanliness in the room?” could be confusing to some people taking the survey. Instead, you should ask “how clean was the room?” to avoid any confusion.
See Tip 14: Use comprehension quality checks
Misaligned incentives
If you offer incentives for participation, you may inadvertently incentivise lying. This is exacerbated when participants can get a higher reward for answering a question in a specific way.
Say you are offering $2,000 to participants who are CEOs of Fortune 500 companies. Some people will lie to receive the incentive if there’s no way to verify their response. This can have an extreme impact on your data.
Quick tip 2: Pick the right incentives
See Tip 10: Use pre-screened participants
See Tip 15: Use knowledge checks
From the participants
There are many reasons why participants may provide low quality data. Often you can predict this by putting yourself in their shoes.
Empathy is key in study design.
Disinterest
If your study is uninteresting to participants it increases their likelihood of offering poor quality responses.
Quick tip 3: Try to make your study interesting
Fatigue
Participants can become fatigued if a survey is particularly long or hard to parse. In these cases they may offer up answers just to simply finish the survey.
These answers will not be useful to you.
Quick tip 4: Keep your studies as short as possible
Quick tip 5: Randomise question order wherever possible to distribute the fatigue evenly across your questions
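As a rough illustration of Quick tip 5, here is a minimal Python sketch of per-participant question shuffling. The function name `randomised_order`, the question labels and the participant IDs are all hypothetical. Many survey tools offer built-in randomisation, which is usually the better choice where it exists.

```python
import random

def randomised_order(questions, participant_id):
    """Return a shuffled copy of the questions for one participant."""
    # Seeding with the participant ID makes the order reproducible,
    # so you can later reconstruct the order each person saw.
    rng = random.Random(participant_id)
    order = list(questions)  # copy, so the master list is left untouched
    rng.shuffle(order)
    return order

questions = ["Q1", "Q2", "Q3", "Q4", "Q5"]  # hypothetical question labels
for pid in ["participant-001", "participant-002"]:
    print(pid, randomised_order(questions, pid))
```

Because each participant gets a different order, any drop in answer quality towards the end of the survey is spread across all questions rather than concentrated on the last few.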
Low focus
Participants can sometimes be multi-tasking, unmotivated, unfocused or distracted. This can mean that they skip over instructions and miss important details. It can result in low-comprehension responses that are not valuable to you.
See Tip 14: Use comprehension check quality questions
Speeding
There are participants out there that simply try to complete the survey as fast as they can. This can be for many reasons – ranging from trying to complete lots of surveys to completing something they’re obliged to do. If this happens, you won’t be getting considered and thoughtful answers.
Quick tip 6: Force participants to slow down where possible, for example by using the *wait function in GuidedTrack programs.
See Tip 18: Exclude participants who are too fast
From fraudulent accounts
Bots
Bots are software designed to interact with surveys and provide responses by clicking through with no human interaction. There is also a risk of pseudo-bots, which are humans assisted by browser plugins that autofill data inaccurately.
Bots and pseudo-bots provide no value to you as the responses are completely random. These are more likely to infiltrate surveys that are all multiple choice. Bots are not equipped to complete free-text responses as cohesively as humans.
Quick tip 7: Use bot-preventing platforms (like Positly)
Quick tip 8: Check for within-scale consistency
See Tip 21: Include free text responses
Inauthentic participants
Participants may lie or submit false screening information to increase the number of surveys they are eligible for. For example, they could be seeking to qualify for lots of surveys to earn more incentives.
This means their responses could muddy your data and damage your research.
See Tip 22: Perform inconsistent logic checks
See Tip 10: Use pre-screened participants
General tips to combat low-quality data
We’ve gone through some specific causes of low-quality data along with 8 quick tips for how to solve it. We’ll now go through some more detailed tips that you can follow to ensure you have high quality data.
Tip 9: Design surveys carefully
Getting access to and collecting high-quality data starts with a well-written and well-designed questionnaire that focuses on the survey objectives.
Survey questions can influence responses, so it’s critical to craft thoughtful survey questions to generate reliable responses. Stay away from pushing participants towards a certain opinion, or response. Each question needs to be phrased in a way that allows respondents to answer in their authentic way.
For example:
- Use neutral words so that you don’t push participants towards a certain opinion.
- Avoid loaded questions that lead people to answer insincerely.
If you have questions about how to ask good survey questions, read our previous article about asking good survey questions.
Tip 10: Use pre-screened participants
Starting with pre-screened participants will help you find those who already match certain demographic requirements or quality standards.
Using a tool like Positly can help by delivering pre-screened participants to you. If there’s a particular screening criterion that you need, get in touch. If you plan to pre-screen participants yourself, we recommend that you read our tips for screening success.
Tip 11: Use social norming
Social norms affect all of us. This can be great for research. We encourage you to emphasise the importance of the study and why it’s important to be a good participant.
It can be helpful to include a question where participants confirm and agree that their responses are earnest.
On the Positly platform we include this in our demographic pre-survey, before participants are referred on to the main study.
Tip 12: Implement quality control checks
Performing quality checks throughout your survey can be an effective way to reject data from undesirable participants.
Quality checks are important to ensure that you are getting high-quality participants that offer high-quality data. However, implementing too many of these tactics can lead to biased data or may frustrate your participants.
Positly does pre- and post-survey checks automatically to keep track of the quality of our participant pool. This helps to prevent poor-quality participants. You may also want to include additional checks in your longer studies.
However, be careful not to inadvertently exclude good participants by using checks that are too difficult. Another failure mode is using checks where people have a strong heuristic to answer in a particular way.
The next few tips are some of our favourite types of quality checks that correlate with overall quality, without unnecessary bias.
Tip 13: Use ‘bogus’ quality questions
A bogus, or fictitious, question is one that asks about something that does not exist. Its inclusion can help you understand whether participants are attempting to answer questions that they are not qualified to answer. If answered incorrectly, it can raise red flags about the quality of this participant’s data.
Bogus questions are best used in surveys that measure recognition of, or previous experience with, people, places, or things.
For example, you could be conducting a pre-election poll where it is important that participants know the names of the candidates. By providing a bogus option you can identify whether they are knowledgeable on the topic. This helps you to decide whether to include their data.
When using bogus questions, be very careful to validate that failing them only correlates with low-quality responses.
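To make the pre-election example concrete, here is a small Python sketch that flags participants who claim to recognise a candidate who does not exist. The candidate names, participant IDs and the helper `picked_bogus_option` are hypothetical, made up purely for illustration.

```python
# Hypothetical bogus option: "Candidate Z" is not a real candidate in this poll.
BOGUS_CANDIDATES = {"Candidate Z"}

def picked_bogus_option(recognised):
    """True if the participant claims to recognise a non-existent candidate."""
    return bool(set(recognised) & BOGUS_CANDIDATES)

# Each participant's answer to "Which of these candidates have you heard of?"
responses = {
    "p1": ["Candidate A", "Candidate B"],
    "p2": ["Candidate A", "Candidate Z"],
    "p3": [],
}

flagged = sorted(pid for pid, names in responses.items() if picked_bogus_option(names))
print(flagged)  # ['p2']
```

Treat the flag as one signal to weigh alongside your other quality checks rather than as an automatic exclusion.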
Tip 14: Use comprehension check quality questions
To receive high-quality data, you need to ensure that you are receiving high-quality participants, starting with proper screening questions. It’s important that you ask high-level questions that qualify respondents for your survey while excluding those who fail the screening criteria.
One trick to screening participants within your study is to implement a multiple choice pop quiz after the study instructions. If the participant fails, they are redirected back to the instructions to ensure that they are reading and understanding what is asked of them.
For example, if you need to make sure that people are giving answers by comparing themselves to people who live in their neighborhood, you’d first explain that in detail, then give a pop quiz with a multiple choice question asking “who are you comparing yourself to for this study?” If they don’t get it correct, they will go back to re-read the instructions until they pass that quiz.
Tip 15: Use knowledge checks
If your study requires a particular type of participant and you want to check that they have correctly identified themselves, it can help to use knowledge checks.
Often you can achieve anonymity and confirm the right participants by asking questions that only the right participants would know how to answer correctly.
For example, if a participant identifies that they are an Auto Part Technician, but they cannot identify the functions of basic car parts, they are proving that they do not have the fundamental knowledge to complete the survey questions. These participants should be excluded from the survey.
See this example for people with Donor Advised Funds.
Tip 16: Use attention checks
These are questions scattered throughout the survey, asking participants to answer a question that will confirm or deny their comprehension and attentiveness to your study.
Stay away from high-heuristic questions where participants can use mental cues to speed up their decision-making process. This introduces errors in their judgment and can lead to increased prejudices. As people use mental shortcuts to classify and categorise people and information, they can overlook more relevant information and offer answers that are not in tune with reality.
Tip 17: Use consistency checks
These validation checks require participants to answer two related questions at different points in the survey to confirm that they are who they say they are and weed out any respondents who are not answering truthfully.
For example, ask participants to type their age and, later in the survey, ask them to select their age bracket in a multiple choice question. The two answers should align, and any contradictions should raise red flags when it comes to the quality of your data.
Be careful to only exclude those with more egregious errors because small variance is normal, especially with people on the extreme ends of ranges.
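The age example can be sketched in Python as follows. The bracket boundaries, the one-year tolerance and the function name `age_mismatch` are assumptions chosen for illustration. The tolerance is there so that small, innocent differences near a bracket edge are not treated as egregious errors.

```python
# Hypothetical age brackets offered in the multiple choice question.
BRACKETS = {
    "18-24": (18, 24),
    "25-34": (25, 34),
    "35-44": (35, 44),
    "45-54": (45, 54),
    "55+": (55, 120),
}

def age_mismatch(typed_age, bracket, tolerance=1):
    """True if the typed age falls clearly outside the selected bracket."""
    low, high = BRACKETS[bracket]
    return typed_age < low - tolerance or typed_age > high + tolerance

print(age_mismatch(30, "25-34"))  # False: consistent answers
print(age_mismatch(24, "25-34"))  # False: off by one year, within tolerance
print(age_mismatch(19, "45-54"))  # True: clear contradiction
```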
Tip 18: Exclude speeding participants
Some participants will speed through your study and not give thoughtful, in-depth responses to your survey questions. To identify if a participant is speeding, take a look at the median amount of time it takes to complete your survey and plot where respondents fall in respect to that median.
It is common to eliminate people who complete the survey in less than ⅓ of the median survey time. For example, if the survey had a median completion time of one hour, you can assume that anyone who finished the survey in under 20 minutes was speeding and you can remove them from the data set.
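The one-third-of-median rule is simple to apply in code. Below is a minimal Python sketch; the function name `flag_speeders`, the participant IDs and the completion times are hypothetical, and the divisor of 3 is the rule of thumb described above rather than a fixed standard.

```python
from statistics import median

def flag_speeders(durations, divisor=3):
    """Return IDs of participants who finished in less than median / divisor."""
    cutoff = median(durations.values()) / divisor
    return sorted(pid for pid, minutes in durations.items() if minutes < cutoff)

# Hypothetical completion times in minutes; the median here is 60.
durations = {"p1": 62, "p2": 58, "p3": 15, "p4": 60, "p5": 75}

speeders = flag_speeders(durations)
print(speeders)  # ['p3'], because 15 minutes is below the 20 minute cutoff
```

The median is used rather than the mean so that a few participants who left the survey open for hours do not inflate the cutoff.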
Tip 19: Exclude straightlining participants
You will notice straightlining when you see participants answering similarly in several questions or using the same pattern on scaled questions. It’s a red flag to see a straight line in a series of multiple choice questions, but it doesn’t necessarily mean that those participants should be excluded.
For example, some participants may actually feel very strongly about a topic and emphatically agree or disagree with your questions. Conversely, the participants could be responding in that way due to fatigue or boredom.
You can reduce this by reverse coding (see Tip 20), or by avoiding grid-style questions altogether and keeping options to a minimum.
Calculating the standard deviation of responses is a quick way of identifying straightlining.
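Here is a minimal Python sketch of that calculation, assuming each participant’s answers to one grid of scaled questions are coded as numbers from 1 to 5. The function name `straightlining_candidates`, the threshold and the sample answers are hypothetical.

```python
from statistics import pstdev

def straightlining_candidates(grid_answers, threshold=0.0):
    """Return IDs whose answers within one grid vary by no more than the threshold."""
    return sorted(
        pid for pid, answers in grid_answers.items()
        if pstdev(answers) <= threshold
    )

# Hypothetical answers to a six-item grid, coded 1 (strongly disagree) to 5 (strongly agree).
grid_answers = {
    "p1": [4, 4, 4, 4, 4, 4],  # identical answers: standard deviation is 0
    "p2": [5, 2, 4, 1, 3, 4],
    "p3": [3, 3, 3, 3, 3, 3],
}

print(straightlining_candidates(grid_answers))  # ['p1', 'p3']
```

A standard deviation of zero only tells you the answers were identical. As noted above, that is a reason to look more closely at a participant, not proof that they should be excluded.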
Tip 20: Reverse code questions
One way to avoid straightlining responses from participants is to use reverse coding in your grid, or scaled, questions. To do this, you can ask the same question, but frame it negatively.
Here is an example of two questions, framed in different ways to account for inattentive, bored, or straightlining participants.
- When traveling short distances, I generally prefer to walk instead of using an automobile.
  - Strongly agree
  - Agree
  - Neutral
  - Disagree
  - Strongly disagree
- If given a choice between walking or using an automobile to travel a short distance, I would prefer to use an automobile.
  - Strongly agree
  - Agree
  - Neutral
  - Disagree
  - Strongly disagree
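To compare the two questions above, the negatively framed item has to be flipped back onto the same scale before analysis. The Python sketch below shows one way to do that. The numeric coding (5 for strongly agree down to 1 for strongly disagree), the one-point tolerance and the function names are assumptions for illustration.

```python
# Assumed numeric coding for the five answer options.
SCALE = {
    "Strongly agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}

def reverse_code(score, scale_min=1, scale_max=5):
    """Flip a score so that 5 becomes 1, 4 becomes 2, and so on."""
    return scale_max + scale_min - score

def is_inconsistent(walk_answer, automobile_answer, tolerance=1):
    """True if the walking item and the reverse-coded automobile item disagree."""
    walk = SCALE[walk_answer]
    automobile_reversed = reverse_code(SCALE[automobile_answer])
    return abs(walk - automobile_reversed) > tolerance

print(is_inconsistent("Agree", "Disagree"))                # False: both favour walking
print(is_inconsistent("Strongly agree", "Strongly agree"))  # True: contradictory answers
```

Note that a participant who answers "Neutral" to everything still passes this check, so it works best alongside the other quality checks in this article.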
Tip 21: Include free text responses
Introduce free text response questions to your survey to determine if respondents are carefully thinking about their written responses, or simply typing nonsensical words to complete the question. An attempt to answer the question coherently shows that the participant is attentive and their answers can be trusted. These can then be compared with the other quality checks in your study.
Tip 22: Perform inconsistent logic checks
In addition to quality checks, you will want to perform logic checks as well to spot any inconsistencies. For example, if a respondent says that they are 18 years old and hold a Ph.D., it is a strong signal that they may be providing low-quality, inconsistent responses.
Another type of inconsistent logic check is to ask mutually exclusive questions (ones that cannot both be true) and check for their logical consistency.
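Both kinds of logic check can be expressed as simple rules over a participant’s answers. The Python sketch below is illustrative only: the field names, the minimum plausible ages and the car ownership questions are hypothetical and should be replaced with rules that suit your own survey.

```python
# Hypothetical minimum plausible ages per education level; tune for your population.
MIN_PLAUSIBLE_AGE = {"High school": 16, "Bachelor's degree": 20, "Ph.D.": 24}

def logic_flags(response):
    """Return the names of the logic checks that this response fails."""
    flags = []
    min_age = MIN_PLAUSIBLE_AGE.get(response["education"])
    if min_age is not None and response["age"] < min_age:
        flags.append("age_vs_education")
    # Mutually exclusive statements: both cannot be true at once.
    if response["never_owned_car"] and response["currently_owns_car"]:
        flags.append("mutually_exclusive")
    return flags

response = {
    "age": 18,
    "education": "Ph.D.",
    "never_owned_car": True,
    "currently_owns_car": True,
}
print(logic_flags(response))  # ['age_vs_education', 'mutually_exclusive']
```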
The ultimate goal of high-quality data
We’ve covered why high-quality data is important: it is absolutely essential to your ability to draw effective conclusions from your results. Without high-quality data, your study findings will not be reliable or credible.
Clean, reliable data allows you to draw robust and valid conclusions that you can act on. It also gives you the confidence to present your findings and stand behind your research.
Positly is here to help provide you with high-quality pre-screened participants. We also love helping our researchers design effective research.
Feel free to read more on our research methods blog or get in touch to chat about your research design or study requirements.
Our top priority is providing you with the resources you need to produce effective high-impact research!