Randomization – The Survey Geek

This issue came up in DP the other day when a quex came through with a question design that was fairly difficult to program. It started with a question that had a long list of potential responses (8 in all). Rather than asking it as a ranking question (good design choice!), the first question asked the R to make his first choice. The next question then showed him the same set with his first choice removed and asked him to pick his second choice. The tricky part was that the design specified a randomized list in the first question but then asked that the same order be maintained for the second. The programming to accomplish this is not straightforward; the software would rather randomize the set of remaining responses again. So the question to me was whether there were methodological reasons to put in the effort to maintain the same order from the first to the second question. My answer was, "No." Here’s why.

We randomize answer sets when we are concerned that Rs may not read all of the answers to a question before they make their selection. The case at hand was a Web survey, and the methods literature tells us that in a visual mode like Web or mail Rs are somewhat more likely to choose from the top of a list of answers than from further down the list, especially if the list is long or has no apparent order to it. This is called primacy. In an aural mode, like telephone, the tendency is to choose from among the last answers heard. This is called recency. We randomize the answer order to correct for this because it ensures that across all Rs every answer has an opportunity to appear at the top, in the middle, and at the bottom of the list. Put another way, every answer has an equal chance of landing in any given position, so across the entire survey no answer is systematically helped or hurt by where it sits in the list.
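For the programmers in the audience, here is a minimal Python sketch of that idea. It is not how any particular interviewing package does it; the answer labels, the function names, and the 8,000 simulated Rs are all made up for illustration. It gives each R an independent shuffle of the eight answers and then counts how often each answer lands in each position.

```python
import random
from collections import Counter

# Hypothetical 8-item answer set; the labels are placeholders.
ANSWERS = ["A", "B", "C", "D", "E", "F", "G", "H"]

def randomized_order(answers, rng):
    """Return a fresh random ordering of the answers for one respondent."""
    return rng.sample(answers, k=len(answers))

def position_counts(answers, n_respondents, seed=0):
    """Count how often each answer lands in each list position across respondents."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    counts = {a: Counter() for a in answers}
    for _ in range(n_respondents):
        for pos, answer in enumerate(randomized_order(answers, rng)):
            counts[answer][pos] += 1
    return counts

counts = position_counts(ANSWERS, n_respondents=8000)
for answer in ANSWERS:
    # One row per answer: how many Rs saw it in position 0, 1, ..., 7.
    print(answer, [counts[answer][pos] for pos in range(len(ANSWERS))])
```

With enough Rs, every row should come out roughly flat, which is just the "top, middle, and bottom" point in numbers: any primacy effect gets spread evenly over all eight answers instead of piling onto whichever ones happen to be listed first.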

Now to apply this principle to the case at hand. Let’s assume the R gets to number five and by then he knows his first choice is number three. He stops there and makes his selection. On the next screen, when he sees the same order, he may read the remaining three answers he didn’t read last time. Or he may already know that number five is his second choice because he debated between it and number three on the last question, in which case he never bothers to consider the last three answers in the list. The better design is to randomize the answer set in the second question as well, because a fresh order moves some of those unread answers up the list and so maximizes the likelihood that all answers are read. And luckily in this instance, it’s also the easiest thing for the interviewing software.
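To make the two designs concrete, here is a rough Python sketch of how the second question's list could be built either way. Again, the function names and answer labels are hypothetical and this is not the logic of any real interviewing software.

```python
import random

# Hypothetical 8-item answer set; the labels are placeholders.
ANSWERS = ["A", "B", "C", "D", "E", "F", "G", "H"]

def second_question_same_order(first_order, first_choice):
    """Design as specified in the quex: keep the first order, minus the first choice."""
    return [a for a in first_order if a != first_choice]

def second_question_rerandomized(first_order, first_choice, rng):
    """Design recommended here: drop the first choice and shuffle what is left."""
    remaining = [a for a in first_order if a != first_choice]
    rng.shuffle(remaining)
    return remaining

rng = random.Random()                       # unseeded, so each run differs
first_order = rng.sample(ANSWERS, k=len(ANSWERS))
first_choice = first_order[2]               # the R picks "number three"

print("Q1 order:        ", first_order)
print("Q2 same order:   ", second_question_same_order(first_order, first_choice))
print("Q2 re-randomized:", second_question_rerandomized(first_order, first_choice, rng))
```

In the same-order version, the answers that sat in positions six through eight on the first screen are still at the bottom on the second. In the re-randomized version they can land anywhere, which is the whole argument for doing it that way.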

All of the above is grounded in the sad reality that Rs don’t consistently give their full cognitive energy to every question in every survey. Sad, I know. While we can’t do much about that on an individual R basis, we can at least use techniques like randomization to minimize its impact across all Rs.

We sometimes randomize questions as well, but that’s another issue I’ll save for another time.