A few weeks back I saw a post by
online usability specialist Jakob Nielsen titled, “User Satisfaction vs.
Performance Metrics.” His finding is
pretty simple: Users generally prefer designs that are fast and easy to use,
but satisfaction isn't 100% correlated with objective usability metrics. Nielsen looked at results from about 300
usability tests in which he asked participants how satisfied they were with a design
and compared that to some standard usability metrics measuring how well they
performed a basic set of tasks using that design. The correlation was around 0.5. Not bad, but not great. Digging deeper, he found that in about 30% of
the studies participants either liked the design but performed poorly or
disliked the design but performed well.
I immediately thought of the studies we’ve all seen
promoting the use of Flash objects and other gadgets in surveys by pointing to
the high marks they get on satisfaction and enjoyment as evidence that these
devices generate better data. The premise here is that these measures are
proxies for engagement and that engaged respondents give us better data. Well, maybe and maybe not. Nielsen has offered us one data point. There is another in the experiment we
reported on here
where we found that while the version of the survey with Flash objects scored
higher on enjoyment, respondents in that treatment showed just as much evidence
of disengagement as those tortured with classic HTML: they failed the classic
trap questions at the same rate.
A cynic might say that at least some of the validation studies
we see are more about marketing than survey science. A more generous view might be that we are
still finding our way when it comes to evaluating new methods. Many of the early online evangelists argued
that we could not trust telephone surveys anymore because of problems with coverage
(wireless substitution) and depressingly low response rates. To prove that online was better they often
conducted tests showing that online results were as good as what we were
getting from telephone. A few
researchers figured out that to be convincing you needed a different point of
comparison. Election results were a good benchmark
for electoral polling, and others compared their online results to data
collected by non-survey means, such as censuses or administrative records. But most didn't. Studies promoting mobile often argue for their
validity by showing that their results match up well with online. There seems to be a spiral here, and not in a
good direction.
The bottom line is that we need to think a lot harder about
how to validate new data collection methods.
We need to measure the right things.
Comments
One response to “Measuring the right stuff”
Data quality always has to come first whether it’s a new or old method. And it can’t just be talking about better data quality. It has to be actioning better data quality.