TY - JOUR
T1 - What you get is what you see: Revisiting the evaluator effect in usability tests
AU - Hertzum, Morten
AU - Molich, Rolf
AU - Jacobsen, Niels Ebbe
PY - 2014
AB - Usability evaluation is essential to user-centred design; yet, evaluators who analyse the same usability test sessions have been found to identify substantially different sets of usability problems. We revisit this evaluator effect by having 19 experienced usability professionals analyse video-recorded test sessions with five users. Nine participants analysed moderated sessions; ten participants analysed unmoderated sessions. For the moderated sessions, participants individually reported an average of 33% of the problems collectively reported by these nine participants and 50% of the subset of problems reported as critical or serious by at least one participant. For the unmoderated sessions, the corresponding percentages were 32% and 40%. Thus, the evaluator effect was similar for moderated and unmoderated sessions, and it was substantial for the full set of problems and still present for the most severe problems. In addition, participants disagreed in their severity ratings. As much as 24% (moderated) and 30% (unmoderated) of the problems reported by multiple participants were rated as critical by one participant and minor by another. The majority of the participants perceived an evaluator effect when merging their individual findings into group evaluations. We discuss reasons for the evaluator effect and recommend ways of managing it.
DO - 10.1080/0144929X.2013.783114
M3 - Journal article
SN - 0144-929X
VL - 33
SP - 143
EP - 161
JF - Behaviour & Information Technology
IS - 2
ER -