Q&A: how do you prep for a usability study?

When I mentioned that I was going into the lab to do a usability study, I got an email asking:

How do you prepare for a usability study?  What materials do you produce?

In this case, I got a request from the Zimbra team to help them understand the behavior of their users.  I met with the team a few times to understand their needs, and then prepared a research plan.  For this study, the research plan was a single page.  It contained the research questions and how I planned to answer them, as well as a discussion of the type of users that I would recruit to help us answer these questions.  Once the team signed off on the research plan, I got down to the real work.

First, I produced a draft of the task list.  The task list is the list of stuff that I’m going to ask my participants to do in the study.  Creating a task list is at least as much art as it is science.  You have to create tasks that feel natural and appropriate for the environment that you’re studying, and you have to avoid leading language.  In the case of Zimbra, this isn’t exactly easy.  I was interested in several scenarios in the calendar, but here are some words that I can’t use in a task list: meeting, appointment, event, recurring, repeating.  How do I ask a participant to create a meeting without using a word that they’ll see on the screen as they do the task?
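If you keep your draft tasks in a plain text file, a quick script can catch leading words before a human reviewer does.  This is only a rough sketch of the idea, not something from the study itself: the function name and the two draft tasks are made up for illustration, and the word list is just the five words mentioned above.

```python
import re

# Interface terms named in the post; extend with whatever labels appear on screen.
LEADING_WORDS = ["meeting", "appointment", "event", "recurring", "repeating"]


def find_leading_words(task, leading_words=LEADING_WORDS):
    """Return the leading words that appear in a task description, in list order."""
    found = []
    for word in leading_words:
        # \b matches whole words only; the optional "s" also catches simple plurals.
        pattern = r"\b" + re.escape(word) + r"s?\b"
        if re.search(pattern, task, flags=re.IGNORECASE):
            found.append(word)
    return found


# Hypothetical draft tasks, invented for this example.
draft_tasks = [
    "Set up a recurring meeting with your team every Monday.",
    "You and three coworkers need to get together every Monday at 10. "
    "Put that on your calendar.",
]

for task in draft_tasks:
    print(find_leading_words(task), "-", task)
```

A check like this only flags exact words.  It can't tell you whether the rewritten task still feels natural, which is the part that takes the iteration and the pilot.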

The task list went through several iterations.  I revised it a couple of times on my own.  Once I felt like I had a good draft, I sent it to the Zimbra team to ensure that I wasn’t missing something that we wanted to study.  I met with my fellow researchers to ask them for feedback on it.  I did a pilot study to check for time and flow (more on that in a minute).  All of this got incorporated into the final task list.  For this study, I ended up doing three major revisions of the task list from my first draft through to the final document that I used in the study.

The task list turned into the moderator script.  The moderator script is a superset of the task list, and it covers everything that I say during the study.  It also notes the different ways that a participant could complete a given task.  Having this information immediately at hand helps me to follow what the participant is doing during the study.  It also makes note-taking easier, since I can just jot down that the participant took Path A through the interface.

Then there’s a bunch of ephemera associated with simply running the test.  I use a checklist between participants to make sure that I get the testing environment set up correctly for each person.  I’ve got an end-of-day checklist too, which reminds me to do things like print off any needed materials for the next day’s participants, ping those participants to remind them what time they’re scheduled to come visit my lab, and send any schedule updates to the team.  For my own note-taking purposes, I create a Word document for each participant, which contains all of the notes that I take during that participant’s session.

I also have an observer survey.  I ask any member of the team who comes in to observe a participant to note their top three observations during the study.  This helps me if I miss something.  It also allows me to see the study through the eyes of someone who isn’t necessarily well-versed in usability studies.  Their comments often help me to craft the report of research results afterwards because I have some additional insight into how they see both their application and their users.

I always do a pilot study before I begin the actual study.  If I’m running the study on live code that isn’t going to change during the study, then I do this the day before the study is to begin.  If I’m using a prototype (paper, Flash, Flex, whatever), then I do it a few days before the study is set to begin.  This pilot allows me to make sure that the testing environment is working properly, the study flows properly, and everything fits into the allotted time.  I always uncover at least one problem during the pilot study.  It never fails.  If it’s live code, then the issue is usually just with my task list, which I can update quickly.  If it’s a prototype, then an issue in the prototype usually takes a little bit more time to fix, which is why I do the pilot earlier to allow for that.

Then I conduct the study, which is the easiest part of all this.  During the study, I don’t do much other than collect data.  I save everything else for after the study, which probably deserves a blog post of its own (if there’s any interest, that is).