There seem to be a fair few services these days offering automated remote usability testing, so presumably a fair few people are using them, which implies that they ARE useful under the right circumstances.
But... what are those circumstances? When do automated remote tests yield data that usefully informs the design process?
Note: I'm not asking about moderated remote tests, with live interaction with the user - those seem close enough to face-to-face tests to be uncontroversially useful.
I know this isn't directly responding to the question, but I don't know of any automated usability testing services and would never touch them; I can't imagine how that could possibly work! Unless it's simply a fancy name for some web-hosted cultural probe?
I'm guessing that you're talking about systems like Userfly that record the user's interactions on a web site?
Personally I'm not convinced that they provide much value.
If you've got vaguely decent analytics you already know where users are going on your web site and whether your goals are being met. Without being able to interact with the user, is there that much extra information you can glean from watching the mouse pointer move around?
Like eye-tracking, just looking at where the mouse pointer is can be deceiving. People see the mouse as the user's focus of attention, when you'll often find folk in real tests "fidgeting" with the mouse while their attention is on something else, or moving the mouse away from their current centre of attention because it was distracting. Spotting this kind of thing can be hard with this sort of recording.
It's also harder to spot instances of the user looking like they're trying to do X when they're actually trying to do Y.
The key thing is the human observer. You learn so much by looking at the face of the person doing the task and by asking questions afterwards (or during, if it's that sort of test).
Userfly type applications are really, really useful at getting “what’s happening” type of information. And that’s great for identifying certain kinds of problem.
They’re really bad at getting the intent behind that behaviour.
To give one example: I was once involved in some usability tests where we observed a particular behaviour on the page that let folk log in (or register, if they hadn't already got an account).
What were they trying to do?
1) Trying to log in with their username and password – and getting it wrong. Unable to figure out how to recover their username/password. Giving up and registering again.
2) Trying to register for a new account – and misunderstanding the page layout and using the login-box when they should have been using the registration box.
In this particular instance it was the latter. That would have been harder to figure out without being able to ask the participants questions because it was due to a moderately subtle layout issue.
Userfly isn't a remote usability testing platform. You can't run usability tests with it; you can only observe screen recordings of user behaviour.
Really good question, and one we hear a lot. There are two circumstances for automated testing in our experience (we've conducted 238 remote studies over 10 years with 2,000 moderated participants and 3,000 automated participants):
Convincing people. You work for a big-ass organization where it's simply easier to validate your moderated small-sample research with some large-sample data. You know what's wrong, but to convince all the stakeholders and executives you need some task-based data like "89% of visitors failed to complete the registration form." Most automated tools (Loop11, UserZoom, Keynote, etc.) present a task bar in a frame that directs users to complete tasks. That way you can show fancy, task-based graphs for 1,000 users. This is the big advantage automated remote tools have over traditional log-file analytics.
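For what it's worth, the "89% of visitors failed" style of number is trivial to compute once a tool exports per-task outcomes. A minimal Python sketch, assuming a hypothetical export with one record per participant-task attempt (the field names here are invented, not any particular tool's schema):

```python
from collections import defaultdict

# Hypothetical export from an automated tool: one dict per
# participant-task attempt. Field names are made up.
rows = [
    {"task": "register", "completed": "no"},
    {"task": "register", "completed": "no"},
    {"task": "register", "completed": "yes"},
    {"task": "login",    "completed": "yes"},
]

def completion_rates(rows):
    """Fraction of attempts that completed, per task."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for row in rows:
        totals[row["task"]] += 1
        if row["completed"] == "yes":
            passes[row["task"]] += 1
    return {task: passes[task] / totals[task] for task in totals}

for task, rate in completion_rates(rows).items():
    print(f"{task}: {rate:.0%} of attempts completed")
```

The resulting per-task percentages are exactly the kind of number that goes on a stakeholder slide.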
Opinions. Surveys suck, and only behaviour, not opinion, holds up at small sample sizes. So a tool like MindCanvas (RIP) that records click maps, or some of the OptimalWorkshop tools like Chalkmark, or even remote card sorting, can make a great report to deliver alongside designs because, again, it looks like "real" science. And honestly, if you're doing nomenclature and IA, it's nice to have 500 people complete a card sort instead of just 12.
We try to keep all this jazz on remoteusability.com, but we're a little behind on the tools. If you're interested in more info, we also run workshops at http://escapethelab.com where we cover both automated and moderated remote wizardry.
Automated usability testing is great in these situations (and probably many others):
Anyone want to add to this list? I'm sure there are plenty of other situations where automated usability testing is a better choice than lab-based testing.
I've used Chalkmark (from Optimal Workshop) to compare different versions of a proposed homepage redesign. I wanted to find out whether one arrangement of links worked better than another. Chalkmark worked well for this because I had very specific tasks (e.g. "Where would you click to find a ___?") and a very specific goal.
I needed data to convince the project stakeholders their idea wasn't going to work (Nate mentions this above). With only a couple of hours of design time I had two versions being tested. Creating a prototype and actually running a usability test would've taken much more time.
I set up the test and the email invite (sent to 4,500 participants), then got back to work on another project.
In this case automated testing was incredibly helpful and valid.
I recently led a complex user experience design project that required a new experience strategy and evidence-based design concepts. Our research had to cover a broad range of geographical locations, demographic backgrounds and behavioural profiles. However, we had to work with a small research budget and a tight timeline, so sending a UX researcher to different locations, managing the logistics and getting access to as many participants as we required was unrealistic. I decided to use remote automated research. However, I needed a tool that would go beyond task-based behavioural measures and usability metrics, as we needed to work with rich insights.
Following a review of a few options I decided to use Webnographer, as they claimed they could help us with both formative and summative testing. Many other tools are indeed more summative and only give you raw statistics (e.g. completion rate and time on task). By combining Webnographer and guerrilla testing we saved just over 50% on the UCD part of the project, yet we were still able to generate rich user models that captured domain-specific cognitive styles, decision-making processes, behaviours, attitudes and brand associations. These models were subsequently used throughout the design process to inform our strategy, concepts and detailed designs.
From my experience of one test with Webnographer, I can say that it lives up to its promise of giving both formative and summative results. However, I needed the help of the Webnographer team with both the design of the test and the analysis of the results. The team there are knowledgeable and were always available to answer questions, which is critical when doing a remote study for the first, or even second, time.
Would I use Webnographer again? Yes absolutely. I have just commissioned them to start a new test a few days ago.
Assuming "automated" means "unmoderated," one possible use could be longitudinal tests. Let's pretend you have a shiny new support application used by a team of salespeople every day. You might be able to use automated testing to assess the learning curve, e.g. how much more competent does the sales force become with experience?
OTOH, you might be able to see how certain metrics (error rate, completion time, etc.) perform over a series of improvements.
Obviously, you won't get good qualitative data with automated testing. Like Web analytics, however, it can help suggest areas that need attention.
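To illustrate the longitudinal idea above, here's a rough Python sketch that averages an invented completion-time log by week, so you can see whether the sales force is speeding up with experience (all data and names are made up for the example):

```python
from statistics import mean

# Invented log of repeated unmoderated runs of the same task:
# (week_number, completion_time_in_seconds) pairs.
sessions = [
    (1, 95), (1, 110), (1, 102),
    (2, 80), (2, 88),
    (3, 62), (3, 70), (3, 65),
]

def weekly_mean_times(sessions):
    """Mean completion time per week, in week order."""
    by_week = {}
    for week, seconds in sessions:
        by_week.setdefault(week, []).append(seconds)
    return {week: mean(times) for week, times in sorted(by_week.items())}

for week, avg in weekly_mean_times(sessions).items():
    print(f"week {week}: {avg:.1f}s on average")
```

A steadily falling mean suggests the team is climbing the learning curve; a flat or rising one flags an area worth a closer, moderated look.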
Automated usability testing has absolutely no value, and automated, remote usability testing has even less value (yes, I know that technically isn't possible). You can't get anything out of the test without being able to observe the user. The only kind of data that you can get from this kind of test is the same kind of data that you can get from analytics, only less accurate because you have a much smaller data set to work with and people aren't really using the application like they would in the real world.
New article on UXmatters: Unmoderated, Remote Usability Testing: Good or Evil?
Following on from Nate's post, you need to appreciate that remote testing is all about quantitative data.
We developed a usability dashboard for a client based on a remote testing tool that showed performance on key tasks compared to competitors. Printed on a single sheet of paper, that 'report' changed the minds of more senior managers in the organisation than the previous 12 months' worth of small sample tests. Here's an example of a usability dashboard you can produce using these tools.
I think the reason views on this are so polarised is that people think they need to choose between large sample remote testing and small sample 'think aloud' testing. In fact, you should be running both kinds of test. Think of remote testing as just another tool in your toolbox.
Offhand, the cases I can think of are:
When you don't have direct physical access to your users, but you're confident they can handle the test requirements and provide the data you need.
When you can test without setting up a controlled environment, observing people in their own natural surroundings. This gives you very valuable information about how external disturbances affect your results, something that controlled, in-person testing would hide.
When you need to test a large number of users simultaneously.
When you work for an organization where it's impossible to get all the stakeholders to sit in on a usability test to see what's going wrong; an automated remote usability test lets you bring in data that can then be passed on to the stakeholders to drive your design, or the decision behind the path you are taking.
When you want to do a time-based analysis of users' interaction with your site, and of where conversion rates are suffering. If you run these automated tests at regular intervals, you can use the data from the different intervals to create snapshots that highlight what has changed.
I have tried a few, and to make the responses more valuable, users are screened with several questions (mainly a background check) to see whether they fit the required persona. Based on the answers to that series of questions, the site deems the user qualified or not. This is a really time-consuming process, and users may get discouraged halfway through because the depth of questioning varies with the persona.