It seems like AI is almost everywhere. For many people, it is.
From the moment we wake up, AI increasingly shapes our daily experiences. Music playlists are generated automatically. Our computers prompt us to use AI assistants. Internet searches are now often preceded by AI-generated summaries.
Call a doctor’s office after hours, and an AI voice assistant may help schedule your appointment. Chat with customer support, and you’ll likely interact with a chatbot before reaching a human. Write an email, and AI offers suggestions. Start a meeting, and AI software generates notes and summaries. Need an image to make a point? Use AI to generate one from a textual description (e.g., Figure 1).
Figure 1: The ubiquity of AI.
And of course, AI influences what we do in UX research.
But is AI helping? Is it making us more efficient, more accurate? Or is it actually just making us work more intensely?
Of course, some voices overhype its efficacy in UX Research and Design, while others dismiss it as a fad. Increasingly, the latter is becoming a less tenable position.
We’re more pragmatic at MeasuringU and have an aversion to extreme attitudes. The Aristotelian golden mean between extremes is part of our company DNA.
We lean into empiricism and judge the efficacy of claims using data. We also critically evaluate the quality of the evidence. An anecdote about improved productivity from a software company is not the same as a controlled study.
As is often the case with fast-changing technology, there’s a dearth of high-quality studies that allow us to separate the hype from the hypothesis testing. We’re actively conducting studies and literature reviews to quantify the extent to which different applications of AI to UX research are useful.
A good way to assess the evidence and group our research is to think about AI’s impact in UX research in three categories: AI as Research Assistant, AI as (Synthetic) User, and AI as Researcher.
AI as Research Assistant
Let’s start with something less controversial and increasingly commonplace: researchers using AI tools to assist (and usually expedite) research.
Many UX research teams use AI for the following tasks, and these assistants appear to be well received by researchers as ways to increase research speed or improve research quality. Questions remain, however, about measurable quality criteria, failure modes, and the role of the human in the loop.
- Coding comments into categories
- Cleaning data
- Translation and localization
- Analyzing interviews to find themes
- Developing insights from transcripts
- Building and modifying participant screeners
- Writing and editing survey questions
- Detecting bias and other quality issues in questions
- Identifying categories from card sort results
- Developing and editing task scenarios
- Developing and editing test plans
There’s more to do, but we’ve already made some progress investigating the role of AI as a research assistant in comment classification.
AI and Human Classification of Comments
One of the first analyses we conducted on using AI to code comments was promising. We used three runs of ChatGPT-4 to classify comments from UX research and compared its results (in 2023!) to those of three human coders. We found interrater reliabilities between human coders and ChatGPT that were only slightly lower than those among the human coders alone, with three caveats:
- Human coders were more likely to assign single comments to their own themes.
- Different prompts had different levels of effectiveness (prompt specificity matters).
- AI outputs from the same prompt were broadly similar, but there was enough variation to make it necessary to run AI analyses multiple times.
We plan to investigate how well newer AI products perform this task.
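To make comparisons like this concrete, agreement between an AI run and a human coder can be quantified with Cohen’s kappa, a common interrater reliability statistic. Below is a minimal, self-contained Python sketch; the comment labels are hypothetical and are not data from our study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters assigning one category per item."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: proportion of items both raters coded identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical theme codes for eight comments: one human coder vs. one AI run.
human = ["nav", "nav", "speed", "speed", "price", "price", "nav", "speed"]
ai    = ["nav", "nav", "speed", "price", "price", "price", "nav", "speed"]
print(round(cohens_kappa(human, ai), 3))
```

By the commonly used Landis and Koch benchmarks, values from 0.41–0.60 indicate moderate agreement and 0.61–0.80 substantial agreement; kappa corrects the raw percentage of matching codes for agreement expected by chance.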
AI as Synthetic User: Synthetic Attitudes vs. Synthetic Actions
Now we move into a category that gets a lot of people fired up, and for good reason. Any time you take the user out of UX, it becomes objectionable as a matter of principle. But again, we try to be open-minded. After all, inspection methods like heuristic evaluation, PURE, and guideline reviews are part of the UX research toolbox even though users aren’t directly involved.
We see an important distinction between synthetic user attitudes and synthetic user behaviors, both of which have yet to be fully explored.
- Synthetic survey respondents (attitudes and reported behaviors): AI-generated responses to rating scales that measure things like satisfaction, intention, and usability, and AI-generated reports of behaviors like product ownership and usage
- Synthetic users of task-based studies (behaviors): AI-generated responses to task-based scenarios used in usability testing
- Synthetic users of information architecture tasks (tree tests, card sorts)
We have not conducted studies that use data from individually crafted synthetic users, but we have experimented with comparing AI predictions of user behaviors and attitudes for card sorting and tree testing, with mixed success.
AI and Human Analysis of Card Sorting Results
AI’s ability to sort items into groups, as in a card sort, was reasonably good. When we used ChatGPT-4 to name groups of items that human researchers had synthesized from a standard open card sort, we found strong similarity in the numbers and names of categories. Items matched most of the time, the interrater reliability between the two methods was moderate to substantial, and there were no obviously bad ChatGPT placements.
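One way to quantify how similar two complete groupings are (e.g., human vs. AI category assignments of the same cards) is a partition-agreement statistic such as the adjusted Rand index. This is an illustrative stdlib-only sketch with hypothetical cards and categories, not the exact analysis we ran:

```python
from collections import Counter
from math import comb

def adjusted_rand_index(groups_a, groups_b):
    """Adjusted Rand index between two groupings of the same items.
    Each argument maps item -> group label. 1.0 means identical partitions;
    values near 0.0 mean agreement no better than chance."""
    items = list(groups_a)
    n = len(items)
    # Contingency counts: how many items share a (group in A, group in B) cell.
    cell = Counter((groups_a[i], groups_b[i]) for i in items)
    a = Counter(groups_a.values())  # group sizes in partition A
    b = Counter(groups_b.values())  # group sizes in partition B
    sum_cells = sum(comb(v, 2) for v in cell.values())
    sum_a = sum(comb(v, 2) for v in a.values())
    sum_b = sum(comb(v, 2) for v in b.values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    return (sum_cells - expected) / (max_index - expected)

# Hypothetical open card sort: human categories vs. AI categories for six cards.
human = {"invoice": "billing", "receipt": "billing", "refund": "billing",
         "login": "account", "password": "account", "profile": "account"}
ai    = {"invoice": "payments", "receipt": "payments", "refund": "account",
         "login": "account", "password": "account", "profile": "account"}
print(round(adjusted_rand_index(human, ai), 3))
```

Unlike a simple match rate, the adjusted Rand index doesn’t require the two groupings to use the same category names, which makes it convenient when the AI invents its own labels.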
AI and Human Tree Testing Results
Our tree testing results were mixed. We collected data from multiple iterations of ChatGPT-4 and from 33 participants finding the location of target items in a tree structure based on the IRS website, using the SEQ to assess perceived task difficulty. ChatGPT performed too well to be suitable for estimating how well humans will find items in a tree test. However, it predicted people’s ease ratings of the search tasks with reasonable accuracy.
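Checking whether AI-predicted ease ratings track human ratings can be as simple as correlating per-task mean SEQ scores. The sketch below uses hypothetical ratings, not the values from our IRS-tree study:

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between paired per-task scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical mean SEQ ratings (1 = very difficult, 7 = very easy) for five tasks.
human_seq = [6.2, 4.1, 5.5, 3.0, 6.8]  # means across human participants
ai_seq    = [6.5, 4.4, 5.1, 3.6, 6.9]  # means across AI runs
print(round(pearson_r(human_seq, ai_seq), 3))
```

A high correlation would suggest the AI rank-orders task difficulty much like people do, even if (as we found for findability) its absolute task performance is unrealistically good.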
AI as Researcher
These are more advanced tasks in which AI might take a more central role in analysis, but it isn’t clear how AI output compares to human output in accuracy or in the amount of time saved (if any). Two ways in which AI might replace researchers are as analysts and as moderators.
AI as Analyst
A lot of human data analysis is repetitive, making it attractive for replacement with AI (Figure 2).
Figure 2: Robot applying for a job.
Other human data analysis is less repetitive and more dependent on contextual knowledge and human judgment (e.g., identification of usability problems). Some of the opportunities we envision for AI as an analyst (but which need development and validation) are:
- Validating screenshots to determine task success
- Identifying usability problems from image analysis
- Identifying usability problems from videos
- Heuristic evaluation from analysis of videos, images, and websites
- Advanced inspection analyses (PURE, KLM/GOMS)
- Analyzing datasets
AI as Moderator
Research moderation seems like a quintessentially human activity. However, advances in AI avatars, LLM dialog management, and synthetic speech production have led to the development of AI agents that could be applied to a variety of moderation tasks. Research in this area should focus on understanding when it works, when it fails, and how to validate quality.
- Simple interviews
- Complex interviews
- Moderated usability tests
AI Adoption and Attitudes
We have conducted studies of attitudes toward AI usage by the general public and by UX researchers, and we plan follow-up studies of both.
We’ve already published research on attitudes of UX researchers regarding the use of AI in UX (in association with UXPA) and attitudes of a general population of users toward three AI chat products.
Before examining how AI may function as an assistant, analyst, or synthetic user in UX research, it’s useful to understand how widely AI tools are already being used and how people perceive them. Some recent studies provide insight into both adoption and user experience with AI-based systems.
How Much Is AI Used in UX?
More than you might think. While our industry data from 2024 is due for a refresh, we found that about half of UX professionals had used AI (but 20% were not impressed). More companies supported using AI than discouraged it (by about 6 to 1). Most respondents expected to use AI more in 2025, but expectations over the next five years were mixed.
Retrospective Benchmark of ChatGPT, Claude, and Gemini
In January and February 2025, we conducted a retrospective study of three AI-based chat products (ChatGPT, Claude, and Gemini) with 153 U.S.-based panel participants. This study included metrics from our standard UX & NPS survey as part of our larger consumer software data collection effort. All three products had high and similar Net Promoter Scores. Reported issues included accuracy, generic content, and limited free versions.
Summary
It can be easy to be seduced into extreme views about emerging technologies. They can be cast as the best thing ever or a complete waste of time. Our recommendation is a more pragmatic, empirical approach. Rather than relying on anecdotes or hype, we encourage evaluating the role of AI in UX research with data.
One useful way to think about AI in UX research is to group its applications into three roles:
- AI as Research Assistant. Tools that improve the quality and quantity of the work that UX researchers already do, such as coding comments, summarizing interviews, and generating study materials.
- AI as Synthetic User. Systems that simulate user attitudes or behaviors. An important distinction is between synthetic attitudes and synthetic actions. Our early work suggests some promise in modeling behavior, but much less evidence for synthetic attitudes.
- AI as Researcher. Applications where AI plays a more central role, such as identifying usability issues from images or videos, evaluating task success, or even moderating research sessions.
There is still much to learn. In the coming year, we plan to continue studying these areas and revisit both the usage of AI tools and attitudes toward them. Our goal is not to promote or dismiss AI, but to understand, through evidence, where it genuinely improves UX research.

