Research

The human half of human-AI interaction is where the magic happens. My work investigates the design of AI assistants and the magical thinking users exhibit in their interactions with them. I showed how partial automation can make video games more accessible, that visual indicators of AI intent can improve trust calibration, and why users nonetheless develop superstitions about how to make the AI do what they want. Now I want to figure out how to make AI that does what users want, without users resorting to magic.

Much of my PhD research was conducted using video games, which provide a low-stakes environment ideal for probing users' beliefs about how the AI works during sustained interactions. I explored how AI that shares control with human players can make video games more accessible to people with disabilities. Human-AI shared control is an interaction paradigm in which human and AI jointly control a system. It appears not only in games, but also across surgery, semi-autonomous driving, mobility assistance, creativity tools, and telerobotics. My survey of 55 shared control systems from these six domains maps the forms this cooperation takes and provides a design vocabulary for comparing them.

Two Heads Are Better Than One: A Dimension Space for Unifying Human and Artificial Intelligence in Shared Control — Cimolino, Graham. CHI 2022. 34 citations.

Partial automation makes games accessible — and reveals confusion

Partial automation is one form of shared control: the AI controls some inputs while the player controls the rest. In digital games, this can make play accessible to people who cannot operate all of a game’s controls. A study of six participants with spinal cord injury — ranging from minor motor impairments to complete paralysis below the neck — found that partial automation enabled all of them to engage meaningfully with both games tested.

The Role of Partial Automation in Increasing the Accessibility of Digital Games — Cimolino, Askari, Graham. PACMHCI (CHI PLAY), 2021. 26 citations.

Participants told me that accessible games did more than make exercise tolerable: they transformed participants from passive recipients of therapy into active participants with goals, competition, and simultaneous physical and cognitive engagement. They believed that playing exercise games with AI assistance may have helped them to overcome the negativity they felt during rehabilitation.

“It allows you to think that you can still do something that, to be honest, you never thought that you'd be able to do again.”

Beyond Fun: Players’ Experiences of Accessible Rehabilitation Gaming for Spinal Cord Injury — Cimolino, Askari, Graham. ASSETS 2021. 6 citations.

This study produced the observation that motivated my mental models research. One participant tilted his button controller to steer his AI-controlled avatar. The AI made all movement decisions; the tilt did nothing. He came to believe he had found a way to influence it. The automation was working exactly as designed, but the participant developed a false theory of how it worked and acted on it. I dedicated the rest of my PhD to understanding and solving this problem.

Communicating AI intentions improves trust but not performance

The obvious response to confusion about what the AI is doing is to tell users. If misattribution of control causes confusion, making the AI’s actions and intentions visible should reduce it. This was tested with 150 participants recruited on Prolific using Ninja Showdown, a Rock-Paper-Scissors-inspired fighting game in which the player controls one attack and the AI controls the others. They played in one of three conditions: no information about AI actions (No cues), information about what the AI was currently doing (Action cues), or information about both what it was doing and what it intended to do next (Intention cues). Trust appropriateness was measured using the Matthews correlation coefficient across all rounds played.
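As a rough illustration of how a Matthews correlation coefficient can summarize trust appropriateness, the sketch below treats each round as a binary decision (the player either let the AI act or overrode it) and compares that decision with whether the AI's action would have won. This framing, the function name trust_appropriateness_mcc, and the example data are assumptions made for illustration; they are not the study's analysis code.

```python
import math


def trust_appropriateness_mcc(relied, ai_would_win):
    """Matthews correlation between reliance decisions and AI correctness.

    relied[i]       -- True if the player let the AI act in round i (assumed coding)
    ai_would_win[i] -- True if the AI's action would have won round i (assumed coding)
    Returns a value in [-1, 1]; 1 means reliance matched AI correctness in every round.
    """
    tp = fp = tn = fn = 0
    for r, c in zip(relied, ai_would_win):
        if r and c:
            tp += 1  # relied on an AI that would have won
        elif r and not c:
            fp += 1  # relied on an AI that would have lost
        elif not r and not c:
            tn += 1  # overrode an AI that would have lost
        else:
            fn += 1  # overrode an AI that would have won
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical rounds: the player relies exactly when the AI would win.
print(trust_appropriateness_mcc([True, True, False, False],
                                [True, True, False, False]))
```

Unlike raw agreement, this measure does not reward a player who always relies on the AI or always overrides it: with no variation in either the decisions or the outcomes, the sketch returns 0.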

Intention cues significantly improved how appropriately participants trusted the AI (p < .05). Participants allowed the AI to act on its own when it would perform the winning action, and they overrode it when it wouldn't. Paradoxically, their scores in the game did not improve. More accurate understanding of the AI did not translate into better cooperation. This was the first indication that the problem ran deeper than missing information and that more focused study of automation confusion was needed.

Impact of Awareness Cues on Trust in Human-AI Shared Control — Cimolino, Gutwin, Graham. TRAIT Workshop at CHI 2022. 1 citation.

A grounded theory of automation confusion

To understand what was going wrong, I recruited ten non-gamer adults to play two partially automated games: the Ninja Showdown game from the awareness cues study and a one-button adaptation of Spelunky. To produce metacognitive data that provide insights into participants' mental models of their AI assistants, I periodically prompted them to think aloud while playing. I also collected gameplay logs, eye tracking, and post-session interview data, and analyzed all of these data using grounded theory methodology. Both games were designed to be as transparent as possible: tutorials explained which actions the player controlled, and awareness cues displayed the AI's current and intended actions in real time. Nine of the ten participants were confused anyway.

The resulting theory describes four interacting categories: the sources of confusion, the mental model errors those sources produce, the behaviors that follow from those errors, and the attitudes that determine whether users are willing and actively trying to correct their understanding. My analyses tracked how participants' actions and observations shaped their mental models over the course of play. I organized these sequences of events into confusion episodes that illustrate how the categories of my theory of automation confusion work together.

P2 believed she controlled Emi’s Bomb attack by pressing the ‘b’ key — an incorrect mapping, one of the theory’s sources of confusion, in which she invented an association between the first letter of the action’s name and the key that should control it. This produced an over-attribution error: she believed she caused an action the AI controlled. She pressed the key to confirm her hypothesis — a confirmation behavior. When Emi coincidentally used the Bomb, P2 interpreted the automated action as proof that her press had worked: a case of misinterpreted feedback that reinforced the original error.

Figure: three panels illustrating P2’s confusion episode. Each panel has explanatory text above a box naming the concept and category it illustrates, and arrows link the boxes to show how each affects the next. Panel 1: Emi intends to use the Bomb and Takeshi is using the Dart. Text: “P2 believed she controlled the Bomb with the b key.” Box: “Over-Attribution is one of the Types of Mental Model Errors.” Panel 2: Emi intends to use the Bomb and Takeshi is using the Dart, with a bubble in the centre representing P2 pressing the b key. Text: “P2 pressed the b key to make Emi use the Bomb.” Box: “Confirmation is one of the Behaviours Resulting from Confusion.” Panel 3: Emi is using the Bomb and Takeshi is using the Dart. Text: “P2 misinterpreted Emi using the Bomb as feedback reinforcing her belief that she controlled the Bomb.” Box: “Misinterpreted Feedback is one of the Sources of Confusion.”

P7 felt uninvolved because her avatar acted on its own. She concluded the optimal strategy was to press the spacebar as often as possible, believing this guaranteed a win. She began mashing — pressing with no specific intention — and stopped paying attention to what happened. She seldom noticed when she lost. Her emotional attitude produced behavior that ensured that she missed feedback that might have corrected her beliefs.

“It’s too boring. Like, if I kept just pressing the spacebar I’ll win in all the games, right? I don’t even need to see what she’s predicting. You don’t even need to hear, just press.”

P3 wanted her Spelunky avatar to move in the direction she intended, but movement was controlled entirely by the AI. Her desire to control what she could not — wishful thinking — led her to an extra-rule error: a belief that pressing on different sides of the spacebar would steer the avatar left or right. She engaged in manner modification, physically pressing on the left or right side of the key to signal her intentions to a system that was insensitive to these differences. Sometimes the avatar went the way she wanted. She asked whether there was a connection.

“In my mind, I’m feeling if I [press on the left] it will go left and if I [press on the right] it will go right.”

These three cases (a false hypothesis confirmed by coincidence, confusion so complete that failure goes unnoticed, and a superstition drawn from desire) illustrate how the theory’s categories interact. Users’ desires and expectations produce behaviors; those behaviors generate feedback; and that feedback creates and reinforces mental model errors. Attitudes determine whether users attempt to improve their understanding and how they engage with feedback. Users who cannot make the AI do what they want do not always conclude that the AI is incapable. If their need is strong and they are persistent, they discover methods that may only appear to have the desired effects.

“Unless I have opportunities I don’t know about… If she’s not trustworthy then I’m like ‘uh, am I missing something in the game? Is there another key that I have access to that I don’t know about?’”

Automation Confusion: A Grounded Theory of Non-Gamers’ Confusion in Partially Automated Action Games — Cimolino, Chen, Gutwin, Graham. CHI 2023. 8 citations.

Similar mechanisms appear in generative AI

The categories that constitute automation confusion are not specific to games or shared control. A study adapted Stable Diffusion into a game in which players tried to generate images of Superman without using the word “Superman.” Players crafted prompts based on what they believed the AI should know, drawing on their own knowledge of the character’s costume, cultural references, and the actors who have portrayed him to hypothesize what the model would recognize. When prompts failed, some players blamed deficiencies in the AI rather than revising their model of how it worked. Other players engineered their prompts collaboratively: they discovered what the AI seemed to know and hypothesized that the length of their prompts and the types of information in them affected the AI's output.

“Henry Cavill wearing blue super suit with a big s on the chest and a red cape”

The wishful thinking, inherited expectations, and incorrect mappings that produced confusion in partially automated games may also operate in users’ reasoning about the knowledge and capabilities of generative models and how to invoke them. The domain changes; the mechanisms may not.

Playing with Dezgo: Adapting Human-AI Interaction to the Context of Play — Villareale, Cimolino, Gomme. FDG 2023. 10 citations.

Where this points

My PhD research established a grounded theory of automation confusion: how users form false beliefs about AI systems despite corrective information. This work has led me to believe that the problem isn't that users misunderstand their AI assistants, but that AI lacks the intelligence needed to understand human users.

I now want to engineer AI capable of empathy — the ability to understand users deeply, detect confusion latent in users' inputs, and act in ways that are faithful to users' unstated needs. This moves beyond processing user input to modeling user cognition: recognizing when input is incomplete, predicting what the user actually wants rather than what they ask for, and integrating these models into production AI systems.

I am not a designer. I apply scientific knowledge to the construction of AI systems. My background in human-AI interaction, AI engineering, and cognitive science is directly applicable to AI research roles in which I would build and deploy these systems, making empathetic AI a standard component of production systems.