I am deeply worried about my vacuuming skills. I’ve always enjoyed vacuuming, especially with the vacuum cleaner I use. It has a clear dustbin, and there’s something cathartic about running it over the carpet in the upstairs hallway and seeing all the dust and debris it collects. I’m worried, however, because I keep outsourcing my downstairs vacuuming to the robot vacuum cleaner my wife and I bought a while back. With three kids and three dogs in the house, our family room sees a lot of foot traffic, and I save a lot of time by letting the robot clean up. What am I losing by relying on my robot vacuum to keep my house clean?
Not much, of course, and I’m not actually worried about losing my vacuuming skills. Vacuuming the family room isn’t a task that means much to me, and I’m happy to let the robot handle it. Doing so frees up my time for other tasks, preferably bird-watching out the kitchen window, but more often doing the dishes, a chore for which I don’t have a robot to help me. It’s entirely reasonable for me to offload a task I don’t care much about to the machines when the machines are right there waiting to do the work for me.
That was my response to a new high-profile study from an MIT Media Lab team led by Nataliya Kosmyna. Their preprint, “Your Brain on ChatGPT: Accumulation of Cognitive Debt When Using an AI Assistant for Essay Writing Task,” details their experiment. The team enlisted 54 adult participants to write short essays in response to SAT prompts over multiple sessions. A third of the participants were given access to ChatGPT to help with their essay writing, a third had access to any website they could reach through a Google search but were prohibited from using ChatGPT or other large language models, and a third had no outside aids (the “brain-only” group). The researchers not only scored the quality of the participants’ essays but also used electroencephalography to record participants’ brain activity during these writing tasks.
The MIT team found that “brain connectivity systematically scaled down with the amount of external support.” While the brain-only group “exhibited the strongest, widest‑ranging [neural] networks,” AI assistance in the experiment “elicited the weakest overall coupling.” Moreover, the ChatGPT users became less and less engaged in the writing process over the multiple sessions, often just copying and pasting from the AI chat bot by the end of the experiment. They also had a harder time than the brain-only group quoting anything from the essay they had just submitted.
This study has inspired some dramatic headlines: “ChatGPT May Be Eroding Critical Thinking Skills” and “Study: Using AI Could Cost You Brainpower” and “Your Reliance on ChatGPT Might Be Really Bad for Your Brain.” Savvy news readers will key in on the qualifiers in those headlines (“may,” “could,” “might”) instead of the scarier words, and the authors of the study have made an effort to prevent journalists and commentators from overplaying their results. From the study’s FAQ: “Is it safe to say that LLMs are, in essence, making us ‘dumber’? No!” As is usually the case in the AI-and-learning discourse, we need to slow our roll and look beyond the hyperbole to see what this new study does and doesn’t actually say.
I should state now for the record that I am not a neuroscientist. I can’t weigh in with any authority on the EEG analysis in this study, although others with expertise in this area have done so and have expressed concerns about the authors’ interpretation of EEG data. I do, however, know a thing or two about teaching and learning in higher education, having spent my career at university centers for teaching and learning helping faculty and other instructors across the disciplines explore and adopt evidence-based teaching practices. And it’s the teaching-and-learning context in the MIT study that caught my eye.
Consider the task that participants in this study, all students or staff at Boston-area universities, were given. They were presented with three SAT essay prompts and asked to select one. They were then given 20 minutes to write an essay in response to their chosen prompt, while wearing an EEG helmet of some kind. Each subject participated in a session like this three times over the course of a few months. Should we be surprised that the participants who had access to ChatGPT increasingly outsourced their writing to the AI chat bot? And that, in doing so, they were less and less engaged in the writing process?
I think the takeaway from this study is that if you give adults an entirely inauthentic task and access to ChatGPT, they’ll let the robot do the work and save their energy for something else. It’s a reasonable and perhaps cognitively efficient thing to do. Just like I let my robot vacuum cleaner tidy up my family room while I do the dishes or look for an eastern wood pewee in my backyard.
Sure, writing an SAT essay is a cognitively complex task, and it is perhaps an important skill for a certain cohort of high school students. But what this study shows is what generative AI has been showing higher ed since ChatGPT launched in 2022: When we ask students to do things that are neither interesting nor relevant to their personal or professional lives, they look for shortcuts.
John Warner, an Inside Higher Ed contributor and author of More Than Words: How to Think About Writing in the Age of AI (Basic Books), wrote about this notion in his very first post about ChatGPT in December 2022. He noted concerns that ChatGPT would lead to the end of high school English, and then asked, “What does it say about what we ask students to do in school that we assume they will do whatever they can to avoid it?”
What’s surprising to me about the new MIT study is that we are more than two years into the ChatGPT era and we’re still trying to assess the impact of generative AI on learning by studying how people respond to boring essay assignments. Why not explore how students use AI during more authentic learning tasks? Like law students drafting contracts and client memos or composition students designing multimodal projects or communications students attempting impossible persuasive tasks? We know that more authentic assignments motivate deeper engagement and learning, so why not turn students loose on those assignments and then see what impact AI use might have?
There’s another, more subtle issue with the discourse around generative AI in learning that we can see in this study. In the “Limitations and Future Work” section of the preprint, the authors write, “We did not divide our essay writing task into subtasks like idea generation, writing, and so on.” Writing an essay is a more complicated cognitive process than vacuuming my family room, but critiques of the use of AI in writing are often focused on outsourcing the entire writing process to a chat bot. That seems to be what the participants did in this study, and it is perhaps a natural use of AI when given an uninteresting task.
However, when a task is interesting and relevant, we’re not likely to hand it off entirely to ChatGPT. Savvy AI users might get a little help from AI with parts of the task, like generating examples or imagining different audiences or tightening their prose. AI can’t do all the things that a trained human editor can, but, as writing instructor (and human editor) Heidi Nobles has argued, AI can be a useful substitute when a human editor isn’t readily available. It’s a stretch to say that my robot vacuum cleaner and I collaborate to keep the house tidy, but it’s reasonable to think that someone invested in a complex activity like writing might use generative AI as what Ethan Mollick calls a “co-intelligence.”
If we’re going to better understand generative AI’s impact on learning, something higher education must do if it wants to keep its teaching mission relevant, we have to look at the best uses of AI and the best kinds of learning activities. That research is happening, thankfully, but we shouldn’t expect simple answers. After all, learning is more complicated than vacuuming.