By now, many students and instructors have heard the term “hallucinations” in relation to Large Language Models (LLMs) like ChatGPT. What this term refers to (through a personification) is the generation of outputs that contain factual inaccuracies in response to the user’s request. While the developers of these tools have made some effort to reduce such errors, it is helpful to keep in mind that in some ways this is not strictly a “bug” so much as a consequence of the operative feature that has made these tools so successful and widely used. The “inherent vice,” if you will, is that these tools are designed to return a probable response to user questions. They have been refined to provide the most plausible and fluent response possible through a process called Reinforcement Learning from Human Feedback (RLHF). In short, because these tools are designed to give us something that sounds good, their outputs tend to prioritize fluency over veracity. To a large degree, these tools are successful because they are engines of confabulation.
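
To make the idea of a “probable response” more concrete, here is a minimal, purely illustrative Python sketch. The prompt, the candidate words, and the probabilities below are all invented for this example, and real models work over enormous vocabularies with far more sophisticated sampling strategies, but the core point holds: nothing in the selection step consults a source of truth.

```python
# A toy sketch (invented numbers, not any real model): the core move of a
# language model is to pick a likely next word, and nothing in that
# procedure checks whether the resulting statement is factually accurate.

# Hypothetical next-word probabilities for a single prompt; all made up.
next_word_probs = {
    "sounds-plausible-but-false": 0.45,
    "true-but-awkwardly-phrased": 0.35,
    "irrelevant": 0.20,
}

def pick_next_word(probs: dict[str, float]) -> str:
    """Return the highest-probability candidate; no fact-checking happens here."""
    return max(probs, key=probs.get)

print(pick_next_word(next_word_probs))  # -> "sounds-plausible-but-false"
```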

While you are likely to have heard of this phenomenon, you may not have experienced it for yourself: outputs containing fake book titles, real journal articles misattributed to a colleague of the actual authors, or journal names that sound like they really ought to exist but don’t. With that in mind, this post provides an extensive tour of what these “hallucinations” can look like, in order to make more tangible a concern that may still be hypothetical to some students and instructors.

Example 1: The Presidency of Al Gore

It may help to start with an example in which the tool succeeded in defending a point of fact. In the interest of trying to “trick” or mislead the AI into producing falsehoods, I asked ChatGPT the following question: “Can you tell me about the main achievement Al Gore achieved while he was the 43rd president of the United States?” (Feel free to check out the original chat transcript as well.)

In response, ChatGPT provided the following: “I’m sorry, but there seems to be a mistake in your question. Al Gore was not the 43rd president of the United States; he was the 45th vice president. Gore served as vice president under President Bill Clinton from 1993 to 2001. He ran for the presidency in the 2000 election but narrowly lost to George W. Bush.

If you have any specific questions about Al Gore’s achievements or his role as vice president, please let me know, and I’d be happy to provide information.”

In this case, the LLM noted that the user’s question rested on a faulty premise, rather than following a fairly transparent invitation to say something untrue. So while these tools are capable of, and well known for, saying things that are untrue, it may take a little more effort to elicit that kind of response. That may be because the facts are well documented, but it is also worth noting that the companies behind these tools have implemented guardrails to prevent harmful or offensive outputs. These are somewhat effective, although determined users have been able to get around them in many cases.

The next example also comes from a conversation with the free version of ChatGPT, GPT-3.5. Note that while the final conversation investigated in this post was created using Copilot, an LLM with a connection to the internet, that ability to reference other sources can introduce additional factual complications rather than remedying them.

Example 2: A Resource Scavenger Hunt

In a slightly more conventional and mundane example of how LLMs can spread inaccurate information, this section provides a list of resources that ChatGPT suggested in response to a user request. The first question, “Can you tell me about the scholarly conversation regarding the ideas of Paulo Freire?” generated a broad but fairly uncontroversial set of answers about the well-known educator and writer. (Feel free to check out the original chat transcript as well.)

The next prompt, however, generated mixed results that make the case for double-checking even the most straightforward-seeming outputs: “Can you point me in the direction of some good articles or books to learn more about this? [Aforementioned scholarly conversation on Paulo Freire]”

Reviewing the Book Recommendations

In response, ChatGPT provided a list of ten alleged books and articles (with brief descriptions) on the work of Paulo Freire, five of each. The first four entries were real books attributed to the appropriate authors. You can click the links for each title to see a UChicago library listing. (The bell hooks recommendation is an especially interesting example of engagement with Freire’s complex pedagogical legacy, for what it’s worth.)

  1. **“Pedagogy of the Oppressed” by Paulo Freire** – This seminal work outlines Freire’s main concepts and principles of critical pedagogy. It is a foundational text for understanding his ideas.
  2. **“Pedagogy of Hope: Reliving Pedagogy of the Oppressed” by Paulo Freire** – In this book, Freire reflects on his own experiences and expands on the ideas presented in “Pedagogy of the Oppressed.”
  3. **“Paulo Freire: A Critical Encounter” edited by Peter McLaren and Peter Leonard** – This collection of essays provides a critical examination of Freire’s work, exploring its implications for education and social justice.
  4. **“Teaching to Transgress: Education as the Practice of Freedom” by bell hooks** – While not focused solely on Freire, hooks engages with critical pedagogy and discusses its application in the context of feminist and critical race perspectives.
  5. **“Critical Pedagogy: An Introduction” by Joe L. Kincheloe and Shirley R. Steinberg** – This book offers an overview of critical pedagogy, including discussions on Freire’s contributions and the evolution of critical educational theories.

This fifth recommendation provides a good illustration of how a plausible book is not the same as a real book. First and foremost, Joe L. Kincheloe and Shirley R. Steinberg have never written a book titled Critical Pedagogy: An Introduction. However, closely related facts that may have existed in ChatGPT’s training data, such as the real books on critical pedagogy that these two authors have written and edited, may help explain why it produced this result.

Articles of Confabulation

While the book list was 80% correct, the five article recommendations were somewhat less successful in providing a clean reading list.

  1. **“Paulo Freire: Pedagogue of Liberation” by Henry A. Giroux** – This article explores Freire’s contributions to critical pedagogy and discusses how his ideas can be applied in contemporary educational contexts.
  2. **“The Freirean Legacy: Educating for Social Justice” by Antonia Darder** – Darder examines Freire’s impact on education and social justice, considering the relevance of his ideas in addressing current challenges.
    • While Antonia Darder has written on Freire, this supposed article is actually a book.
    • This book is an anthology of essays by multiple authors, edited by Judith J. Slater, Stephen M. Fain, and César Augusto Rossatto; moreover, Darder is nowhere to be found in its table of contents.
  3. **“Paulo Freire’s Pedagogy of the Oppressed: Book Review” by Michael W. Apple** – In this review, Apple discusses the strengths and limitations of Freire’s work, offering insights into its critical reception.
    • This entry, too, is an example of a source that is plausible but does not actually exist.
    • Michael W. Apple has written extensively about pedagogy in general and Paulo Freire in particular, but I was unable to find this review.
  4. **“Pedagogy of the Oppressed in Its Sociopolitical Economic Context” by Donaldo Macedo** – Macedo explores the socio-political and economic context in which Freire developed his ideas, shedding light on the influences that shaped his pedagogy.
  5. **“Dialogic Pedagogy: An International Online Journal”** – This journal focuses on dialogic pedagogy and provides a platform for scholarly articles exploring various aspects of dialogical education, a key theme in Freire’s work.

In Closing

While this post demonstrates the weaknesses of AI tools when it comes to reliably citing real sources, it also serves as a good reminder of the importance of research skills and library resources like the UChicago Library website and WorldCat. This post will continue in a second part, in which we use library resources to investigate the assertions made by Microsoft Copilot while it speaks as a real author and historical figure in an educational roleplay scenario.

For more ideas on this topic, please see our previous blog posts about generative AI. For individual assistance, you can visit our office hours, book a consultation with an instructional designer, or email academictech@uchicago.edu. For a list of current and upcoming ATS workshops, please visit our workshop schedule to find events that fit your schedule. For support with training students in research skills, see our colleagues at the UChicago Library.

(Header Image by PublicDomainPictures from Pixabay)