Summary

The website content discusses methods for educators to detect AI-generated text, specifically from OpenAI's ChatGPT, to ensure academic integrity in student assignments.

Abstract

The article addresses the challenge educators face in distinguishing between student-written and AI-generated essays, particularly since the release of OpenAI's ChatGPT. It outlines several tools available for detecting AI-generated text, including OpenAI's AI Text Classifier, GPT-2 Output Detector Demo, GPTZeroX by Edward Tian, and DetectGPT from Stanford University. Each tool offers a different approach to detection, such as analyzing text characteristics like perplexity and burstiness, or using a language model to identify its own outputs. The article emphasizes the importance of using multiple tools for more accurate detection, as none of the tools are 100% reliable on their own. It also notes that while these tools are useful, they are not foolproof, reflecting the broader truth in AI that "All models are wrong, but some are useful," as stated by George E. P. Box.

Opinions

The author acknowledges the high quality of ChatGPT's output, suggesting it could be used extensively by students for assignments.
The author believes that OpenAI is responsive to educators' concerns, evidenced by their publication of considerations for ChatGPT in educational settings.
The author suggests that a combination of tools should be used to detect AI-generated text due to the lack of absolute accuracy in any single tool.
The author expresses a personal sentiment, wishing that such AI tools were available during their own time as a student.
The author highlights the experimental nature of DetectGPT, noting its current limitation to detecting text from GPT-2 and the potential for future expansion to include larger models.

How to Detect OpenAI’s ChatGPT Output

How to detect if the student used OpenAI’s ChatGPT to complete an assignment

On November 30, 2022, OpenAI released the ChatGPT AI system (https://openai.com/blog/chatgpt/), a universal writer’s assistant that can generate a variety of output, including school assignments. The output (e.g., essays) provided by ChatGPT is so good that, if I were a student, I would be using ChatGPT to complete most of my school assignments with minor revisions.

This creates a dilemma for educators: it is very difficult to discern whether the student or ChatGPT wrote the essay, so they need some kind of tool to check. For example, if the teacher assigns homework on the importance of the Monroe Doctrine, a student can use ChatGPT to write an essay on the Monroe Doctrine:

The Monroe Doctrine was a foreign policy statement issued by President James Monroe in 1823. It declared that the United States would not interfere in the affairs of European colonizers, and that any attempts by European powers to colonize or interfere with independent states in the Americas would be seen as a threat to the United States.

…..

The good news is that OpenAI is aware of the concerns expressed by educators, and it has published Educator considerations for ChatGPT (https://platform.openai.com/docs/chatgpt-education). Additionally, several tools are currently available for detecting whether text was generated by AI:

OpenAI AI Text Classifier
OpenAI GPT-2 Output Detector Demo
GPTZeroX by Edward Tian (Princeton University)
DetectGPT by Stanford University

You may want to use all four tools to check whether the text was generated by AI, since none of these tools is even close to 100% accurate.
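One way to combine the four results is sketched below. This is only an illustration: it assumes you have read each tool's verdict off its web page and converted it by hand into a probability between 0 and 1, and the function name `combine_scores`, the 0.5 threshold, and the example numbers are all made up for this sketch rather than taken from any of the tools.

```python
from statistics import mean


def combine_scores(scores, threshold=0.5):
    """Combine per-tool probabilities that a text is AI-generated.

    scores: dict mapping a tool name to a probability in [0, 1]
            (entered by hand; none of this comes from a real API).
    Returns (average probability, number of tools at or above threshold).
    """
    if not scores:
        raise ValueError("need at least one detector score")
    avg = mean(scores.values())
    votes = sum(1 for p in scores.values() if p >= threshold)
    return avg, votes


# Hypothetical, made-up scores for one essay; not real tool output.
example = {
    "AI Text Classifier": 0.90,
    "GPT-2 Output Detector": 0.99,
    "GPTZeroX": 0.80,
    "DetectGPT": 0.70,
}
avg, votes = combine_scores(example)
print(f"average={avg:.2f}, tools flagging={votes}/{len(example)}")
```

Reporting both the average and the vote count makes disagreement between tools visible, which matters because a single confident tool can otherwise dominate the average.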

OpenAI AI Text Classifier

OpenAI released AI Text Classifier (https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/) on January 31, 2023. The AI Text Classifier is a fine-tuned GPT model that predicts how likely it is that a piece of text was generated by AI from a variety of sources, such as ChatGPT.

You can access the AI Text Classifier by navigating to https://platform.openai.com/ai-text-classifier and signing into the website using your OpenAI ChatGPT account. To demonstrate the tool, I have copied and pasted the above essay and a bit more, since it requires more than 1,000 characters, as shown below:

The tool has determined that this text was likely AI-generated.

OpenAI GPT-2 Output Detector Demo

The GPT-2 Output Detector Demo (https://huggingface.co/openai-detector) was developed by OpenAI and is hosted on Hugging Face (https://huggingface.co). See the details at https://huggingface.co/roberta-base-openai-detector.

You can access the GPT-2 Output Detector Demo by navigating to https://huggingface.co/openai-detector. To demonstrate the tool, I have copied and pasted the above essay, as shown below:

The tool has determined that there is a 99.61% probability this text was generated using OpenAI GPT.
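If you would rather run the detector from a script than from the web page, the sketch below shows one possible approach using the Hugging Face `transformers` library and the `roberta-base-openai-detector` model linked above. I am assuming the model reports the labels "Fake" (machine-generated) and "Real" (human-written); check the model card before relying on that. The helper names `load_detector`, `fake_probability`, and `demo` are my own.

```python
def load_detector(model_name="roberta-base-openai-detector"):
    """Build a text-classification pipeline for the detector model.

    Needs the `transformers` package and a model download, so the
    import is kept inside the function.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_name)


def fake_probability(result):
    """Turn one pipeline result into a probability of machine-written text.

    Assumption: labels are "Fake" (machine) and "Real" (human).
    """
    label, score = result["label"], result["score"]
    if label.lower() == "fake":
        return score
    if label.lower() == "real":
        return 1.0 - score
    raise ValueError(f"unexpected label: {label!r}")


def demo():
    """Score the opening sentence of the sample essay (downloads the model)."""
    detector = load_detector()
    essay = (
        "The Monroe Doctrine was a foreign policy statement issued by "
        "President James Monroe in 1823."
    )
    result = detector(essay, truncation=True)[0]
    print(f"P(machine-generated) = {fake_probability(result):.4f}")
```

Call `demo()` to try it. The score from a local run may not match the web demo exactly, since the input text and truncation can differ.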

GPTZeroX

Edward Tian (Princeton University) first released this tool as GPTZero on January 2, 2023, and updated it to GPTZeroX (http://gptzero.me/) on January 29, 2023. The tool looks for both “perplexity” and “burstiness.” Perplexity measures how predictable each word is to a language model: text a bot would likely suggest scores low, while a human would be more random. Burstiness measures the spikes in the perplexity of each sentence. A bot will likely have a similar degree of perplexity from sentence to sentence, but a human is likely to write with spikes, such as one long, complex sentence followed by a shorter one.

You can access GPTZeroX by navigating to http://gptzero.me/. To demonstrate the tool, I have copied and pasted the above essay and a bit more as shown below:

The tool has determined that this text was likely AI-generated. In addition, GPTZeroX provides both Perplexity Score and Burstiness Score.
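To make the two scores more concrete, here is a minimal sketch of how perplexity and burstiness could be computed. It is not GPTZeroX's actual formula, which I have not seen. I am assuming perplexity is the exponential of the average negative log-probability per token, and I am treating burstiness as the standard deviation of per-sentence perplexity. The token log-probabilities are made-up numbers standing in for what a language model would report.

```python
import math
from statistics import pstdev


def perplexity(token_logprobs):
    """Perplexity of one sentence from natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def burstiness(sentences):
    """Spread of per-sentence perplexity (population standard deviation).

    This definition is an assumption made for illustration.
    """
    return pstdev([perplexity(s) for s in sentences])


# Hypothetical log-probabilities, one inner list per sentence.
uniform = [[math.log(0.5)] * 4, [math.log(0.5)] * 6]   # evenly predictable
varied = [[math.log(0.5)] * 4, [math.log(0.05)] * 4]   # one surprising sentence

print("uniform burstiness:", round(burstiness(uniform), 3))
print("varied burstiness:", round(burstiness(varied), 3))
```

In this toy data the evenly predictable text has almost no spread, while the text with one surprising sentence has a large one, which is the pattern the tool associates with human writing.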

DetectGPT

Stanford University released DetectGPT (https://detectgpt.ericmitchell.ai/) on January 31, 2023. According to the DetectGPT website, DetectGPT is a general-purpose method for using a language model to detect its own generations; however, this proof-of-concept only detects if a particular piece of text came from GPT-2. Detections on samples from other models may be particularly unreliable. The website adds: “We may add larger models like GPT-J (6B), GPT-NeoX (20B), or GPT-3 (175B) in the future; we perform evaluations with these and other models in our paper.”

You can access DetectGPT by navigating to https://detectgpt.ericmitchell.ai/. To demonstrate the tool, I have copied and pasted the above essay and a bit more as shown below:

Please note that I had to keep the text under 200 words so as not to overheat their GPUs, per the website.

The tool has determined that this text is likely to be from GPT-2. In addition, DetectGPT provides perturbed texts.
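The perturbed texts hint at how the method scores a passage. As I understand the idea behind the paper's "probability curvature" title, the model's log-probability of the original text is compared with its log-probability of slightly reworded copies: machine-generated text tends to lose more probability when perturbed. The sketch below only shows that comparison step under my own assumptions. The function name and the log-probability values are made up, and producing the perturbed texts and scoring them with GPT-2 is left out.

```python
from statistics import mean, stdev


def perturbation_discrepancy(original_logprob, perturbed_logprobs, normalize=True):
    """Gap between the original text's log-probability and the perturbed average.

    A larger positive value suggests the text sits near a local peak of the
    model's probability, which is treated as evidence of machine generation.
    """
    if len(perturbed_logprobs) < 2:
        raise ValueError("need at least two perturbed samples")
    gap = original_logprob - mean(perturbed_logprobs)
    if not normalize:
        return gap
    spread = stdev(perturbed_logprobs)
    if spread == 0:
        raise ValueError("perturbed log-probabilities have zero spread")
    return gap / spread


# Hypothetical log-probabilities; a real run would get these from the model.
score = perturbation_discrepancy(-50.0, [-58.0, -60.0, -62.0])
print(f"normalized discrepancy = {score:.2f}")
```

Dividing by the spread of the perturbed scores is one way to make results comparable across passages of different lengths; any decision threshold would still have to be chosen empirically.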

Other Tools

There are two more tools for detecting whether text was generated by AI, which I have not tried:

Writer AI Content Detector (https://writer.com/ai-content-detector/)
Content at Scale AI Content Detection (https://contentatscale.ai/ai-content-detector/)

…..

Go try out all four tools, and try to keep the students from cheating! I do wish I had tools like ChatGPT when I was a student, though.

Please note that these tools, like everything in AI, have a high probability of detecting AI-generated text, but not 100%. As the saying attributed to George E. P. Box goes, “All models are wrong, but some are useful.”

I hope you have enjoyed this tutorial. If you have any questions or comments, please provide them here.

Resources

The following is a list of resources used or referenced in this tutorial:

Educator considerations for ChatGPT (https://platform.openai.com/docs/chatgpt-education)
AI Text Classifier (https://platform.openai.com/ai-text-classifier)
Hugging Face (https://huggingface.co/)
GPT-2 Output Detector Demo (https://huggingface.co/openai-detector)
Details on GPT-2 Output Detector (https://huggingface.co/roberta-base-openai-detector)
OpenAI is developing a watermark to identify work from its GPT text AI (https://www.newscientist.com/article/2350655-openai-is-developing-a-watermark-to-identify-work-from-its-gpt-text-ai/)
GPTZeroX (http://gptzero.me/)
OpenAI ChatGPT (https://chat.openai.com/chat)
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature (https://ericmitchell.ai/detectgpt/)
Detecting GPT-2 Generations with DetectGPT (https://detectgpt.ericmitchell.ai/)