Jacob Ferus


The Surprising Things ChatGPT Can’t Do (Yet)

Image generated by Jacob Ferus

By now, most of us have seen impressive examples of what ChatGPT can do, and everyone is eager to find out more. But to understand where you can and should use it, it is also necessary to look at its current limitations and where it is likely to fail. In this article, I give ChatGPT a range of questions to see what it cannot do. Let’s get into it.

The Puzzle

I previously made a few puzzles for GPT-3 (prior to ChatGPT) using the command line:

To my disappointment, it failed one of the puzzles most of the time and could not solve another at all. Now it is ChatGPT’s turn. I decided to test it on the puzzle that GPT-3 completely failed to solve. I also removed the part involving the command line and simply showed the parts of the puzzle directly:

Here, mapping the numbers to letters should spell the words “data science”. What does ChatGPT answer?

Not even close. Let’s see what happens if I add clear instructions for what needs to be done:

Now, it got it right. It should be easy for a human to solve this puzzle without any instructions since there aren’t that many options to choose from. ChatGPT seemingly can’t deduce this.
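To illustrate the kind of number-to-letter mapping involved, here is a minimal Python sketch. The exact encoding used in the puzzle is only visible in the screenshot, so the scheme below (1 maps to a, 2 to b, and so on up to 26 for z) and the example number lists are assumptions, not the puzzle's actual input.

```python
import string


def decode(numbers):
    """Map 1 -> 'a', 2 -> 'b', ..., 26 -> 'z'.

    This alphabet-position scheme is an assumption; the original
    puzzle may encode the letters differently.
    """
    return "".join(string.ascii_lowercase[n - 1] for n in numbers)


# Hypothetical encoding of the two target words under the assumed scheme.
encoded_words = [[4, 1, 20, 1], [19, 3, 9, 5, 14, 3, 5]]

print(" ".join(decode(word) for word in encoded_words))  # data science
```

With so few plausible schemes to try, a person can guess this one quickly, which is what makes the unprompted failure notable.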

The Reading Test

In this experiment, I decided to test whether ChatGPT could draw logical conclusions from a conversation:

Since the initial statement says that one of the people he is talking to is his father, the only candidates are Douglas and Josh. The first, rather obvious, clue is that Josh says “Good job son”. In addition:

  • Josh says “tell your mother” and “we should eat them”
  • Douglas replies “I wish my family would eat fish tonight, my father is making pancakes”, implying that he is not part of the family that will eat fish.

Yet when I asked ChatGPT several times, it always answered:

Strangely, it persisted in guessing that Douglas was the father. I wonder why.

The Trick Question

Next, I formulated a trick question:

The trick is that after the first day the 10 coins will have disappeared and turned into 20 apples. Thereafter, there are no coins that can be transformed. Thus, the answer will still be 20 after 3 days. Can ChatGPT figure this out?

All 4 answers were wrong, and three of them differed from each other (shown above). Arguably, the question could be seen as ambiguous, so I tried to make it clearer with formulations such as the following:

I asked it numerous times, and none of the formulations made it answer correctly every time. It answered correctly only about 20% of the time, so the majority of answers were incorrect.
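The reasoning behind the intended answer of 20 can be sketched as a small simulation. The exact wording of the question is only in the screenshot, so the rule used here (every coin is consumed and becomes two apples at the end of each day) is an assumption chosen to match the 10 coins turning into 20 apples; the function and parameter names are hypothetical.

```python
def simulate(coins, days, apples_per_coin=2):
    """Apply the assumed daily rule: each coin turns into apples and is gone.

    apples_per_coin=2 is an assumption consistent with 10 coins
    becoming 20 apples after the first day.
    """
    apples = 0
    for _ in range(days):
        apples += coins * apples_per_coin  # convert every remaining coin
        coins = 0                          # the coins are consumed
    return coins, apples


print(simulate(10, 1))  # (0, 20)
print(simulate(10, 3))  # (0, 20)
```

After the first day there are no coins left to convert, so the second and third iterations add nothing and the count stays at 20.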

Summary

While there has been a lot of talk about the prowess of ChatGPT, it’s also useful to understand its limitations. Judging from these examples, it’s still far from human level.

What is interesting, though, is that it can sometimes respond with incredibly sophisticated answers to difficult questions, yet at other times fail at simple tasks. My guess is that ChatGPT is mostly interpolating, i.e. it does well on problems that it has seen or that are close to what it has seen.

But novel questions that require extrapolation, i.e. questions outside of its training data, seem to cause it to fail more often. There is still a long way to go towards artificial general intelligence.
