Uniqtech

OpenAI GPT-3 Past, Present and Future of AI and NLP

Why is GPT-3 capturing the attention of the world?

Everyone is talking about the mighty, futuristic language model from OpenAI, the company co-founded by Tesla CEO Elon Musk, Y Combinator partner Sam Altman and other Silicon Valley big shots, including Google researchers and the ex-CTO of Stripe. It is truly eye-opening, and we want to tell you why it is so exciting. Why is GPT-3 so hyped right now? Probably because GPT-3 has the coolest video demos ever: based on just a few English sentences it can generate a TODO app (write code by itself), generate Excel spreadsheets, automatically translate, and generate quizzes based on content. Mind-boggling.

Everyone is writing about GPT-3, but because we are technical, our article gives you the important technical details and background you need to understand OpenAI’s GPT-3.

Support us? Clap. Question? Message Us Here http://bit.ly/umessage.

Background stories

Need a primer on Natural Language Processing (NLP) basics? We've got you covered: read our NLP cheatsheet article.

GPT-3 can write (generate) lyrics, scripts, code, data, articles, answers, summaries, apps … and much more. But is it true intelligence? More about its shortfalls in a future article.

Summary — Present

Wikipedia:

  • Transformer language model and the successor to GPT-2
  • “OpenAI stated that full version of GPT-3 contains 175 billion parameters, two orders of magnitude larger than the 1.5 billion parameters in the full version of GPT-2 (although GPT-3 models with as few as 125 million parameters were also trained).” — wikipedia
  • “OpenAI stated that GPT-3 succeeds at certain “meta-learning” tasks. It can generalize the purpose of a single input-output pair. The paper gives an example of translation and cross-linguistic transfer learning between English and Romanian, and between English and German.”
  • “Training requires several thousand petaflop/s-days of compute, compared to tens of petaflop/s-days for the full GPT-2 model…”
  • Fully trained and ready to use if you have access to the API.
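To put the compute figures above in perspective, here is a minimal sketch that converts petaflop/s-days into total floating-point operations. One petaflop/s-day means sustaining 10^15 operations per second for one day. The inputs 10 and 3000 are only illustrative stand-ins for "tens" and "several thousand"; they are not exact figures from the quote.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds
PETA = 10**15                   # one petaflop/s = 1e15 operations per second


def petaflop_s_days_to_flop(pf_days: float) -> float:
    """Convert petaflop/s-days into a total count of floating-point operations."""
    return pf_days * PETA * SECONDS_PER_DAY


if __name__ == "__main__":
    # Illustrative placeholder values, not measured numbers.
    for label, pf_days in [("tens", 10), ("several thousand", 3000)]:
        total = petaflop_s_days_to_flop(pf_days)
        print(f"{label}: {pf_days} petaflop/s-days is about {total:.2e} operations")
```

Whatever the exact values, the gap between "tens" and "several thousand" is roughly two orders of magnitude, which matches the jump in parameter count.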

Best GPT-3 Demo Videos

The demos show GPT-3 turning just a few English sentences into a TODO app (it writes the code by itself), Excel spreadsheets, translations and quizzes based on content.

Two of the cool “translation” demos translate legal documents into plain English, and plain English into Linux command line code! Wow.

Everyone is talking about how great GPT-3 is. But don’t get too excited. As of July 2020, GPT-3 is in private beta, and you have to sign up to be waitlisted. We filled out an application. If we get to test drive it, we will write more here.

The paper first came out in May 2020.

GPT-3 stands for Generative Pre-trained Transformer, 3rd generation. Its predecessor is GPT-2. It is a generative NLP model.

Type of model

  • Unsupervised language model
  • Generative
  • Transformer
  • Few-shot learner, meaning the model is pre-trained and ready to go, and needs only a small number of examples to get started.
  • Does not have to be trained from scratch.
  • Fairly intelligent out of the box.
  • No cold start problem.

Type of tasks: neural machine translation, question answering (long and short answers), text generation, code generation, and perhaps the most exciting, content comprehension. Think of the AI answering those reading comprehension questions on the SAT and GRE that are super hard for humans, and some engineers :) That’s pretty smart, even by Turing test standards.

Zero-shot: the pre-trained model is used as is, given only a description of the task and no examples. One-shot: one example is provided. The animated portrait of a Renaissance painting is a great example of one-shot learning from another domain, as only one picture of the Mona Lisa is available. Few-shot: a few examples from the domain are provided so that the model generates relevant results.

Few-shot learners (models) need only a few examples to “warm up”. In GPT-3's case the examples are supplied in the prompt itself, and the model's weights are not updated.
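The difference between zero-shot, one-shot and few-shot use can be sketched as plain prompt construction. The helper name `build_prompt` and the `source => target` layout are our own illustrative assumptions, loosely following the English-to-French example in the paper; this is not an official API.

```python
from typing import Sequence, Tuple


def build_prompt(task: str, examples: Sequence[Tuple[str, str]], query: str) -> str:
    """Assemble a prompt: task description, optional worked examples, then the query.

    Zero examples gives a zero-shot prompt, one gives one-shot, several give few-shot.
    """
    lines = [task]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # The query is left open so the model completes the answer.
    lines.append(f"{query} =>")
    return "\n".join(lines)


if __name__ == "__main__":
    task = "Translate English to French:"
    demos = [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")]

    print(build_prompt(task, [], "cheese"))          # zero-shot
    print()
    print(build_prompt(task, demos[:1], "cheese"))   # one-shot
    print()
    print(build_prompt(task, demos, "cheese"))       # few-shot
```

In all three cases the model itself is unchanged; only the text it is asked to continue differs.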

Paper release

Release date: May 2020

https://arxiv.org/abs/2005.14165

Paid subscribers ✅ visit our private searchable ML blog/wiki to see important highlights from the paper.

Architecture and Usage

Few-shot learning illustrated (see blog for more details)

Size of the model

The parameter count is an estimate of how large the model is and, roughly, how much compute power it needs.

According to Wikipedia, GPT-3 is the successor to GPT-2, and its full version has 175 billion parameters, two orders of magnitude more than the 1.5 billion parameters of the full GPT-2.
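A quick back-of-the-envelope check of the "two orders of magnitude" claim, plus a rough estimate of how much memory the weights alone would take. The bytes-per-parameter values (2 for 16-bit floats, 4 for 32-bit floats) are our assumption for illustration; the source does not say how the weights are stored.

```python
import math

GPT3_PARAMS = 175e9  # 175 billion, per the Wikipedia quote
GPT2_PARAMS = 1.5e9  # 1.5 billion, per the Wikipedia quote


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Return log10 of the ratio between two quantities."""
    return math.log10(larger / smaller)


def weights_size_gb(n_params: float, bytes_per_param: int) -> float:
    """Estimate the storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    ratio = GPT3_PARAMS / GPT2_PARAMS
    print(f"GPT-3 has about {ratio:.0f}x the parameters of GPT-2")
    print(f"That is about {orders_of_magnitude(GPT3_PARAMS, GPT2_PARAMS):.2f} orders of magnitude")
    # Assumed storage formats, for illustration only.
    for bytes_per_param in (2, 4):
        size = weights_size_gb(GPT3_PARAMS, bytes_per_param)
        print(f"{bytes_per_param} bytes per parameter: about {size:.0f} GB of weights")
```

Even under the smaller assumption, the weights would not fit on a single consumer machine, which helps explain why the model is offered through an API.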

Support us

Support us by clapping for us. We write articles like this all the time. Most of them free.

Buy us coffee using buymeacoffee

Paid subscribers ✅, including coffee contributors, get access to our private ML blog, wiki and searchable articles for just $5 / month.

Ramifications — Future

Possibly the most exciting model of 2020. Many look at the demo videos in awe. Some say this is a glimpse of the future. We think this is a clear sign that machine learning, deep learning and artificial intelligence will become widely available and widely used by data engineers, data scientists, data analysts and even business users. AI will be democratized and used in low-code environments and setups; that is, one will not need to code machine learning to use it. It may cost some programmer jobs, but it continues to affirm that our mission is correct: the future is ML education for all.

A word on GPT-2 — Past

Pre-trained systems with a transformer architecture, like Google’s BERT and GPT-2, the predecessor of GPT-3, have been leaders in natural language processing (NLP). They are pre-trained on large natural language datasets. GPT-2 is considered one of the state-of-the-art models of the recent past.

How to test drive GPT-3

The model is available via an API. Because it is still in private beta, access requires a short application. Putting it in private beta is a great way to get developers’ help in exploring the API and its potential applications.

With an OPENAI_API_KEY you can use curl on the command line to call the API and generate output from the GPT-3 model, for example with the text completion model.
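As a rough Python equivalent of such a curl call, the sketch below only builds the HTTP request and prints it; it does not send anything. The endpoint URL and the `prompt` / `max_tokens` field names reflect our understanding of the beta-era completions API and should be treated as assumptions; check the current OpenAI API reference before relying on them. The function name `build_completion_request` is our own.

```python
import json
import os
import urllib.request

# Assumption: completions endpoint as we understand it from the 2020 beta; may have changed.
API_URL = "https://api.openai.com/v1/engines/davinci/completions"


def build_completion_request(
    api_key: str, prompt: str, max_tokens: int = 5, url: str = API_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the model to continue `prompt`."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # the key is passed as a bearer token
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Read the key from the environment; fall back to a placeholder for the dry run.
    api_key = os.environ.get("OPENAI_API_KEY", "YOUR_API_KEY")
    request = build_completion_request(api_key, "Once upon a time")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(request), then parse the JSON response.
```

Keeping the key in an environment variable rather than in the source file avoids accidentally publishing it.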

GPT-3 in the News

YouTubers are raving about GPT-3 as the future of AI. Forbes thinks it is a bit hyped. We agree with Forbes that it is both amazing and overhyped. It is clearly a hot, trending, state-of-the-art model with many potential applications (hinted at by the demo videos), but it is not magic, and there are limitations. The model is pre-trained, and training such a model requires a lot of time and compute. Still, we do think it will help democratize machine learning by putting models into the hands of business users and data analysts (instead of just machine learning engineers and data scientists), and that it will inspire low-code apps using machine learning and AI.
