GPT-3: Unsupervised Autoregressive Language Model

June 6, 2020, 10:15 a.m. By: Merlyn Shelley


OpenAI recently released a GitHub repo with test cases for GPT-3, a robust NLP model that is claimed to improve task-agnostic, few-shot performance, sometimes becoming competitive with state-of-the-art fine-tuning techniques.

GPT-3 (Generative Pre-trained Transformer 3) is an unsupervised autoregressive language model that scales up the approach of contemporary natural language processing models. After the success of BERT, OpenAI has ventured into pre-training a successor model, called GPT-3, with 175 billion parameters and a memory footprint of about 350 GB. It has 10 times more parameters than any previous non-sparse language model. GPT-3 is evaluated in few-shot settings, without task-specific fine-tuning.
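The 350 GB figure is consistent with storing each of the 175 billion parameters as a 16-bit value. The sketch below checks that arithmetic; the 2-byte width is an assumption made for illustration, not something stated in the source, and the GPT-2 parameter count is the 1.5 billion cited later in this article.

```python
# Back-of-the-envelope check of the model sizes quoted in this article.
GPT3_PARAMS = 175e9   # 175 billion parameters (from the article)
GPT2_PARAMS = 1.5e9   # 1.5 billion parameters (from the article)
BYTES_PER_PARAM = 2   # ASSUMPTION: 16-bit (half-precision) weights


def weights_size_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def size_ratio(larger, smaller):
    """How many times more parameters the larger model has."""
    return larger / smaller


print(f"GPT-3 weights: {weights_size_gb(GPT3_PARAMS):.0f} GB")
print(f"GPT-3 / GPT-2 parameter ratio: {size_ratio(GPT3_PARAMS, GPT2_PARAMS):.1f}x")
```

With 32-bit weights instead, the same calculation would give twice the size, so the quoted figure only matches under the half-precision assumption.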

The mighty GPT-3 model is capable of handling a gamut of NLP tasks, such as question answering, translation, 3-digit arithmetic, and cloze tasks, as well as tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words or using a new word in a sentence. With great effort, the model has been trained to produce outputs that resemble human reasoning.
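In a few-shot setting, the model is shown a handful of solved examples as part of its input text and is asked to continue the pattern, with no updates to its weights. The sketch below assembles such a prompt for the word-unscrambling task. The function name `build_few_shot_prompt` and the "Input:/Output:" layout are hypothetical choices for illustration, not a format taken from the source, and no model is called.

```python
def build_few_shot_prompt(task_description, examples, query):
    """Join a task description, K solved examples, and one unsolved query.

    The "Input:/Output:" layout is an illustrative assumption.
    """
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final item is left unanswered for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Two solved demonstrations (K = 2) followed by the item to solve.
demos = [("pplea", "apple"), ("nanaba", "banana")]
prompt = build_few_shot_prompt(
    "Unscramble the letters into an English word.", demos, "rgaep"
)
print(prompt)
```

Passing an empty list of examples yields a zero-shot prompt, and a single example yields a one-shot prompt, so the same helper covers all three settings.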

Now let's discuss its industry windfalls!

  • GPT-3 is more than 100x larger than its predecessor, GPT-2.

  • While GPT-2 has 1.5 billion parameters, GPT-3 has 175 billion, which results in more accurate inferences.

  • GPT-3 was tested on news article generation: human judges recruited through Amazon Mechanical Turk were asked to tell real articles apart from GPT-3-generated ones. They answered correctly only about 52% of the time, barely better than chance. The GPT-3-generated articles closely resembled human-written material.

Reference: GitHub