How to use GPT-3 for fill-mask tasks?

Question:

I use the following code to get the most likely replacements for a masked word:

# Notebook cell: install transformers from source
!pip install git+https://github.com/huggingface/transformers.git
from transformers import pipeline

# Build a fill-mask pipeline that returns the 100 most likely tokens
unmasker = pipeline('fill-mask', model='bert-base-uncased', top_k=100)

results = unmasker("The sun is [MASK].")
for result in results:
    # token_str is the predicted word; score is its probability (0 to 1)
    print(result["token_str"], result["score"] * 100)

For example, the most likely replacements for "[MASK]" in the sequence "The sun is [MASK]." are "rising" (33.61%), "shining" (9.33%), and "up" (7.38%).

My question: is there a way to achieve the same with GPT-3? There are "complete" and "insert" presets in the OpenAI playground; however, they give me full sentences (instead of single words) and no probabilities. Can someone help?

Asked By: diggi2395

||

Answers:

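First of all, as far as I know GPT-3 does not give you a ranked list of scored tokens the way the fill-mask pipeline does. What you mostly work with is the generated text, although the Completions API has an optional logprobs parameter that returns log probabilities for up to 5 candidate tokens per generated position.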
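
For what it's worth, the legacy Completions endpoint accepts a logprobs argument (the API caps it at 5), and the response then carries log probabilities for the top candidate tokens at each generated position. The sketch below assumes the pre-1.0 openai Python package and an engine that still serves this endpoint. The helper names top_candidates and next_word_candidates, and the sample values, are made up for illustration. It scores the next token after a prefix, so it only approximates fill-mask when the masked word comes at the end of the text.

```python
import math


def top_candidates(top_logprobs):
    """Convert a {token: logprob} mapping into (token, percent) pairs, best first."""
    pairs = [(tok.strip(), math.exp(lp) * 100) for tok, lp in top_logprobs.items()]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


def next_word_candidates(prefix, engine="text-davinci-002"):
    """Ask the API for one token after `prefix` and return its top alternatives.

    Assumes the legacy (pre-1.0) openai package and that openai.api_key is set.
    """
    import openai  # imported here so top_candidates works without the package

    response = openai.Completion.create(
        engine=engine,
        prompt=prefix,
        max_tokens=1,
        temperature=0,
        logprobs=5,  # the API caps this at 5
    )
    # top_logprobs holds one {token: logprob} dict per generated token
    return top_candidates(response["choices"][0]["logprobs"]["top_logprobs"][0])


# Offline demonstration with made-up log probabilities (not real model output)
sample = {" shining": math.log(0.5), " bright": math.log(0.25), " hot": math.log(0.125)}
for token, percent in top_candidates(sample):
    print(f"{token}: {percent:.2f}%")
```

With a real key you would call next_word_candidates("The sun is"). The candidates are tokens, so some of them may be only a piece of a word.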

Second of all, in my experience GPT-3 is ALL about the correct prompt. You just have to give it instructions like you were talking to a human being.

In your specific case, I would use a prompt like this:

Prompt:

The sun is [MASK].

Replace [MASK] with the most probable 5 words to replace, and give me
their probabilities.

Result:

The sun is shining.

  1. shining – 0.47
  2. bright – 0.18
  3. sunny – 0.13
  4. hot – 0.10
  5. beautiful – 0.09

If you want to do that programmatically, here’s the code:

# Written for the legacy openai package (versions before 1.0)
import openai

openai.organization = "your org key, if you have one"
openai.api_key = "your API key"

# No leading indentation inside the string, so the prompt is sent exactly as written
my_prompt = '''The sun is [MASK].

Replace [MASK] with the most probable 5 words to replace, and give me their probabilities.'''

# Set the parameters as you like
response = openai.Completion.create(
  engine="text-davinci-002",
  prompt=my_prompt,
  temperature=0,  # 0 makes the output as deterministic as possible
  max_tokens=500,
  # top_p=1,
  # frequency_penalty=0.0,
  # presence_penalty=0.0,
  # stop=["\n"]
)

# The generated text is in the first choice
print(response['choices'][0]['text'])
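
The response is plain text, so to get word/score pairs you have to parse it yourself. Here is a minimal sketch, assuming the model answers in the numbered "word – number" format shown in the result above (it may not, since the output format is not guaranteed). parse_ranked_words is a hypothetical helper name. Keep in mind that these numbers are text the model wrote, not probabilities read from its output distribution, so treat them as rough at best.

```python
import re

# Matches lines such as "  1. shining – 0.47" (en dash, hyphen, or colon as separator)
LINE_PATTERN = re.compile(r"^\s*\d+\.\s*(.+?)\s*[–:-]\s*([0-9]*\.?[0-9]+)\s*%?\s*$")


def parse_ranked_words(text):
    """Extract (word, score) pairs from a numbered list in the model's text output."""
    pairs = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            pairs.append((match.group(1), float(match.group(2))))
    return pairs


# Sample text copied from the result shown earlier in this answer
sample_output = """The sun is shining.

  1. shining – 0.47
  2. bright – 0.18
  3. sunny – 0.13
  4. hot – 0.10
  5. beautiful – 0.09
"""

for word, score in parse_ranked_words(sample_output):
    print(word, score)
```

In practice you would pass response['choices'][0]['text'] instead of sample_output, and check that the returned list is not empty before using it.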

Answered By: SilentCloud