Creating function that makes a dictionary from a list

Question:

The goal -> For each word in the text except the last one, a key should appear in the resulting dictionary, and the corresponding value should be a list of every word that occurs immediately after the key word in the text. Repeated words should have multiple values:
example:

fun(["ONE", "two", "one", "three"]) ==
            {"one": ["two", "three"], "two": ["one"]}

what I have so far:

def build_predictions(words: list) -> dict:
  dictionary = {}
  for word in words:
    if word.index() != words.len():
      if word not in dictionary:
        dictionary.update({word : words(words.index(word)+1)})
      else:
        dictionary[word] = dictionary[word] + [words(words.index(word)+1)]

I'm getting an EOF error ;[ -> not sure if this is right anyway.

Asked By: number2patrician


Answers:

The code you've shown can't give you an EOF error by itself, since it doesn't read any input or files. That error must come from code you haven't posted, so I can't help you with it. However, there are a bunch of things wrong with your approach to building the dictionary of predictions:

  1. word.index() doesn't do what you want: word is a string, and str.index searches for a substring inside that string (and needs an argument). If you want the index of word in words, iterate using for index, word in enumerate(words)
  2. words.len() is not a thing, since lists have no len() method. You can get the length of words with len(words)
  3. The current word's index is never equal to the length of the list, since list indices start at 0 and go to len(words) - 1. Your if condition should have been if index < len(words) - 1
    1. You don't need to do this check at all if you simply change your loop to for word in words[:-1], which will skip the last word.
  4. If word is not in dictionary, you want to create a new list that contains the next word. Note that list items are accessed with square brackets, words[index + 1], not with parentheses.
  5. If word is in dictionary, you want to append the next word to the existing list instead of creating a new list by concatenating the two.
  6. You need to return dictionary from your function.

Incorporating all these suggestions, your code would look like so:

def build_predictions(words: list) -> dict:
    dictionary = {}
    for index, word in enumerate(words[:-1]):
        next_word = words[index + 1]
        if word not in dictionary:
            dictionary[word] = [next_word]
        else:
            dictionary[word].append(next_word)
    return dictionary
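
As a quick check, here is a sketch that runs the function above on the example from the question. One assumption on my part: the question's input contains "ONE" in uppercase while the expected keys are lowercase, so I'm guessing the words are meant to be compared case-insensitively and lowercase them first. The names raw_words and normalized are only for illustration.

```python
def build_predictions(words: list) -> dict:
    """Map each word (except the last) to the list of words that follow it."""
    dictionary = {}
    for index, word in enumerate(words[:-1]):
        next_word = words[index + 1]
        if word not in dictionary:
            dictionary[word] = [next_word]
        else:
            dictionary[word].append(next_word)
    return dictionary


raw_words = ["ONE", "two", "one", "three"]

# Assumption: matching is case-insensitive, so lowercase before building.
normalized = [w.lower() for w in raw_words]

result = build_predictions(normalized)
print(result)  # {'one': ['two', 'three'], 'two': ['one']}

# Without lowercasing, "ONE" and "one" end up as separate keys.
print(build_predictions(raw_words))
# {'ONE': ['two'], 'two': ['one'], 'one': ['three']}
```

If the case of the input should be preserved, just skip the lowercasing step.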

Now, if you only want unique words, you can just create a dictionary containing sets instead of lists. That way, when you .add() to the set, it won’t have any effect if the set already contains the word you want to add.

def build_predictions(words: list) -> dict:
    dictionary = {}
    for index, word in enumerate(words[:-1]):
        next_word = words[index + 1]
        if word not in dictionary:
            dictionary[word] = {next_word}     # Creates a set containing next_word
        else:
            dictionary[word].add(next_word)
    return dictionary

At the end of this, if you want to convert the sets back to lists, it’s easy enough to do so. Instead of return dictionary, do:

    return {k: list(v) for k, v in dictionary.items()}
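
One caveat with the set version: sets have no defined order, so the lists you get back from list(v) may not follow the order in which the words appeared in the text. If that order matters, a possible alternative (a sketch, not the only way to do it) is to keep lists and skip words that are already present. The function name build_unique_predictions is made up for this example.

```python
def build_unique_predictions(words: list) -> dict:
    """Like build_predictions, but each follower appears once, in first-seen order."""
    dictionary = {}
    for index, word in enumerate(words[:-1]):
        next_word = words[index + 1]
        # setdefault returns the existing list, or stores and returns a new one
        followers = dictionary.setdefault(word, [])
        if next_word not in followers:
            followers.append(next_word)
    return dictionary


print(build_unique_predictions(["a", "b", "a", "c", "a", "b"]))
# {'a': ['b', 'c'], 'b': ['a'], 'c': ['a']}
```

The membership test on a list is linear in the list's length, so for very long follower lists the set-based version will be faster.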

We can also remove the need to check whether word is in dictionary by using a collections.defaultdict.

We can zip two slices of the list of words: one that goes from the start to the second-to-last item, and one that goes from the second item to the last item. Iterating over the zip of the two slices will give us the current word and the next word in each iteration.

Then, we can just collect these in a defaultdict(list) or a defaultdict(set).

from collections import defaultdict

def build_predictions(words: list) -> dict:
    predictions = defaultdict(list)
    # or        = defaultdict(set)
    for word, next_word in zip(words[:-1], words[1:]):
        predictions[word].append(next_word)
        # or             .add(next_word)

    return predictions
    # or   {k: list(v) for k, v in predictions.items()}
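
To make the zip step concrete, here is a small sketch that prints the pairs produced for the question's example (lowercased by hand here) and then collects them. It also converts the result to a plain dict at the end, which is optional: a defaultdict silently creates an empty entry when you look up a missing key, and that can be surprising for callers. The variable names pairs and plain are just for illustration.

```python
from collections import defaultdict

words = ["one", "two", "one", "three"]

# Pair every word with the word that follows it.
pairs = list(zip(words[:-1], words[1:]))
print(pairs)  # [('one', 'two'), ('two', 'one'), ('one', 'three')]

predictions = defaultdict(list)
for word, next_word in pairs:
    predictions[word].append(next_word)

# Convert to a regular dict so that missing keys raise KeyError
# instead of being created on lookup.
plain = dict(predictions)
print(plain)  # {'one': ['two', 'three'], 'two': ['one']}
```

On Python 3.10 and later, itertools.pairwise(words) yields the same pairs without slicing.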

Answered By: Pranav Hosangadi

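First of all, you should not use words.index(word), because it returns only the index of the first occurrence, which is the wrong position for repeated words. Iterating over the indices should work better:

  for i in range(len(words) - 1):
    word = words[i]
    next_word = words[i + 1]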
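
To see the difference, here is a small sketch comparing the two lookups on the question's example (lowercased). The names via_index and via_position are only for illustration.

```python
words = ["one", "two", "one", "three"]

# list.index always finds the FIRST "one", so the second "one"
# is wrongly paired with "two" again.
via_index = [words[words.index(w) + 1] for w in words[:-1]]
print(via_index)  # ['two', 'one', 'two']

# Using the loop position picks the word that actually follows.
via_position = [words[i + 1] for i in range(len(words) - 1)]
print(via_position)  # ['two', 'one', 'three']
```
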
Answered By: Claude Shannon