How to replace words in a string using a dictionary mapping

Question:

I have the following sentence

a = "you don't need a dog"

and a dictionary

dict =  {"don't": "do not" }

But I can’t use the dictionary to map words in the sentence using the below code:

''.join(str(dict.get(word, word)) for word in a)

Output:

"you don't need a dog"

What am I doing wrong?

Asked By: A.Papa

||

Answers:

Here is one way.

a = "you don't need a dog"

d =  {"don't": "do not" }

res = ' '.join([d.get(i, i) for i in a.split()])

# 'you do not need a dog'

Explanation

  • Never name a variable after a class, e.g. use d instead of dict.
  • Use str.split to split by whitespace.
  • There is no need to wrap str around values which are already strings.
  • str.join works marginally better with a list comprehension versus a generator expression.
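
One caveat with the whitespace split is sketched below, assuming the input may contain punctuation or repeated spaces. A token such as "don't," (with the comma attached) is not a key in d, so it passes through unchanged. The helper name replace_words is hypothetical and only used for illustration.

```python
def replace_words(sentence, mapping):
    """Replace whitespace-separated tokens that are exact keys in mapping."""
    return ' '.join([mapping.get(token, token) for token in sentence.split()])

d = {"don't": "do not"}

print(replace_words("you don't need a dog", d))   # you do not need a dog
# The token here is "don't," (comma attached), which is not a key, so it is left alone.
print(replace_words("you don't, it seems", d))    # you don't, it seems
# split() with no argument also collapses runs of whitespace into one separator.
print(replace_words("you  don't   need", d))      # you do not need
```

If punctuation matters for your data, a regular-expression based approach is probably a better fit than splitting on whitespace.
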
Answered By: jpp

You need to split your sentence on ' ' using split(' '). If you simply iterate over a string, you iterate over its characters:

a = "you don't need a dog"

for word in a:  # that's what you are using as input to your dict-key-replace
    print(word) # the single characters never match a key, that's why yours does not work

Output:

y
o
u

d
o
n
'
t

n
e
e
d

a

d
o
g

Read How to debug small programs

After that, read How to split a string into a list? or use jpp’s solution.
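
As a small sketch of the difference (the variable names chars and words are only illustrative), compare iterating over the string with splitting it first. It also shows how split(' ') and split() differ when the input contains repeated spaces.

```python
a = "you don't need a dog"

# Iterating over a string yields single characters.
chars = [ch for ch in a]
print(chars[:5])          # ['y', 'o', 'u', ' ', 'd']

# Splitting yields words, which can match dictionary keys.
words = a.split(' ')
print(words)              # ['you', "don't", 'need', 'a', 'dog']

# split(' ') keeps empty strings for repeated spaces; split() does not.
print("a  b".split(' '))  # ['a', '', 'b']
print("a  b".split())     # ['a', 'b']
```
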

Answered By: Patrick Artner

You can use the flashtext library for keyword replacement.

Example:

dict =  {"don't": "do not" }

Don’t use the name of a Python built-in (such as dict) as a variable name.

from flashtext import KeywordProcessor


dict_ =  {"don't": "do not" }
a     = "you don't need a dog"


def add_words(word_dict):
    keyword_processor = KeywordProcessor()
    for key, value in word_dict.items():
        keyword_processor.add_keyword(key, value)
    return keyword_processor


def flashtext_test(keyword_processor, sentence):
    new_sentence = keyword_processor.replace_keywords(sentence)
    return new_sentence



keyword_pro = add_words(dict_)
flashtext_test(keyword_pro, a)

output:

'you do not need a dog'
Answered By: Aaditya Ura

All answers are correct, but if your sentence is quite long and the mapping dictionary rather small, consider iterating over the items (key-value pairs) of the dictionary and applying str.replace to the original sentence.

This is the code as suggested by the others. It takes 6.35 µs per loop.

%%timeit

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

search = ' '.join([mapping.get(i, i) for i in search.split()])

Let’s try using str.replace instead. It takes 633 ns per loop.

%%timeit 

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

for key, value in mapping.items():
    search = search.replace(key, value)

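And let’s use a Python 3 list comprehension. This version takes 1.09 µs per loop, which is slower than the plain loop above (633 ns) but still faster than the first version. Note that taking element [0] keeps only the result for the first dictionary entry, so this only works for a single-entry mapping.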

%%timeit 

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

search = [search.replace(key, value) for key, value in mapping.items()][0]

Do you see the difference? For your short sentence, the first and the third versions are about the same speed. But the longer the sentence (search string) gets, the more obvious the difference in performance becomes.
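
To repeat the comparison outside IPython, a minimal sketch with the standard timeit module could look like this. The function names split_join and replace_loop are hypothetical, the lambda adds a little call overhead, and the numbers will differ from the ones quoted above depending on machine and Python version.

```python
import timeit

search = ("you don't need a dog. but if you like dogs, you should think of "
          "getting one for your own. Or a cat?")
mapping = {"don't": "do not"}

def split_join(text, mapping):
    # Look up every whitespace-separated token in the mapping.
    return ' '.join([mapping.get(tok, tok) for tok in text.split()])

def replace_loop(text, mapping):
    # Apply str.replace once per mapping entry.
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text

# For this particular input both approaches produce the same string.
assert split_join(search, mapping) == replace_loop(search, mapping)

runs = 10_000
for func in (split_join, replace_loop):
    seconds = timeit.timeit(lambda: func(search, mapping), number=runs)
    print(f"{func.__name__}: {seconds / runs * 1e6:.2f} microseconds per call")
```
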

Result string is:

'you do not need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?'

Remark: str.replace would also replace occurrences inside longer words, so one needs to ensure that replacement is done for full words only. str.replace itself has no option for that (its only optional argument is a maximum count). Another idea is using regular expressions, as explained in this posting, as they can also handle lower and upper case. Adding trailing white space to the keys in your lookup dictionary won’t work, since you would miss occurrences at the beginning or at the end of a sentence.
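
A minimal sketch of the regular-expression idea, using only the standard re module, might look like the following. The function name replace_whole_words and its ignore_case parameter are assumptions made for illustration, not something taken from the linked posting. Lookarounds are used instead of \b so that keys containing an apostrophe are still treated as one unit.

```python
import re

def replace_whole_words(text, mapping, ignore_case=False):
    """Replace only whole-word occurrences of the mapping's keys."""
    if not mapping:
        return text
    # Longest keys first so that a longer key wins over a shorter prefix.
    keys = sorted(mapping, key=len, reverse=True)
    # (?<!\w) and (?!\w) require that no word character touches the match.
    pattern = r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    flags = re.IGNORECASE if ignore_case else 0
    lookup = {k.lower(): v for k, v in mapping.items()} if ignore_case else mapping

    def substitute(match):
        found = match.group(0)
        return lookup[found.lower() if ignore_case else found]

    return re.sub(pattern, substitute, text, flags=flags)

mapping = {"cat": "dog"}
print("concatenate the cat".replace("cat", "dog"))          # condogenate the dog
print(replace_whole_words("concatenate the cat", mapping))  # concatenate the dog
print(replace_whole_words("Don't go", {"don't": "do not"}, ignore_case=True))  # do not go
```

Note that with ignore_case=True the replacement text is inserted as written in the dictionary, so the original capitalisation is not preserved.
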

Answered By: Matthias