How to remove repeated sentences from a string

Question:

I have an issue that I do not know how to tackle.

For example: I have a function that returns a string containing multiple sentences separated by commas, and some of them are repeated:

Like:

"lorem ipsum dolor, lorem ipsum dolor, lorem ipsum dolor"

I need to remove the repeated sentences, but without comparing word by word. The comparison should be sentence by sentence, split on ",", because other sentences may contain repeated words that should not be removed.

Input example:

"lorem ipsum dolor, lorem ipsum dolor, lorem mark dol"

Output desired:

"lorem ipsum dolor, lorem mark dol"

Asked By: Elias Prado


Answers:

This solution is based on Tim Roberts' comment. The only difference is that it uses OrderedDict to preserve the order of the sentences:

from collections import OrderedDict

string = 'lorem ipsum dolor, lorem ipsum dolor, lorem mark dol'
string = ', '.join(OrderedDict.fromkeys(string.split(', ')))
print(string)

Output:

lorem ipsum dolor, lorem mark dol
Answered By: Alderven

This solution is not based on Tim Roberts' comment but uses the same tools:

text = "lorem ipsum dolor, lorem ipsum dolor, lorem mark dol"
text = ', '.join(set(map(str.strip, text.split(","))))

Unlike Alderven’s answer, it needs no imports. Note that a set is unordered, so the sentences may come back in a different order than in the input.
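
If the original order matters, one possible sketch that still avoids imports is to remember already-seen sentences in a set while appending new ones to a list. The helper name `dedupe_keep_order` is hypothetical and not part of the original answer:

```python
def dedupe_keep_order(text):
    """Drop repeated comma-separated sentences, keeping first occurrences in order."""
    seen = set()    # sentences already emitted
    result = []     # unique sentences in input order
    for sentence in text.split(","):
        sentence = sentence.strip()  # ignore whitespace around the separator
        if sentence not in seen:
            seen.add(sentence)
            result.append(sentence)
    return ", ".join(result)


print(dedupe_keep_order("lorem ipsum dolor, lorem ipsum dolor, lorem mark dol"))
# lorem ipsum dolor, lorem mark dol
```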

Answered By: artem

Since Python 3.7 the built-in dict is guaranteed to preserve insertion order (CPython 3.6 already did so as an implementation detail), so a regular dict works too and no additional module is required.
The code splits on ',' and strips leading and trailing whitespace from each sentence.

txt = "lorem ipsum dolor, lorem ipsum dolor , lorem mark dol"
my_dict = dict.fromkeys(map(str.strip, txt.split(',')))
print(*my_dict, sep=', ')

result is:
lorem ipsum dolor, lorem mark dol
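
The same idea can be wrapped in a small reusable function. This is only a sketch: the name `remove_repeated_sentences` and the `sep` parameter are assumptions for illustration, and skipping empty pieces (for example after a trailing comma) is an extra choice not discussed above. It relies on dict preserving insertion order, so it assumes Python 3.7 or later:

```python
def remove_repeated_sentences(text, sep=","):
    """Return text with repeated sep-separated sentences removed, order preserved."""
    # Strip whitespace around each piece so "a" and "a " count as the same sentence.
    parts = (part.strip() for part in text.split(sep))
    # dict.fromkeys keeps only the first occurrence of each key, in insertion order.
    unique = dict.fromkeys(part for part in parts if part)
    return f"{sep} ".join(unique)


txt = "lorem ipsum dolor, lorem ipsum dolor , lorem mark dol"
print(remove_repeated_sentences(txt))
# lorem ipsum dolor, lorem mark dol
```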

Answered By: lroth