Find duplicated word in string and in which two positions it appears
Question:
I am struggling a bit with this problem. I want to write code that finds a duplicated word in an input string and prints that word, along with the positions at which it appears in the string.
This is my code so far, but in the output the word is still repeated. Any ideas how to fix it?
Input: juice bread tea water apple tea carrot coconut
Output – desired: tea 1 4
string = str(input())
string_list = string.split(' ')
for j in range(len(string_list)):
    duplicate = string_list.count(string_list[j])
    if duplicate > 1:
        print((string_list[j]), (j), end=' ')
Output – current: tea 2 tea 5
Answers:
string = str(input())
string_list = string.split(' ')
# every word that occurs more than once (repeated once per occurrence)
duplicates = [word for word in string_list if string_list.count(word) > 1]
unique_duplicates = list(set(duplicates))
occurrences = list()
# collect the positions of the (single) duplicated word
for index, elem in enumerate(string_list):
    if elem == unique_duplicates[0]:
        occurrences.append(index)
print(f"{unique_duplicates[0]} {occurrences[0]} {occurrences[1]}")
Try the below
from collections import defaultdict

data = defaultdict(list)
_input = 'juice bread tea water apple tea carrot coconut'
words = _input.split(' ')
for idx, word in enumerate(words):
    data[word].append(idx)
for word, index_list in data.items():
    if len(index_list) > 1:
        print(f'{word} -> {index_list}')
Output:
tea -> [2, 5]
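If the output should look like the format asked for in the question (the word followed by its positions on one line), the printing step could be adapted roughly like this. It is only a sketch: the helper name duplicate_positions is invented here for illustration, and the indices are zero-based, as in the code above.

```python
from collections import defaultdict


def duplicate_positions(text):
    """Map each repeated word to the list of indices where it occurs (hypothetical helper)."""
    positions = defaultdict(list)
    for idx, word in enumerate(text.split()):
        positions[word].append(idx)
    # keep only the words that occur more than once
    return {word: idxs for word, idxs in positions.items() if len(idxs) > 1}


found = duplicate_positions('juice bread tea water apple tea carrot coconut')
for word, idxs in found.items():
    print(word, *idxs)  # -> tea 2 5
```

Unpacking the index list with * prints the positions separated by spaces instead of as a Python list.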
Assuming there’s always exactly one duplicated word, you can simply break the loop when you find it, then get the index of its next occurrence in the list.
def find_duplicate_word(string):
    words = string.split(' ')
    for first_occ, word in enumerate(words):
        if words.count(word) > 1:
            break
    second_occ = words.index(word, first_occ+1)
    return word, first_occ, second_occ

result = find_duplicate_word('juice bread tea water apple tea carrot coconut')
print(*result)  # -> tea 2 5
Note that words.count() has to do a search on every loop, which is fine for small inputs, but for bigger inputs, I’d use something more like balderman’s answer, with defaultdict(list). That would bring the solution from O(n^2) to O(n). Plus it’d work if the number of duplicates is not always one.
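As a rough sketch of the linear-time idea for the single-duplicate case, a plain dict can remember where each word was first seen, so the function can return as soon as a word shows up a second time. The name find_first_repeat is made up for this illustration, and returning None when nothing repeats is an assumption, not something the answers above specify.

```python
def find_first_repeat(string):
    """Return (word, first_index, second_index) for the first word seen a second time.

    Hypothetical helper; returns None if no word occurs twice.
    """
    first_seen = {}  # word -> index of its first occurrence
    for idx, word in enumerate(string.split()):
        if word in first_seen:
            # dict lookup is O(1) on average, so the whole scan is O(n)
            return word, first_seen[word], idx
        first_seen[word] = idx
    return None


result = find_first_repeat('juice bread tea water apple tea carrot coconut')
print(*result)  # -> tea 2 5
```

Unlike the count()-based version, this never rescans the list, but it only reports the first two positions of the first repeated word.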
The task can also be done this way:
txt = 'bread tea water apple tea carrot coconut juice'.split()
seen = []
for word in txt:
    if txt.count(word) > 1 and word not in seen:
        print(word, *[ind for ind, i in enumerate(txt) if i == word])
        seen.append(word)