Anagram function returns False for unknown reason in some words – Python
Question:
I’ve tried to create a function that returns True if two words are anagrams (have the same letters) to each other.
I realized that my counter counts to 13 but not sure what is the issue.
def is_anagram (worda: str , wordb: str) -> bool:
count_same_letters = 0
if len(worda) == len(wordb):
for i in worda:
for j in wordb:
if i == j:
count_same_letters = count_same_letters+1
print(count_same_letters)
print(len(wordb))
return count_same_letters == len(wordb)
else:
return False
print(is_anagram("anagram","nagaram"))
while trying the string ‘abc’ abd ‘bca’ the output was True as I expected, but the strings ‘anagram’ and ‘nagaram’returns False
Answers:
The issue is that your code has counted the same letter multiple times. Try to run your code here – https://pythontutor.com/ and see each step-by-step the counts.
You could simplify it by using collections.Counter() in standard lib to make the faster counts and compare with each letters.
def is_anagram(str1, str2):
a = str1.lower() # if don't care case, can skip these steps
b = str2.lower()
return Counter(a) == Counter(b)
if __name__ == '__main__':
string1 = 'funeral'
string2 = 'realFun'
print(is_anagram('anagram', 'nagaram')) # True
print(is_anagram(string1, string2)) # True
print(is_anagram('abbac', 'cbbac')) # False
You are going through each pair of letters.
For instance if you have two 5-letter words, you’re going through 25 pairs of letters. If you input aaaab
and baaaa
, your count_same_letters
counter will get to 4*4 + 1*1
(4*4 pairs of a
‘s and 1*1 pairs of b
‘s).
Change your algorithm.
As mentioned, the problem here is that letters occurring more than once are counted again. Supposing you need to implement the algorithm yourself without Count
, here’s one approach to the problem:
def is_anagram (worda: str , wordb: str) -> bool:
if len(worda) == len(wordb):
remaining_letters = wordb
for letter in worda:
letter_index = remaining_letters.find(letter)
if letter_index == -1:
return False # no! not anagram
remaining_letters = remaining_letters[:letter_index] + remaining_letters[letter_index+1:]
if len(remaining_letters) == 0:
return True
else:
return False
print(is_anagram("anagram", "gramnaa"))
print(is_anagram("omg", "omggg"))
print(is_anagram("abc", "cba"))
print(is_anagram("aaaa", "aaaaaa"))
By "deleting" every letter we find from remaining_letters
our code will have to search through a smaller string on each iteration, possibly making it perform faster then some other alternatives.
I’ve tried to create a function that returns True if two words are anagrams (have the same letters) to each other.
I realized that my counter counts to 13 but not sure what is the issue.
def is_anagram (worda: str , wordb: str) -> bool:
count_same_letters = 0
if len(worda) == len(wordb):
for i in worda:
for j in wordb:
if i == j:
count_same_letters = count_same_letters+1
print(count_same_letters)
print(len(wordb))
return count_same_letters == len(wordb)
else:
return False
print(is_anagram("anagram","nagaram"))
while trying the string ‘abc’ abd ‘bca’ the output was True as I expected, but the strings ‘anagram’ and ‘nagaram’returns False
The issue is that your code has counted the same letter multiple times. Try to run your code here – https://pythontutor.com/ and see each step-by-step the counts.
You could simplify it by using collections.Counter() in standard lib to make the faster counts and compare with each letters.
def is_anagram(str1, str2):
a = str1.lower() # if don't care case, can skip these steps
b = str2.lower()
return Counter(a) == Counter(b)
if __name__ == '__main__':
string1 = 'funeral'
string2 = 'realFun'
print(is_anagram('anagram', 'nagaram')) # True
print(is_anagram(string1, string2)) # True
print(is_anagram('abbac', 'cbbac')) # False
You are going through each pair of letters.
For instance if you have two 5-letter words, you’re going through 25 pairs of letters. If you input aaaab
and baaaa
, your count_same_letters
counter will get to 4*4 + 1*1
(4*4 pairs of a
‘s and 1*1 pairs of b
‘s).
Change your algorithm.
As mentioned, the problem here is that letters occurring more than once are counted again. Supposing you need to implement the algorithm yourself without Count
, here’s one approach to the problem:
def is_anagram (worda: str , wordb: str) -> bool:
if len(worda) == len(wordb):
remaining_letters = wordb
for letter in worda:
letter_index = remaining_letters.find(letter)
if letter_index == -1:
return False # no! not anagram
remaining_letters = remaining_letters[:letter_index] + remaining_letters[letter_index+1:]
if len(remaining_letters) == 0:
return True
else:
return False
print(is_anagram("anagram", "gramnaa"))
print(is_anagram("omg", "omggg"))
print(is_anagram("abc", "cba"))
print(is_anagram("aaaa", "aaaaaa"))
By "deleting" every letter we find from remaining_letters
our code will have to search through a smaller string on each iteration, possibly making it perform faster then some other alternatives.