How do I let Python know that these two words are the same?

Question:

I have a .csv file. Here is an example:

Table format:

A AA
BB B
C CC
D D D DD

Text format:

A,AA
BB,B
C,CC
D D,D DD

I want Python to know that A is equal to AA, BB is equal to B, and C is equal to CC. Also, the fourth example has spaces.

It can also be reversed, such as checking whether AA is equal to A.

What I’m working on is a search engine. A word may be written in two ways, so I need to do this.

For example, I have a boolean check on the search results: if the query is A and the search returns AA, the check should be True. Of course, returning A should also be True.

My code:

query = …  # "AA"

result_list = …  # ["A"]

sorted_list = [element for element in result_list if element.find(query) != -1]

Feel free to leave a comment if you need more information.


Asked By: My Car


Answers:

It might help to check the hex codes of the characters. Something like this:

text = "A AA"
hex_code = ":".join("{:02x}".format(ord(x)) for x in text)

# 0x41 is the code point of "A"
if "41" in hex_code:
    print("True")
Answered By: melody

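You can solve this by simplifying the strings on the left and right sides before comparing them. For example, A is simplified to A, AA to A, D D to D, and so on. Try this:

import pandas as pd

def simplify(s):
  # Drop spaces and duplicate characters; sort so the result has a stable order
  return "".join(sorted(set(s.replace(" ", ""))))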

# data.csv
# A,AA
# BB,B
# C,CC
# D D,D DD

df = pd.read_csv('data.csv', header=None)
df = df.values.tolist()
result = []
for i in df:
  result.append(simplify(i[0]) == simplify(i[1]))
print(result)

The output should look like this:

[True, True, True, True]
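
To tie this back to the search filter in the question, a minimal sketch might look like the following. It assumes a variant of simplify that drops spaces and sorts the distinct characters, and the contents of result_list are made-up sample data, not values from the question.

```python
def simplify(s):
    # Drop spaces, keep each distinct character once, sorted for a stable order.
    return "".join(sorted(set(s.replace(" ", ""))))

query = "AA"
result_list = ["A", "B", "A A", "CC"]  # hypothetical sample results

# Keep every result whose simplified form matches the simplified query.
matches = [element for element in result_list if simplify(element) == simplify(query)]
print(matches)  # ['A', 'A A']
```

Because both sides are simplified before the comparison, the check works in either direction (A against AA, or AA against A).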
Answered By: Jordy

Turn both strings into sets, add a space to one, and check whether it contains every character of the other. Wrap that into a function and use it as you like:

def isequal_with_spaces(s1, s2):
    set1 = set(s1 + " ")
    return set1.issuperset(s2)


assert isequal_with_spaces('A', 'AA')
assert not isequal_with_spaces('A', 'BA')
assert isequal_with_spaces('A', 'A A')
assert isequal_with_spaces('AA', 'AAA A ')
assert not isequal_with_spaces('B', 'A A')

This doesn’t take case into account, and it might not behave quite as you need for strings like 'AB' and 'A A' (which it treats as equal), but that wasn’t specified in the question =)
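
To make those two caveats concrete, here is a small sketch. The same_letters helper is a hypothetical stricter variant, not part of the original answer, and it assumes that ignoring case and spaces is what you want.

```python
def isequal_with_spaces(s1, s2):
    # Same function as in this answer, repeated so the snippet runs on its own.
    set1 = set(s1 + " ")
    return set1.issuperset(s2)

# The two caveats in practice:
print(isequal_with_spaces('AB', 'A A'))  # True: every character of 'A A' is in 'AB '
print(isequal_with_spaces('A', 'a'))     # False: the comparison is case-sensitive

def same_letters(s1, s2):
    # Hypothetical stricter check: same set of letters, ignoring spaces and case.
    def letters(s):
        return set(s.replace(" ", "").casefold())
    return letters(s1) == letters(s2)

print(same_letters('AB', 'A A'))  # False
print(same_letters('A', 'a a'))   # True
```

Note that same_letters still ignores letter order and repetition, so it would treat 'AB' and 'BA' as the same word.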


Moreover, if a "word" could be defined as "capital letter followed by any number of spaces or the same letters", and that’s guaranteed, you can simplify isequal_with_spaces even further:

def isequal_with_spaces(s1, s2):
    return s1[0] == s2[0]
Answered By: Klas Š.

After using the csv module to read the rows as lists, remove any spaces and compare the sets of characters in each row:

all(set(x.replace(' ', '')) == set(A[0]) for x in A)

>>> A = ['AA']
>>> D = ['D','D DD']
>>> all(set(x.replace(' ', '')) == set(A[0]) for x in A)
True
>>> all(set(x.replace(' ', '')) == set(D[0]) for x in D)
True
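
Put together with the csv module, the idea could be sketched as follows. The io.StringIO object is only a stand-in for the real file (an assumption so the snippet runs on its own), and letters is a hypothetical helper name.

```python
import csv
import io

# Stand-in for open('data.csv', newline=''); the rows are the question's sample data.
sample = io.StringIO("A,AA\nBB,B\nC,CC\nD D,D DD\n")

def letters(word):
    # Set of characters in the word, with spaces removed.
    return set(word.replace(" ", ""))

results = []
for row in csv.reader(sample):
    first = letters(row[0])
    # A row passes when every word uses the same characters as the first word.
    results.append(all(letters(word) == first for word in row))

print(results)  # [True, True, True, True]
```

Unlike the one-liner above, this version also strips spaces from the first word of each row, so a row that starts with 'D D' is compared correctly.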

One way to handle this scenario is to use a dictionary that maps words to their equivalent words. For example:

word_map = {
    "A": "AA",
    "AA": "A",
    "BB": "B",
    "B": "BB",
    "C": "CC",
    "CC": "C",
    "D D": "D DD",
    "D DD": "D D",
}

Then, you can use this dictionary to check if the query is equivalent to any of the elements in the result_list:

equivalent_word = word_map.get(query, None)
if equivalent_word in result_list:
    sorted_list.append(equivalent_word)

Add every pair of equivalent words to word_map in both directions, so that the lookup works whichever spelling is used as the query.
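
Rather than typing the dictionary by hand, it could be built from the CSV file itself. The following is a minimal sketch under some assumptions: every row has exactly two columns, io.StringIO stands in for the real file, and accepted is a hypothetical name.

```python
import csv
import io

# Stand-in for open('data.csv', newline=''); the rows are the question's sample data.
sample = io.StringIO("A,AA\nBB,B\nC,CC\nD D,D DD\n")

# Map each word to its equivalent, in both directions.
word_map = {}
for left, right in csv.reader(sample):
    word_map[left] = right
    word_map[right] = left

query = "AA"
result_list = ["A", "B"]  # hypothetical sample results

# Accept the query itself or its equivalent spelling.
accepted = {query, word_map.get(query)}
sorted_list = [element for element in result_list if element in accepted]
print(sorted_list)  # ['A']
```

If a word can have more than two spellings, a plain word-to-word dictionary is no longer enough, and mapping each word to a set of equivalents may be a better fit.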

Answered By: Shahzeb Qureshi