Search for an exact match of a string in pandas.DataFrame

Question:

I have a DataFrame as follows:

import pandas as pd

data = [
    ['2022-12-04 00:00:00', 5000.00],
    ['2022-12-04 00:00:00', 6799.50],
    ['2022-12-04 00:00:00', 5000.00],
    ['2023-01-10 00:00:00', 5000.00]
]

df = pd.DataFrame(data, columns=['Date', 'Float'])


date_input = "2022-12-04 00:00:00"
float_input = "5000.00"

What is the best way to check whether the DataFrame contains a row that exactly matches both date_input and float_input?

In this case I expect the output ‘Yes’, since that combination of date and float appears in the first row of the DataFrame.

I tried the following, but it cannot tell whether float_input occurs on the same row as date_input:

if (df['Date'] == pd.Timestamp(date_input)).any() and (df['Float'] == float(float_input)).any():
    print('YES')
else:
    print("No")

Asked By: Djoe

||

Answers:

If I understand your problem correctly, you can check whether the DataFrame contains a row that exactly matches both values by using the DataFrame's loc indexer to select the rows where ‘Date’ equals date_input and ‘Float’ equals float_input.

Then check whether the resulting DataFrame is empty: if it is not empty, at least one row matches both date_input and float_input exactly.

Modified Code

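date_input = "2022-12-04 00:00:00"
float_input = "5000.00"

# The sample data stores the dates as strings; convert them so that they
# can be compared with a Timestamp
df['Date'] = pd.to_datetime(df['Date'])

# Use loc to select the rows where both conditions hold on the same row
matches = df.loc[(df['Date'] == pd.Timestamp(date_input)) & (df['Float'] == float(float_input))]

# A non-empty result means at least one row matched
if not matches.empty:
    print('YES')
else:
    print('NO')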
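
To illustrate why the two conditions have to be combined row by row, here is a minimal sketch. It rebuilds the question's sample data and searches for a date and an amount that both occur in the table, but never on the same row. The variable names `date_mask`, `float_mask`, `separate` and `combined` are only illustrative.

```python
import pandas as pd

data = [
    ['2022-12-04 00:00:00', 5000.00],
    ['2022-12-04 00:00:00', 6799.50],
    ['2022-12-04 00:00:00', 5000.00],
    ['2023-01-10 00:00:00', 5000.00],
]
df = pd.DataFrame(data, columns=['Date', 'Float'])
df['Date'] = pd.to_datetime(df['Date'])

# 2023-01-10 exists and 6799.50 exists, but not on the same row
date_mask = df['Date'] == pd.Timestamp('2023-01-10 00:00:00')
float_mask = df['Float'] == 6799.50

# Checking each column on its own reports a match that is not really there
separate = bool(date_mask.any() and float_mask.any())

# Combining the masks with & before calling any() checks row by row
combined = bool((date_mask & float_mask).any())

print(separate, combined)
```

Calling `.any()` on the combined mask also avoids building the intermediate DataFrame when only a yes/no answer is needed.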
Answered By: CRM000

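# Compare whole rows: convert the values to a list of lists so that `in`
# tests for an exact row match (this assumes 'Date' still holds strings)
if [date_input, float(float_input)] in df.values.tolist():
    print('yes')
else:
    print('no')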
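
A membership test against a NumPy array compares element by element rather than row by row, so a related sketch is to compare whole rows as tuples instead. This assumes the ‘Date’ column still holds the original strings and that float_input is converted to a float first; the names `rows` and `found` are illustrative.

```python
import pandas as pd

data = [
    ['2022-12-04 00:00:00', 5000.00],
    ['2022-12-04 00:00:00', 6799.50],
    ['2022-12-04 00:00:00', 5000.00],
    ['2023-01-10 00:00:00', 5000.00],
]
df = pd.DataFrame(data, columns=['Date', 'Float'])

date_input = "2022-12-04 00:00:00"
float_input = "5000.00"

# Build the set of (date, value) pairs that actually occur as rows
rows = set(zip(df['Date'], df['Float']))

# A pair is found only if both values sit on the same row
found = (date_input, float(float_input)) in rows
print('yes' if found else 'no')
```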

Answered By: ilshatt

Hello, I worked through the process you tried. Let’s break it down:

Step 1:

Your data set has invalid values: the dates have to be written as strings before the data is passed to pd.DataFrame.

data = [
    ['2022-12-04 00:00:00', 5000.00],
    ['2022-12-04 00:00:00', 6799.50],
    ['2022-12-04 00:00:00', 5000.00],
    ['2023-01-10 00:00:00', 5000.00]
]

Step 2:

With the data corrected, we build the DataFrame.

Displaying the DataFrame shows that one of the columns we need to search has the wrong type, so it has to be converted.
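
As a rough sketch of this step, the DataFrame can be built from the corrected data and its column types inspected. The column names are taken from the question; the check at the end is only one possible way to confirm that ‘Date’ is not yet a datetime column.

```python
import pandas as pd

data = [
    ['2022-12-04 00:00:00', 5000.00],
    ['2022-12-04 00:00:00', 6799.50],
    ['2022-12-04 00:00:00', 5000.00],
    ['2023-01-10 00:00:00', 5000.00],
]
df = pd.DataFrame(data, columns=['Date', 'Float'])

# Show the inferred type of every column
print(df.dtypes)

# The dates were passed in as text, so 'Date' is not a datetime column yet
print(pd.api.types.is_datetime64_any_dtype(df['Date']))
```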

Step 3:

pandas provides to_datetime for converting the column to datetime. I passed the dayfirst parameter, but it is optional: these strings are already in year-month-day order, so it can be left out. The time part is always 00:00:00 here, so I did nothing special with it. The to_datetime documentation is worth reading.

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
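
If you would rather state the layout of the strings explicitly than rely on inference, to_datetime also accepts a format argument. The sketch below assumes every value follows the pattern used in the question; the Series name `dates` is illustrative.

```python
import pandas as pd

dates = pd.Series(['2022-12-04 00:00:00', '2023-01-10 00:00:00'])

# Spell out the expected layout: year-month-day hour:minute:second
parsed = pd.to_datetime(dates, format='%Y-%m-%d %H:%M:%S')

# Confirm which component was read as the month and which as the day
print(parsed.dt.month.tolist(), parsed.dt.day.tolist())
```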

Step 4:

Now for the queries. The simplest way is to filter in stages, applying one condition at a time. This finds the rows with your values:

search_date = df[df['Date'] == pd.Timestamp("2022-12-04 00:00:00")]
search_number = search_date[search_date['Float'] == 5000.00]
search_number

Another way is to build a boolean mask for each condition and combine the two with the & operator. Indexing the DataFrame with the combined mask returns the matching rows:

searchDate = df['Date'] == pd.Timestamp("2022-12-04 00:00:00")
searchFloat = df['Float'] == 5000.00
query = df[searchDate & searchFloat]
query

The same query can also be written as a single expression:

query_dataframe = df[(df['Date'] == pd.Timestamp("2022-12-04 00:00:00")) & (df['Float'] == 5000.00)]
query_dataframe

Step 5:

I did not fully understand the check in your question. To keep it simple, I test whether the query result is empty and print the answer as in your example:

if len(query_dataframe):
    print('Yes')
else:
    print('No')
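
The steps above could be wrapped in a small helper. This is only a sketch: the function name `has_match` and the `tol` parameter are hypothetical, and comparing floats with numpy.isclose is my own assumption, for cases where the stored values might carry rounding error. With a tolerance of zero it behaves like an exact comparison.

```python
import numpy as np
import pandas as pd


def has_match(df, date_input, float_input, tol=1e-9):
    """Return True if one row matches both the date and the number."""
    # Convert the column so that string dates and datetimes both work
    date_mask = pd.to_datetime(df['Date']) == pd.Timestamp(date_input)
    # Absolute tolerance only; rtol=0.0 keeps the check close to exact
    float_mask = np.isclose(
        df['Float'].astype(float), float(float_input), rtol=0.0, atol=tol
    )
    # Both conditions must hold on the same row
    return bool((date_mask & float_mask).any())


data = [
    ['2022-12-04 00:00:00', 5000.00],
    ['2022-12-04 00:00:00', 6799.50],
    ['2022-12-04 00:00:00', 5000.00],
    ['2023-01-10 00:00:00', 5000.00],
]
df = pd.DataFrame(data, columns=['Date', 'Float'])

print('Yes' if has_match(df, "2022-12-04 00:00:00", "5000.00") else 'No')
```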

Answered By: Eddi