How can I only show the numbers on a newly calculated df column?

Question

I’m having an issue where calculated results in "Price Tier Index" column are showing up as "object" and not as an int (see code and screenshot below).

import pandas as pd
import numpy as np
import csv
import io
from IPython.display import display
from google.colab import files
uploaded = files.upload()    

#this is a variable to be used in the new Price Tier Index column  
men_bw_avg_eq_price = 0.28

df = pd.read_csv(io.BytesIO(uploaded["Men BW - competitive price tier analysis - REPULLED.csv"]))
df= df.fillna(0)

price_index = []
total_rows = len(df['Products']) 
for i in range(total_rows): 
      pi_value = (df.loc[i, ['Avg EQ Price']]/men_bw_avg_eq_price) * 100  
      price_index.append(pi_value)

df['Price Tier Index'] = price_index
df[['Brand', 'Gender', 'Subcategory']] = df.Products.str.split("|", expand=True)
df_new = df[['Brand', 'Price Tier Index']]
display(df_new)

screenshot of the current issue

screenshot of what I’m looking to do

Thank you in advance for your help!

Asked By: Baek Kim

||

Source

Answer 1

All problem is because you create list of dataframes instead of list of values

You need .values[0] or .to_list()[0] to convert data to list and get first value

first_value = df.loc[i, ['Avg EQ Price']].values[0]
pi_value = (first_value/men_bw_avg_eq_price) * 100

first_value = df.loc[i, ['Avg EQ Price']].to_list()[0]
pi_value = (first_value/men_bw_avg_eq_price) * 100

But you can calculate it without for-loop

df['Price Tier Index'] = (df['Avg EQ Price']/men_bw_avg_eq_price) * 100

Minimal working example with some data

import pandas as pd

#this is a variable to be used in the new Price Tier Index column  
men_bw_avg_eq_price = 0.28

data = {
    'Products': ['A|B|C','D|E|F','G|H|I'], 
    'Avg EQ Price': [4,5,6], 
}
df = pd.DataFrame(data)
#df = pd.read_csv(io.BytesIO(uploaded["Men BW - competitive price tier analysis - REPULLED.csv"]))
df = df.fillna(0)

#price_index = []
#total_rows = len(df['Products'])
#for i in range(total_rows): 
#      #first_value = df.loc[i, ['Avg EQ Price']].values[0]
#      first_value = df.loc[i, ['Avg EQ Price']].to_list()[0]
#      pi_value = (first_value/men_bw_avg_eq_price) * 100  
#      price_index.append(pi_value)

price_index = (df['Avg EQ Price']/men_bw_avg_eq_price) * 100  

df['Price Tier Index'] = price_index
df[['Brand', 'Gender', 'Subcategory']] = df.Products.str.split("|", expand=True)
df_new = df[['Brand', 'Price Tier Index']]

print(df_new)

Result:

  Brand  Price Tier Index
0     A       1428.571429
1     D       1785.714286
2     G       2142.857143

Answered By: furas

How can I only show the numbers on a newly calculated df column?

Question:

Answers: