How do I use the first row of my spreadsheet as my DataFrame column names instead of 0, 1, 2, etc.?
Question:
I want my DataFrame to use the values in the first row of the sheet as its column names instead of the default numbering 0, 1, 2, and so on. How do I do this?
I tried using pandas and openpyxl modules to turn my Excel spreadsheet into a dataframe.
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
wb = load_workbook(filename='Budget1.xlsx')
print(wb.sheetnames)
sheet_ranges=wb['May 2019']
print(sheet_ranges['A3'].value)
ws=wb['May 2019']
df=pd.DataFrame(ws.values)
print(df) # This displays my dataframe.
I expect the column titles of my DataFrame to be Date, Description, and Amount instead of 0, 1, 2.
Answers:
After reading the data into a DataFrame, you can take the first row, drop it from the data, and use it as the column names:
columnNames = df.iloc[0]
df = df[1:]
df.columns = columnNames
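A minimal, self-contained sketch of the same idea, using made-up rows in place of the spreadsheet. The sample values and the helper name promote_first_row are illustrative assumptions, not part of the original workbook or of pandas. It also renumbers the index from 0, which the three lines above do not do:

```python
import pandas as pd


def promote_first_row(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame whose first row has become the column labels."""
    header = df.iloc[0]                        # first row holds the titles
    out = df.iloc[1:].reset_index(drop=True)   # remaining rows, renumbered from 0
    out.columns = list(header)                 # plain list avoids a named column index
    return out


# Hypothetical data, shaped like what pd.DataFrame(ws.values) would produce.
raw = pd.DataFrame([
    ("Date", "Description", "Amount"),
    ("2019-05-01", "Rent", 1200),
    ("2019-05-03", "Groceries", 85),
])

df = promote_first_row(raw)
print(df.columns.tolist())  # ['Date', 'Description', 'Amount']
```

Because the header row was mixed in with the data, every column ends up with dtype object; calling df.infer_objects() afterwards is one way to recover numeric types.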
Or you can read the file directly with pandas, which uses the first row as the column names by default:
excelDF = pd.ExcelFile('Budget1.xlsx')
df1 = pd.read_excel(excelDF, 'SheetNameThatYouWantTORead')
print(df1.columns)
import pandas as pd
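data = pd.read_excel("filename.xlsx", sheet_name='xyz', header=0)
header=0 reads the first row as the column names (this is also the default). Do not add skiprows=[0] here: rows are skipped before the header is chosen, so the second row would become the header instead. Older pandas versions spelled the keyword sheetname; current versions use sheet_name.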
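If you would rather keep the openpyxl route from the question, one option is to pull the first tuple out of ws.values before building the DataFrame. The sketch below uses a small generator, fake_ws_values, as a hypothetical stand-in for ws.values so that it runs without a workbook; the sample rows are invented:

```python
import pandas as pd


def fake_ws_values():
    """Hypothetical stand-in for openpyxl's ws.values (one tuple per row)."""
    yield ("Date", "Description", "Amount")
    yield ("2019-05-01", "Rent", 1200)
    yield ("2019-05-03", "Groceries", 85)


rows = fake_ws_values()        # with a real workbook: rows = ws.values
header = list(next(rows))      # consume the first row as the column titles
df = pd.DataFrame(list(rows), columns=header)  # remaining rows are the data

print(df.columns.tolist())  # ['Date', 'Description', 'Amount']
```

Since the titles never enter the data here, numeric columns such as Amount keep a numeric dtype.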