How to print a specific row of a pandas DataFrame?
Question:
I have a massive DataFrame, and I’m getting the error:
TypeError: ("Empty 'DataFrame': no numeric data to plot", 'occurred at index 159220')
I’ve already dropped nulls, and checked dtypes for the DataFrame so I have no guess as to why it’s failing on that row.
How do I print out just that row (at index 159220) of the DataFrame?
Answers:
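Use the ix operator (note: ix was deprecated in pandas 0.20 and removed in 1.0, so this only works on old versions):
print(df.ix[159220])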
Sounds like you’re calling df.plot(). That error indicates that you’re trying to plot a frame that has no numeric data. The data types shouldn’t affect what you print().
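As a rough sketch of how one might check for the "no numeric data" condition before plotting (the frame and its column names here are made up for illustration, not taken from the question), select_dtypes shows which columns pandas considers numeric:

```python
import pandas as pd

# Hypothetical frame: the numbers were read in as strings, so nothing is numeric.
df = pd.DataFrame({"label": ["a", "b"], "value": ["1", "2"]})

# Keep only the columns pandas treats as numeric.
numeric_before = df.select_dtypes(include="number")
print(numeric_before.shape)  # zero columns survive

# Converting the column gives plotting something to work with.
df["value"] = pd.to_numeric(df["value"])
numeric_after = df.select_dtypes(include="number")
print(list(numeric_after.columns))
```

If the numeric selection comes back with zero columns, a plot call on that frame has nothing to draw, which is consistent with the error in the question.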
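Use print(df.iloc[159220]). Note that iloc selects by integer position; if 159220 is an index label rather than a position, use print(df.loc[159220]) instead.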
When you call loc with a scalar value, you get a pd.Series. That series will then have one dtype. If you want to see the row as it is in the dataframe, you’ll want to pass an array-like indexer to loc.
Wrap your index value with an additional pair of square brackets:
print(df.loc[[159220]])
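A small sketch of the difference, using a made-up two-row frame whose index labels are chosen only to mirror the question:

```python
import pandas as pd

# Hypothetical data; only the index label 159220 comes from the question.
df = pd.DataFrame(
    {"name": ["Alice", "Bob"], "age": [25, 30]},
    index=[159219, 159220],
)

as_series = df.loc[159220]    # scalar label -> Series with a single dtype
as_frame = df.loc[[159220]]   # list of labels -> one-row DataFrame

print(type(as_series).__name__, as_series.dtype)
print(type(as_frame).__name__)
print(as_frame.dtypes)        # per-column dtypes are preserved
```

Because the row mixes a string and an integer, the Series falls back to the object dtype, while the one-row DataFrame keeps each column's own dtype.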
To print a specific row, pandas offers a couple of indexers:
loc – selects by label, i.e. the row's index label and the column name
iloc – here i stands for integer; it selects by integer position, i.e. the row number
ix – a mix of label and integer indexing (not available in pandas >=1.0)
Below are examples of how to use the first two options for a specific row:
loc
df.loc[row,column]
For the first row and all columns:
df.loc[0,:]
For the first row and some specific column:
df.loc[0,'column_name']
iloc
For the first row and all columns:
df.iloc[0,:]
For the first row and some specific columns i.e. first three cols:
df.iloc[0,0:3]
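The examples above give the same result for loc and iloc only because the index labels happen to equal the positions. A minimal sketch with a non-default index (the frame is invented for illustration) shows where they diverge:

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions.
df = pd.DataFrame({"value": ["a", "b", "c"]}, index=[10, 20, 30])

by_label = df.loc[20, "value"]   # row whose index label is 20
by_position = df.iloc[0, 0]      # first row, first column

print(by_label, by_position)

# There is no row labelled 0, so loc raises KeyError here.
try:
    df.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False
print(label_zero_found)
```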
If you want to display the row at row=159220:
row = 159220
# To display in a table format (a one-row DataFrame), with or without display()
df.iloc[row:row+1]
df.loc[row:row]  # Not recommended: loc is label-based, so this matches the iloc result only when index labels equal positions
# To display in print format (a Series), with or without display()
df.iloc[row]
df.loc[row]  # Not recommended, for the same reason
You can also index the index and use the result to select row(s) using loc:
row = 159220 # with an integer, the loc call below returns a pandas Series
row = [159220] # with a list, the loc call below returns a pandas DataFrame
df.loc[df.index[row]]
This is especially useful if you want to select rows by integer-location and columns by name. For example:
rows = 159220
cols = ['col2', 'col6']
df.loc[df.index[rows], cols] # <--- OK
df.iloc[rows, cols] # <--- doesn't work
df[cols].iloc[rows] # <--- OK but creates an intermediate copy
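For a runnable version of the idea, here is a sketch on a tiny invented frame (the string index and the values are assumptions; only the names col2 and col6 come from the answer). It also shows one alternative, translating the column names to positions with columns.get_indexer so that iloc can be used throughout:

```python
import pandas as pd

# Hypothetical frame with a non-integer index to make the distinction visible.
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [4, 5, 6], "col6": [7, 8, 9]},
    index=["x", "y", "z"],
)

rows = 1                      # integer position of the row
cols = ["col2", "col6"]       # column names

# Position -> label via the index, then label-based selection.
via_index = df.loc[df.index[rows], cols]

# Alternative: names -> positions, then position-based selection.
col_positions = df.columns.get_indexer(cols)
via_positions = df.iloc[rows, col_positions]

print(via_index)
print(via_index.equals(via_positions))
```

Both routes return the same Series for the second row; which one reads better is mostly a matter of taste.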
Printing a specific row of a pandas DataFrame, with example code and output:
Code:
import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000]
}
df = pd.DataFrame(data)
row_i_want_to_print = 3
# Rows are counted from 1 here, but iloc positions start at 0,
# so the 3rd row sits at position 2; hence the -1.
specific_row = df.iloc[row_i_want_to_print - 1]
print("Selected Row: ")
print(specific_row)
Output:
Selected Row:
Name      Charlie
Age            22
Salary      45000
Name: 2, dtype: object