How to get a value from a Pandas DataFrame and not the index and object type
Question:
Say I have the following DataFrame
Letters Numbers
A 1
B 2
C 3
D 4
Which can be obtained through the following code
import pandas as pd
letters = pd.Series(('A', 'B', 'C', 'D'))
numbers = pd.Series((1, 2, 3, 4))
keys = ('Letters', 'Numbers')
df = pd.concat((letters, numbers), axis=1, keys=keys)
Now I want to get the value C from the column Letters.
The command
df[df.Letters=='C'].Letters
will return
2 C
Name: Letters, dtype: object
How can I get only the value C and not the whole two line output?
Answers:
df[df.Letters=='C'].Letters.item()
This returns the single element of the Series produced by that selection as a plain Python value. It only works when the selection contains exactly one element; with zero or several matches, .item() raises a ValueError.
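As a rough sketch of how .item() behaves, the snippet below rebuilds the question's frame with a plain DataFrame constructor (an assumption made for brevity) and tries one selection that matches a single row and one that matches nothing. The name empty_raised is only illustrative.

```python
import pandas as pd

# Same data as in the question, built directly for brevity.
df = pd.DataFrame({"Letters": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})

# Exactly one row matches, so .item() returns the bare value.
single = df.loc[df["Letters"] == "C", "Letters"].item()
print(single)

# No row matches: the selection is empty and .item() raises ValueError.
try:
    df.loc[df["Letters"] == "Z", "Letters"].item()
    empty_raised = False
except ValueError:
    empty_raised = True
print(empty_raised)
```

If several rows could match, picking one explicitly (for example with .iloc[0]) is probably safer than relying on .item().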
EDIT:
Or you can select with loc and take the first element of the result. This is shorter and is how I have done it in the past.
Use the values attribute to return the values as a NumPy array, then use [0] to get the first value:
In [4]:
df.loc[df.Letters=='C','Letters'].values[0]
Out[4]:
'C'
EDIT
I personally prefer to access the columns using subscript operators:
df.loc[df['Letters'] == 'C', 'Letters'].values[0]
This avoids issues with column names that contain spaces or dashes (-), for which attribute access with . does not work.
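To illustrate the point about column names, here is a small sketch using a made-up frame (df2 and the column name "Letter code" are assumptions, not part of the question). Subscript access works for any name, while attribute access needs the name to be a valid Python identifier.

```python
import pandas as pd

# Hypothetical frame whose first column name contains a space.
df2 = pd.DataFrame({"Letter code": ["A", "B", "C"], "Numbers": [1, 2, 3]})

# Subscript access works regardless of what the column is called.
value = df2.loc[df2["Letter code"] == "C", "Letter code"].values[0]
print(value)

# Attribute access (df2.some_name) is only possible for valid identifiers.
print("Letter code".isidentifier())
print("Numbers".isidentifier())
```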
import pandas as pd
dataset = pd.read_csv("data.csv")
# Convert the column to a plain Python list.
values = dataset["column name"].tolist()
>>> values[0]
'item_0'
edit:
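Actually, you can just index the column directly. Note that [0] looks up the index label 0, which is the first row only when the DataFrame has the default integer index (as it does straight after read_csv); use .iloc[0] if you want the first row by position.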
import pandas as pd
dataset = pd.read_csv("data.csv")
first_value = dataset["column name"][0]
>>> print(first_value)
item_0
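One caveat worth sketching: on a Series with an integer index, [0] is a label lookup, not a positional one. The Series below is invented for illustration (its index deliberately does not start at 0), and label_lookup_failed is just a helper name.

```python
import pandas as pd

# Hypothetical column whose index labels are 10, 11, 12 rather than 0, 1, 2.
s = pd.Series(["item_0", "item_1", "item_2"], index=[10, 11, 12], name="column name")

# Positional access: always the first row, whatever the labels are.
first_by_position = s.iloc[0]
print(first_by_position)

# Label access: the first row here carries the label 10.
first_by_label = s.loc[10]
print(first_by_label)

# s[0] asks for the label 0, which does not exist in this index.
try:
    s[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True
print(label_lookup_failed)
```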
You can use loc with the index and column labels.
df.loc[2, 'Letters']
# 'C'
If you prefer the "Numbers" column as reference, you can set it as index.
df.set_index('Numbers').loc[3, 'Letters']
I find this cleaner as it does not need the [0] or .item().
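For completeness, a short sketch of the label-based lookups on the question's own DataFrame. The .at accessor is not mentioned in the answers above; it is pandas' accessor for fetching a single value by row and column label and is shown here only as an alternative.

```python
import pandas as pd

# DataFrame from the question.
letters = pd.Series(("A", "B", "C", "D"))
numbers = pd.Series((1, 2, 3, 4))
df = pd.concat((letters, numbers), axis=1, keys=("Letters", "Numbers"))

# Row label 2, column label "Letters".
by_loc = df.loc[2, "Letters"]

# .at does the same lookup but only ever returns a single value.
by_at = df.at[2, "Letters"]

# Using the "Numbers" column as the index instead.
by_number = df.set_index("Numbers").at[3, "Letters"]

print(by_loc, by_at, by_number)
```

This assumes the index labels are unique; with duplicate labels, loc returns a Series rather than a single value.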
I think a good option is to turn your single-row DataFrame into a Series first, then index that:
df[df.Letters=='C'].squeeze()['Letters']
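A minimal sketch of what squeeze() does in a few cases, again rebuilding the question's frame with a plain constructor (an assumption for brevity). The variable names row, scalar and many are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Letters": ["A", "B", "C", "D"], "Numbers": [1, 2, 3, 4]})

# A one-row DataFrame squeezes to a Series indexed by the column names.
row = df[df["Letters"] == "C"].squeeze()
print(row["Letters"])

# A one-element Series squeezes all the way down to the bare value.
scalar = df.loc[df["Letters"] == "C", "Letters"].squeeze()
print(scalar)

# With more than one match there is nothing to squeeze, so a Series remains.
many = df.loc[df["Numbers"] > 2, "Letters"].squeeze()
print(type(many).__name__)
```

So squeeze() only yields a single value when the selection really has one element, which is worth checking if the filter might match several rows.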