Values turn to NaN when indexing the keys of a dictionary in Pandas

Question:

I am trying to become a self-taught data analyst.
In Pandas, when I pass different names as the index in the second part of the code, the values change from 450 to NaN and from 500 to NaN, and 380 becomes 380.0 (a float).
Also, the dtype changes from int64 to float64.
Any ideas why this happens?
Also, if I copy an example from w3schools, it is displayed fine.

import numpy as np
import pandas as pd


calories= {"Day 1": 450, "Day 2": 500, "day 3": 380}
new_series= pd.Series(calories)
print(new_series)

# Second part of code
new_series_1= pd.Series(calories, index=["day 1", "day 2", "day 3"])
print(new_series_1)
Asked By: festus

Answers:

I tried out your code. It is a simple fix. String comparison in Python is case sensitive, so "day 1" and "Day 1" are different dictionary keys. You just need to revise your statement so that the index labels match the keys exactly.

Change from:

new_series_1= pd.Series(calories, index=["day 1", "day 2", "day 3"])

to:

new_series_1= pd.Series(calories, index=["Day 1", "Day 2", "day 3"])

Note the capital letters.

Once the index labels matched the dictionary keys, both Series printed the same output:

Day 1    450
Day 2    500
day 3    380
dtype: int64
Day 1    450
Day 2    500
day 3    380
dtype: int64
Answered By: NoDakker

Summary

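In new_series_1, two of the index values don’t match any key in calories. Pandas builds the Series from the dict and then reindexes it with the given index, so the unmatched labels become NaN and, because int64 cannot hold NaN, the dtype becomes float64.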

Explanation

First you initialize new_series with calories, which is a dict with int values:

calories= {"Day 1": 450, "Day 2": 500, "day 3": 380}
new_series= pd.Series(calories)

All the values are integers, so Pandas infers int64 as the dtype.

Then you pass an index in which two labels, day 1 and day 2, are not capitalized, unlike the corresponding keys:

new_series_1= pd.Series(calories, index=["day 1", "day 2", "day 3"])

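The labels day 1 and day 2 no longer correspond to any key in calories, so Pandas fills those positions with NaN. NaN is a floating-point value that int64 cannot represent, so the whole Series is upcast to float64, which is why 380 is shown as 380.0.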
In fact, an example in the docs shows that:

Constructing Series from a dictionary with an Index specified

d = {'a': 1, 'b': 2, 'c': 3}
ser = pd.Series(data=d, index=['a', 'b', 'c'])
ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

d = {'a': 1, 'b': 2, 'c': 3}
ser = pd.Series(data=d, index=['x', 'y', 'z'])
ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary.
After this the Series is reindexed with the given Index values, hence
we get all NaN as a result.
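
The reindexing described in that note can be reproduced by hand. The following is only a sketch on the data from the question (the variable names labels, direct and two_step are made up for illustration); it compares passing index= to the constructor with building the Series first and calling reindex afterwards:

```python
import pandas as pd

calories = {"Day 1": 450, "Day 2": 500, "day 3": 380}
labels = ["day 1", "day 2", "day 3"]

# One step: dict plus explicit index.
direct = pd.Series(calories, index=labels)

# Two steps: build from the dict keys, then reindex with the new labels.
two_step = pd.Series(calories).reindex(labels)

# Series.equals treats NaN values in the same position as equal.
print(direct.equals(two_step))
print(direct)
```

Only day 3 exists as a key, so it is the only label that keeps its value.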

And here it explains how the dtype is chosen when you do not pass one:

If dtype is None, we find the dtype that best fits the data. If an
actual dtype is provided, we coerce to that dtype if it’s safe.
Otherwise, an error will be raised.
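
As a rough illustration of the dtype change, the sketch below shows the float64 upcast and one possible way to keep integers, assuming a pandas version that has the nullable "Int64" extension dtype. The variable names are made up for illustration:

```python
import pandas as pd

calories = {"Day 1": 450, "Day 2": 500, "day 3": 380}
labels = ["day 1", "day 2", "day 3"]

# Default behaviour: NaN needs a float dtype, so int64 is upcast.
as_float = pd.Series(calories).reindex(labels)
print(as_float.dtype)

# Nullable integers: missing entries are shown as <NA> and 380 stays an integer.
nullable = pd.Series(calories).astype("Int64").reindex(labels)
print(nullable)
```

This does not fix the mismatched labels; it only changes how the missing values are stored.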

Answered By: Jonathan Ciapetti