Elegant way to create empty pandas DataFrame with NaN of type float
Question:
I want to create a Pandas DataFrame filled with NaNs. During my research I found an answer:
import pandas as pd
df = pd.DataFrame(index=range(0,4),columns=['A'])
This code results in a DataFrame filled with NaNs of type “object”, so they cannot be used later on, for example with the interpolate() method. Therefore, I created the DataFrame with this more complicated code (inspired by this answer):
import pandas as pd
import numpy as np
dummyarray = np.empty((4,1))
dummyarray[:] = np.nan
df = pd.DataFrame(dummyarray)
This results in a DataFrame filled with NaN of type “float”, so it can be used later on with interpolate(). Is there a more elegant way to create the same result?
Answers:
You could specify the dtype directly when constructing the DataFrame:
>>> df = pd.DataFrame(index=range(0,4),columns=['A'], dtype='float')
>>> df.dtypes
A float64
dtype: object
Specifying the dtype forces Pandas to try creating the DataFrame with that type, rather than trying to infer it.
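As a minimal sketch of why the float dtype matters here, the snippet below builds the frame with dtype='float', sets two end points, and lets interpolate() fill the gap. The values 1.0 and 4.0 and the name filled are chosen only for illustration and are not part of the original answer.

```python
import pandas as pd

# Build an all-NaN frame with an explicit float dtype.
df = pd.DataFrame(index=range(0, 4), columns=['A'], dtype='float')

# Set the two end points (illustrative values only).
df.loc[0, 'A'] = 1.0
df.loc[3, 'A'] = 4.0

# Default linear interpolation fills the NaN gaps between the known values.
filled = df['A'].interpolate()
print(filled.tolist())
```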
Simply pass the desired value as the first argument, such as 0, math.inf or, here, np.nan. The constructor then initializes and fills the value array to the size specified by the index and columns arguments:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])
>>> df
    A   B
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
>>> df.dtypes
A float64
B float64
dtype: object
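If this pattern is needed in several places, it could be wrapped in a small helper. The function name nan_frame and its parameters are hypothetical, introduced only for this sketch; it relies solely on the scalar-fill behaviour of the constructor shown above.

```python
import numpy as np
import pandas as pd


def nan_frame(n_rows, columns):
    """Return a float64 DataFrame with n_rows rows and the given columns, filled with NaN."""
    # Passing the scalar np.nan makes the constructor broadcast it
    # to the shape implied by index and columns.
    return pd.DataFrame(np.nan, index=range(n_rows), columns=columns)


df = nan_frame(4, ['A', 'B'])
print(df.dtypes)
```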
Hope this can help!
pd.DataFrame(np.nan, index = np.arange(<num_rows>), columns = ['A'])
You can try this line of code:
pdDataFrame = pd.DataFrame([np.nan] * 7)
This creates a pandas DataFrame with 7 rows of NaN of type float. Printing pdDataFrame gives:
    0
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
The output of pdDataFrame.dtypes is:
0 float64
dtype: object
For multiple columns you can do:
df = pd.DataFrame(np.zeros([nrow, ncol])*np.nan)
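A possible alternative to multiplying a zero array by NaN is np.full, which allocates and fills the array in one step. This is a sketch rather than part of the original answer, and the sizes assigned to nrow and ncol are arbitrary example values.

```python
import numpy as np
import pandas as pd

nrow, ncol = 3, 2  # example sizes, chosen only for illustration

# np.full creates an (nrow, ncol) float array already filled with NaN.
df = pd.DataFrame(np.full((nrow, ncol), np.nan))
print(df.dtypes)
```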