Pandas: Get a List of All DataFrames Loaded into Memory
Question:
I am using pandas to read several CSV files into memory for processing, and at some point I would like to list all the DataFrames I have loaded into memory. Is there a simple way to do that? (I am thinking of something like %ls, but only for the DataFrames that I have available in memory.)
Answers:
You could list all dataframes with the following:
import pandas as pd
# create dummy dataframes
df1 = pd.DataFrame({'Col1' : list(range(100))})
df2 = pd.DataFrame({'Col1' : list(range(100))})
# Find every variable in the current scope that is a pandas DataFrame.
# dir() returns the names of those variables as strings, so each name is
# evaluated and the resulting object is tested with isinstance().
alldfs = [var for var in dir() if isinstance(eval(var), pd.DataFrame)]
print(alldfs)  # ['df1', 'df2'] when run as a plain script
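The same lookup can be done without eval() by iterating over a namespace dictionary directly. The following is only a minimal sketch: the helper name dataframe_names is made up for illustration, and it assumes you pass it globals() from the top level of a script or notebook.

```python
import pandas as pd

def dataframe_names(namespace):
    """Return the sorted names in `namespace` that are bound to a DataFrame.

    `dataframe_names` is a hypothetical helper, not part of pandas.
    """
    return sorted(name for name, value in namespace.items()
                  if isinstance(value, pd.DataFrame))

# create dummy dataframes, as in the answer above
df1 = pd.DataFrame({'Col1': list(range(100))})
df2 = pd.DataFrame({'Col1': list(range(100))})

# globals() is the dictionary of module-level variables
print(dataframe_names(globals()))
```

Working on the dictionary avoids evaluating strings as code, and the same helper can be pointed at any other dict of variables.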
I personally think this approach is much better if you are working in IPython:
import pandas as pd
%whos DataFrame
Building on the previous answers …
This returns a list:
import pandas as pd
%who_ls DataFrame
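However, the magic commands above only exist in IPython, so they do not work when you run a plain Python script. In that case, use this instead:
import pandas as pd
sheets = []
for var in dir():
    # keep only DataFrames whose name does not start with an underscore
    if isinstance(locals()[var], pd.DataFrame) and var[0] != '_':
        sheets.append(var)
The check for a leading '_' is there because some underscore-prefixed names hold DataFrames for internal use only. In IPython, for example, output-history variables such as _ and _1 keep references to earlier results.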
If you want all the DataFrames in a list you can iterate over (for example, to concatenate them), and their number will grow or their names will change, this is the way:
# Collect the names of all the DataFrames
alldfs = [var for var in dir() if isinstance(eval(var), pd.DataFrame)]
# Create an iterable list of the DataFrame objects themselves
list_of_dfs = []
for df in alldfs:
    list_of_dfs.append(locals()[df])
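As a rough sketch of the concatenation step mentioned above, the DataFrames can be collected into a dictionary keyed by name and then passed to pd.concat. The helper name collect_dataframes is hypothetical, and the sketch assumes all the frames share the same columns.

```python
import pandas as pd

def collect_dataframes(namespace):
    """Map each public name in `namespace` to the DataFrame it is bound to.

    `collect_dataframes` is a hypothetical helper, not part of pandas.
    Names starting with '_' are skipped.
    """
    return {name: value for name, value in namespace.items()
            if isinstance(value, pd.DataFrame) and not name.startswith('_')}

df1 = pd.DataFrame({'Col1': [1, 2]})
df2 = pd.DataFrame({'Col1': [3, 4]})

# At the top level of a script or notebook, globals() is the namespace to scan.
frames = collect_dataframes(globals())
if frames:  # pd.concat raises ValueError when given no objects
    combined = pd.concat(list(frames.values()), ignore_index=True)
    print(combined.shape)
```

Keeping the names as dictionary keys means the list does not have to be rebuilt by hand when DataFrames are added or renamed; rerunning the helper picks up the current set.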
If you have multiple DataFrames and there are some you do not want to concatenate or otherwise process, you can put them in a small DataFrame, filter them, and choose the desired ones.
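One possible reading of that suggestion is to build a small summary DataFrame with one row per loaded DataFrame and filter on it. This is only a sketch under that assumption: the helper summarize_dataframes, the names 'sales' and 'lookup', and the 10-row threshold are all invented for illustration.

```python
import pandas as pd

def summarize_dataframes(frames):
    """Build one row per DataFrame: its name, row count and column count.

    `frames` is a dict mapping names to DataFrames; the helper is hypothetical.
    """
    return pd.DataFrame(
        [{'name': name, 'rows': df.shape[0], 'cols': df.shape[1]}
         for name, df in frames.items()],
        columns=['name', 'rows', 'cols'],
    )

# hypothetical example data
frames = {
    'sales': pd.DataFrame({'Col1': list(range(100))}),
    'lookup': pd.DataFrame({'Col1': list(range(3))}),
}

summary = summarize_dataframes(frames)

# keep only the DataFrames with at least 10 rows (arbitrary example filter)
wanted = summary.loc[summary['rows'] >= 10, 'name'].tolist()
selected = [frames[name] for name in wanted]
print(wanted)
```

Any other column could be added to the summary (for example a memory estimate or a flag you set by hand) and used as the filter instead.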