Truncate number of rows in a pandas dataframe
Question:
Is there a method to limit the number of rows in a pandas dataframe, or is this best done by indexing, for example:
LIMIT = 1000
df = df[:LIMIT]
The reason I ask this is I may have million-row dataframes and I’d like to make sure this call is as efficient as possible, because I will be calling it quite a bit.
Answers:
If you only want to limit the number of rows displayed (rather than truncate the data itself), the following setting is useful:
limit = 1000
pd.options.display.max_rows = limit
Or, equivalently, you could use set_option:
limit = 1000
pd.set_option("display.max_rows", limit)
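To make the difference between display limits and real truncation concrete, here is a minimal sketch. The frame contents and the variable names (`text`, `n_rows`, `truncated`) are made up for illustration; `pd.option_context` is used so the display setting only applies inside the `with` block.

```python
import pandas as pd

# Illustrative data only: a single-column frame with 5000 rows.
df = pd.DataFrame({"a": range(5000)})

# Temporarily cap the number of rows shown when the frame is printed.
with pd.option_context("display.max_rows", 10):
    text = repr(df)  # abbreviated printout with an elided middle section

# The display option does not change the data: the frame keeps all its rows.
n_rows = len(df)

# Slicing is what actually produces a shorter frame.
truncated = df[:1000]
```

So if the goal is a smaller DataFrame rather than a shorter printout, the display options are not the right tool.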
There are various options available, but you need to be specific about what you need.
I personally use these settings:
##### widen output display to see more columns and rows in `pandas` ####
# 'display.height' is no longer a valid option in current pandas; 'display.max_rows' controls the row count.
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 100)
pd.set_option('display.expand_frame_repr', True)
(The value 100 above is just an example.)
Extracting a subset of a pandas DataFrame:
In general, this is how to select a portion of a DataFrame by label (with .loc, both endpoints of each slice are included):
df.loc[start_row:end_row, start_column:end_column]
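As a small sketch of how that pattern behaves, the example below compares label-based `.loc` with position-based `.iloc`. The frame, its index labels, and the variable names are invented for illustration.

```python
import pandas as pd

# Illustrative frame with string row labels and three columns.
df = pd.DataFrame(
    {"x": [10, 20, 30, 40], "y": [1, 2, 3, 4], "z": [5, 6, 7, 8]},
    index=["a", "b", "c", "d"],
)

# Label-based slicing: both endpoints are included.
by_label = df.loc["a":"c", "x":"y"]   # rows a, b, c; columns x, y

# Position-based slicing: the end position is excluded, as with Python lists.
by_position = df.iloc[0:3, 0:2]       # rows 0, 1, 2; columns 0, 1
```

Both selections contain the same three rows and two columns; which form is more convenient depends on whether you think in labels or in positions.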
Selecting the first n rows from a DataFrame (here n = 1000):
df[:1000]
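For the original question, a rough sketch of the usual ways to keep only the first `LIMIT` rows follows. The million-row frame and the variable names are placeholders. My understanding is that all three forms are cheap because they slice rather than copy row by row, but it is worth timing them on your own data before relying on that.

```python
import pandas as pd

LIMIT = 1000

# Placeholder data standing in for a large frame.
df = pd.DataFrame({"value": range(1_000_000)})

first_a = df[:LIMIT]        # plain slice, as in the question
first_b = df.iloc[:LIMIT]   # explicit positional slice
first_c = df.head(LIMIT)    # dedicated method for "first n rows"

# If the truncated frame will be modified independently of the original,
# taking an explicit copy avoids surprises.
trimmed = df.head(LIMIT).copy()

# A frame shorter than LIMIT is returned unchanged in length, without error.
short = pd.DataFrame({"value": [1, 2, 3]}).head(LIMIT)
```

`head` and `iloc` make the intent explicit and always work by position, which can matter when the index is not the default 0..n-1 range.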