How to limit rows in pandas dataframe?

Question:

How can I limit the number of rows in a pandas DataFrame in Python? I need to keep the last 1000 rows and delete the rest.
For example: 1000 rows in the pandas DataFrame -> 1000 rows in the CSV.

I tried df.iloc[:1000]

I need to clean up the DataFrame automatically so that only the last 1000 rows are kept and saved.

Asked By: pozhilou

Answers:

If you want the first 1000 records, you can use:

df = df.head(1000)
Answered By: CezarySzulc

Are you trying to limit the number of rows when importing a csv, or when exporting a dataframe to a new csv file?

Importing first 1000 rows of csv:

df_limited = pd.read_csv(file, nrows=1000)

Get first 1000 rows of a dataframe (for export):

df_limited = df.head(1000)

Get last 1000 rows of a dataframe (for export):

df_limited = df.tail(1000)
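
The three options above can be compared side by side in a minimal sketch. The in-memory CSV text and the column name "value" are made up for illustration; in real code you would pass a file path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical CSV content with 5000 data rows, held in memory for the demo
csv_text = "value\n" + "\n".join(str(i) for i in range(5000))

# Option 1: read only the first 1000 data rows while importing
df_first_on_import = pd.read_csv(io.StringIO(csv_text), nrows=1000)

# Options 2 and 3: read everything, then keep the first or last 1000 rows
df = pd.read_csv(io.StringIO(csv_text))
df_first = df.head(1000)
df_last = df.tail(1000)

print(len(df_first_on_import), len(df_first), len(df_last))
```

Note that nrows only helps for the first rows of a file; to get the last 1000 rows this sketch still loads the whole file and then calls tail.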

Edit 1
As you are exporting a csv:
You can select a range with [n:m], where n is the start position and m is the end position. Positions are counted from 0, and the end position itself is not included.
It works like this:
If the number is positive, it counts from the front (top of the list, beginning of the string, top of the dataframe, etc.).
If the number is negative, it counts from the back.

  • [5:] skips the first 5 elements and selects everything from position 5 to the end (as there is
    no end point given)
  • [3:8] selects positions 3 through 7 (5 elements; position 8 is excluded)
  • [5:-2] selects everything from position 5 up to, but not including, the 2nd to last
    (the 2nd from the back)
  • [-1000:] the start point is 1000 elements from the back, the end
    point is the last element (this is what you wanted, I think)
  • [:1000] selects the first 1000 rows (the start point is the beginning, as no number is given; the end point is 1000 elements from the front)
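
The slices listed above can be tried on a small list and on a dataframe. This is only a sketch; the list contents and the column name "value" are invented for the demonstration.

```python
import pandas as pd

items = list(range(10))  # [0, 1, 2, ..., 9]

from_five = items[5:]          # positions 5 to the end
three_to_eight = items[3:8]    # positions 3 to 7, position 8 is excluded
five_to_minus_two = items[5:-2]  # position 5 up to, not including, the 2nd to last

# The same idea applied to the rows of a dataframe via iloc
df = pd.DataFrame({"value": range(5000)})
last_1000 = df.iloc[-1000:]   # the last 1000 rows
first_1000 = df.iloc[:1000]   # the first 1000 rows

print(from_five, three_to_eight, five_to_minus_two)
print(len(last_1000), len(first_1000))
```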

Edit 2
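After a quick check (and a very simple benchmark), df.tail(1000) looked faster than df.iloc[-1000:] on my machine. Both return the same rows, so treat this as an informal observation that may not hold for other data or pandas versions.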
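
If you want to repeat such a check yourself, one possible way is the timeit module. This is a rough sketch, not the benchmark used above; the dataframe size and the repeat count are arbitrary assumptions, and the timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical test data: one million rows in a single column
df = pd.DataFrame({"value": range(1_000_000)})

# Both approaches should return exactly the same rows
same_rows = df.tail(1000).equals(df.iloc[-1000:])

# Time 1000 repetitions of each call
tail_seconds = timeit.timeit(lambda: df.tail(1000), number=1000)
iloc_seconds = timeit.timeit(lambda: df.iloc[-1000:], number=1000)

print(f"same rows: {same_rows}")
print(f"tail: {tail_seconds:.4f}s  iloc: {iloc_seconds:.4f}s")
```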

With df.iloc[:1000] you get the first 1000 rows.

Since you want the last 1000 rows, change this line to df_last_1000 = df.iloc[-1000:]

To save it as a csv file you can use the pandas to_csv() method: df_last_1000.to_csv("last_1000.csv")
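
Put together, trimming the dataframe and saving it could look like the sketch below. The helper name keep_last_rows, the sample data, and the temporary output folder are assumptions made for this example; resetting the index and passing index=False are optional choices, not something the question requires.

```python
import os
import tempfile

import pandas as pd


def keep_last_rows(df, n=1000):
    """Return only the last n rows, renumbering the index from 0."""
    return df.iloc[-n:].reset_index(drop=True)


# Hypothetical dataframe that has grown to 2500 rows
df = pd.DataFrame({"value": range(2500)})

# Drop everything except the last 1000 rows
df = keep_last_rows(df, 1000)

# Write to a temporary folder here; use your own path in real code
out_path = os.path.join(tempfile.mkdtemp(), "last_1000.csv")
df.to_csv(out_path, index=False)  # index=False leaves the row index out of the file

# Read the file back to confirm it holds 1000 rows
reloaded = pd.read_csv(out_path)
print(len(reloaded))
```

Calling keep_last_rows each time new rows are appended keeps the dataframe at no more than 1000 rows before it is written out.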

Answered By: Jan