Keep x rows and delete all others from a CSV file

Question

I want to be able to specify how many rows I want to keep and delete the rest, also preserving the header.

I found some code which lets you delete the first 5 rows, but how can I make it do what I want?

with open ('myfile.csv', 'wb') as outfile:
    outfile.writelines(data_in[1])
    outfile.writelines(data_in[5:])

For example if I have this CSV

6.5, 5.4, 0, 000
6.5, 5.4, 1, 610
1.2, 4.0, 0, 530
3.2, 5.4, 1, 330
4.2, 3.0, 0, 320
5.5, 2.3, 1, 780
1.3, 4.4, 0, 520
5.3, 1.0, 0, 420

I just want to pass a number to my script, say 2, and have it keep 2 rows and remove all the others.

output would become:

6.5, 5.4, 0, 000
6.5, 5.4, 1, 610

Can I also save the result under a different name?

Asked By: Saffik

Answer 1

First read the original CSV file into the variable data_in:

with open('my_original_file.csv') as inp:
    data_in = inp.readlines()

Then ask for the number of rows and write the header plus that many rows to a new file:

n = int(input("How many rows after the header do you want to write? "))

with open('myfile.csv', 'w') as outfile:
    outfile.writelines(data_in[:n+1])

This will write

the header row — data_in[0], and
subsequent n rows — data_in[1] to data_in[n]
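
The same idea can be wrapped in a small function that also covers the "save it with a different name" part of the question. This is only a sketch: the function name keep_first_rows and the has_header flag are made up for illustration, and it assumes every CSV row sits on a single line (no quoted fields containing line breaks).

```python
from itertools import islice


def keep_first_rows(src_path, dst_path, n, has_header=True):
    """Copy the header (if any) plus the first n data rows to dst_path."""
    lines_to_keep = n + 1 if has_header else n
    # newline="" passes line endings through unchanged in both files.
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        # islice stops after lines_to_keep lines, so the rest of the
        # source file is never read into memory.
        dst.writelines(islice(src, lines_to_keep))
```

For example, keep_first_rows('my_original_file.csv', 'myfile.csv', 2) would leave the original untouched and write the header plus two rows to myfile.csv. For the sample data in the question, which has no header, pass has_header=False.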

Answered By: MarianD

Answer 2

With pandas this is very easy to do using head:

import pandas as pd

# read the CSV file (remove header=None if the file has column names)
df = pd.read_csv('myfile.csv', header=None)

# select only the first 2 rows
df = df.head(2)

# save the CSV file (remove header=False if the file has column names)
df.to_csv('output.csv', index=False, header=False)

Or simply:

df = pd.read_csv('myfile.csv',header=None)
df.head(2).to_csv('output.csv',index=False, header=False)

Output:

6.5,5.4,0,0
6.5,5.4,1,610
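
If the file does have a header row, one possible variant is to let read_csv stop early with its nrows parameter, so only the needed rows are parsed. This is a sketch under that assumption; the function name keep_first_rows_pandas is hypothetical. As in the output above, pandas parses and rewrites the values, so a field like 000 would come back as 0.

```python
import pandas as pd


def keep_first_rows_pandas(src_path, dst_path, n):
    """Write the header plus the first n data rows of src_path to dst_path."""
    # nrows counts data rows only; the first line is used as column names.
    df = pd.read_csv(src_path, nrows=n)
    # index=False avoids writing the row index as an extra column.
    df.to_csv(dst_path, index=False)
```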

Answered By: Grayrigel

Answer 3

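To keep the first n lines (the header, if there is one, counts as one of them) and remove everything else, truncate the file in place:

with open(filename, 'r+') as f:
    for _ in range(n):
        f.readline()  # move past each line that should be kept
    f.truncate(f.tell())  # cut the file off at the current position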
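
Since the question asks to preserve the header, here is a rough variant of the same in-place approach that keeps the header line in addition to n data rows. The function name truncate_after_rows and the has_header flag are made up for this sketch. Note that it overwrites the original file, so work on a copy if you still need the full data.

```python
def truncate_after_rows(path, n, has_header=True):
    """Shrink the file in place to its header (if any) plus n data rows."""
    lines_to_keep = n + 1 if has_header else n
    # Binary mode keeps readline() and tell() working in plain byte offsets.
    with open(path, "r+b") as f:
        for _ in range(lines_to_keep):
            if not f.readline():  # fewer lines than requested: stop early
                break
        # Drop everything after the current position.
        f.truncate(f.tell())
```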

Answered By: user3503711
