Performance effect of using print statements in Python script

Question:

I have a Python script that processes a huge text file (around 4 million lines) and writes the data into two separate files.

For debugging, I have added a print statement that outputs a string for every line. How bad could this be from a performance perspective?

If it is going to be very bad, I can remove the debugging line.

Edit

It turns out that a print statement for every line in a file with 4 million lines increases the running time far too much.
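
One way to keep some visibility without paying for a print per line is to report progress only every N lines. The following is a minimal sketch, not the original script: `process_lines`, `report_every`, and `log` are hypothetical names, and the placeholder line stands in for the real per-line work.

```python
import sys


def process_lines(lines, report_every=100_000, log=sys.stderr):
    """Process lines, reporting progress only every `report_every` lines.

    Returns the number of lines handled. The real per-line work
    (parsing, writing to the two output files) would replace the
    placeholder below.
    """
    count = 0
    for count, line in enumerate(lines, start=1):
        _ = line.rstrip("\n")  # placeholder for the real processing
        if count % report_every == 0:
            print(f"processed {count} lines", file=log)
    return count
```

With `report_every=100_000`, a 4-million-line file produces 40 progress messages instead of 4 million.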

Asked By: Sudar

Answers:

I tried it in a very simple script just for fun, and the difference is quite staggering:

In large.py:

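# Python 2 script (xrange and the print statement)
target = open('target.txt', 'w')

for item in xrange(4000000):
    target.write(str(item) + '\n')  # one number per line
    print item  # the debugging output being measured

target.close()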

Timing it:

[gp@imdev1 /tmp]$ time python large.py
real    1m51.690s
user    0m10.531s
sys     0m6.129s

[gp@imdev1 /tmp]$ ls -lah target.txt
-rw-rw-r--. 1 gp gp 30M Nov  8 16:06 target.txt

Now running the same with “print” commented out:

[gp@imdev1 /tmp]$ time python large.py
real    0m2.584s
user    0m2.536s
sys     0m0.040s
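
In the first run, `real` (1m51s) is far larger than `user` plus `sys` (about 17s), which suggests that most of the wall-clock time was spent waiting on the output destination rather than in Python itself. A rough way to check this on your own machine is to time the same loop against different output streams. This is a Python 3 sketch; `time_prints` and `compare` are illustrative names that are not part of `large.py`, and the numbers will vary by terminal and system.

```python
import os
import sys
import time


def time_prints(n, stream):
    """Print n integers to `stream` and return the elapsed seconds."""
    start = time.perf_counter()
    for item in range(n):
        print(item, file=stream)
    stream.flush()
    return time.perf_counter() - start


def compare(n=200_000):
    """Time n prints to the null device and to stdout; return both timings."""
    with open(os.devnull, "w") as sink:
        discarded = time_prints(n, sink)
    shown = time_prints(n, sys.stdout)
    return discarded, shown
```

Calling `compare()` in a terminal, and again with the script's output redirected to a file, should show how much the destination matters.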

Answered By: GSP

Yes, it affects performance.
I wrote a small program to demonstrate:

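import time

start_time = time.time()
for i in range(100):
    for j in range(100):
        for k in range(100):
            print(i, j, k)  # 100 * 100 * 100 = 1,000,000 calls in total
# Elapsed wall-clock time in seconds
print(time.time() - start_time)
input()  # wait for Enter before exiting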

The measured time was 160.2812204496765 seconds.
Then I replaced the print call with pass. The results were shocking: the measured time without print was 0.26517701148986816 seconds.
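
If the output is actually needed, one option is to cut the number of write calls by collecting lines and writing them in batches. This is a sketch only: `write_batched` and `batch_size` are hypothetical names, no timing is claimed for it, and how much it helps depends on where the output goes.

```python
import sys


def write_batched(rows, stream=sys.stdout, batch_size=10_000):
    """Write one line per row, calling stream.write once per batch.

    Each row is formatted like print(*row), i.e. values separated by spaces.
    Returns the number of write calls made.
    """
    batch = []
    writes = 0
    for row in rows:
        batch.append(" ".join(map(str, row)))
        if len(batch) >= batch_size:
            stream.write("\n".join(batch) + "\n")
            writes += 1
            batch.clear()
    if batch:  # write whatever is left over
        stream.write("\n".join(batch) + "\n")
        writes += 1
    return writes
```

For the triple loop above, the rows could come from `itertools.product(range(100), repeat=3)`, which yields the same (i, j, k) tuples in the same order.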

Answered By: Kshitij Joshi