File read using "open()" vs "with open()"

Question:

I know there are plenty of articles and questions answered regarding reading files in python. But still I’m wondering what made python to have multiple ways to do the same task. Simply what I want to know is, what is the performance impact of using these two methods?

Asked By: Chamath

||

Answers:

Using with statement is not for performance gain, I do not think there are any performance gains or loss associated with using with statement, as long as, you perform the same cleanup activity that using with statement would perform automatically.

When you use with statement with open function, you do not need to close the file at the end, because with would automatically close it for you.

Also, with statement is not just for openning files, with is used in conjuction with context managers. Basically, if you have an object that you want to make sure it is cleaned once you are done with it or some kind of errors occur, you can define it as a context manager and with statement will call its __enter__() and __exit__() methods on entry to and exit from the with block. According to PEP 0343

This PEP adds a new statement "with" to the Python language to make it possible to factor out standard uses of try/finally statements.

In this PEP, context managers provide __enter__() and __exit__() methods that are invoked on entry to and exit from the body of the with statement.

Also, performance testing of using with and not using it –

In [14]: def foo():
   ....:     f = open('a.txt','r')
   ....:     for l in f:
   ....:         pass
   ....:     f.close()
   ....:

In [15]: def foo1():
   ....:     with open('a.txt','r') as f:
   ....:         for l in f:
   ....:             pass
   ....:

In [17]: %timeit foo()
The slowest run took 41.91 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 186 µs per loop

In [18]: %timeit foo1()
The slowest run took 206.14 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 179 µs per loop

In [19]: %timeit foo()
The slowest run took 202.51 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 180 µs per loop

In [20]: %timeit foo1()
10000 loops, best of 3: 193 µs per loop

In [21]: %timeit foo1()
10000 loops, best of 3: 194 µs per loop
Answered By: Anand S Kumar
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.