Size of an open file object

Question:

Is there a way to find the size of a file object that is currently open?

Specifically, I am working with the tarfile module to create tarfiles, but I don’t want my tarfile to exceed a certain size. As far as I know, tarfile objects are file-like objects, so I imagine a generic solution would work.

Asked By: strider1551


Answers:

If you have the file descriptor, you can use fstat to find out the size, if any. A more generic solution is to seek to the end of the file and read the position there.

Answered By: C. K. Young

$ ls -la chardet-1.0.1.tgz
-rwxr-xr-x 1 vinko vinko 179218 2008-10-20 17:49 chardet-1.0.1.tgz
$ python
Python 2.5.1 (r251:54863, Jul 31 2008, 22:53:39)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> f = open('chardet-1.0.1.tgz','rb')
>>> f.seek(0, os.SEEK_END)
>>> f.tell()
179218L

Adding ChrisJY’s fstat idea to the example:

>>> os.fstat(f.fileno()).st_size
179218L

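Note: Based on the comments, f.seek(0, os.SEEK_END) must be called before f.tell(). f.tell() reports the current position in the file, which is 0 for a file that has just been opened for reading, so without the seek it would return 0 rather than the size. f.seek(0, os.SEEK_END) moves the file object’s position to the end of the file, where the position equals the size.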
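
To make the note concrete, here is a minimal Python 3 sketch. The 1000-byte scratch file is made up purely for the demonstration; the variable names are not from the answer above.

```python
import os
import tempfile

# Hypothetical 1000-byte scratch file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1000)
    path = tmp.name

with open(path, "rb") as f:
    pos_at_open = f.tell()      # position right after opening for reading
    f.seek(0, os.SEEK_END)      # move to the end of the file
    pos_at_end = f.tell()       # position at the end, i.e. the size in bytes
    size_from_fstat = os.fstat(f.fileno()).st_size  # does not depend on the position

os.remove(path)

print(pos_at_open, pos_at_end, size_from_fstat)
```

Only the seek/tell pair depends on where the file position currently is; fstat asks the operating system and leaves the position alone.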

Answered By: Vinko Vrsalovic

Well, if the file object supports the tell method, you can do:

current_size = f.tell()

That will tell you where it is currently writing. If you write sequentially, this will be the size of the file.

Otherwise, you can use the file system capabilities, i.e. os.fstat as suggested by others.
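
Applied to the tarfile case from the question, this could look roughly like the sketch below. A TarFile object does not itself offer tell(), so the sketch opens the underlying file and hands it to tarfile.open through the fileobj parameter. The helper name write_capped_tar, the soft_limit parameter, and the member names and sizes are all made up for illustration.

```python
import io
import os
import tarfile
import tempfile


def write_capped_tar(archive_path, members, soft_limit):
    """Hypothetical helper: add (name, bytes) pairs until soft_limit is reached."""
    added = []
    # Open the underlying file ourselves so that we can call tell() on it.
    with open(archive_path, "wb") as raw, tarfile.open(fileobj=raw, mode="w") as tar:
        for name, data in members:
            # Writes are sequential, so tell() is the number of bytes written so far.
            if raw.tell() >= soft_limit:
                break
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
            added.append(name)
    return added


# Usage with made-up member names and sizes.
archive_path = os.path.join(tempfile.mkdtemp(), "capped.tar")
members = [("a.bin", b"x" * 2000), ("b.bin", b"y" * 2000), ("c.bin", b"z" * 2000)]
added = write_capped_tar(archive_path, members, soft_limit=3000)
final_size = os.path.getsize(archive_path)
print(added, final_size)
```

The limit here is deliberately a soft one: it is checked before each member, so the archive can overshoot by the last member added, and tarfile appends end-of-archive padding when it closes. A hard cap would need to account for both.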

Answered By: PierreBdR

Another solution, if you are doing in-memory operations, is to use StringIO.

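from StringIO import StringIO  # Python 2 module

with open(file_path, 'rb') as x:
    body = StringIO()
    body.write(x.read())  # copy the whole file into memory
    body.seek(0, 0)       # rewind so later reads start at the beginning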

Now body behaves like a file object, with methods such as body.read().

body.len gives the file size (this attribute exists only on Python 2’s StringIO.StringIO objects).
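
On Python 3 the same idea could be sketched with io.BytesIO, which has no len attribute. The payload below is a made-up stand-in for the contents read from a file.

```python
import io

# Hypothetical payload standing in for x.read() from the snippet above.
payload = b"hello, world\n" * 10

body = io.BytesIO(payload)

# Size of the underlying buffer, without moving the read/write position.
size_from_buffer = body.getbuffer().nbytes

# Alternatively, seek to the end; in Python 3, seek() returns the new position.
size_from_seek = body.seek(0, io.SEEK_END)
body.seek(0)  # rewind so later reads start at the beginning

print(size_from_buffer, size_from_seek)
```

Either way, the whole file has to fit in memory, so this only makes sense when the data is being held in memory anyway.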

Answered By: vestronge

I was curious about the performance implications of both approaches (seek/tell versus stat), since once you open a file, the name attribute of the handle gives you the filename, so you can call os.stat on it.

Here’s a function for the seek/tell method:

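import io

def seek_size(f):
    pos = f.tell()          # remember the current position
    f.seek(0, io.SEEK_END)  # jump to the end of the file
    size = f.tell()         # the position at the end is the size
    f.seek(pos)             # back to where we were
    return size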

With a 65 MiB file on an SSD under Windows 10, this is about 6.5x faster than calling os.stat(f.name).
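
A comparison along these lines could be set up with timeit, roughly as sketched below. The 1 MiB scratch file and the repetition count are arbitrary choices for illustration, not the setup used for the measurement above, and the ratio will vary with operating system, file system, and buffering.

```python
import io
import os
import tempfile
import timeit


def seek_size(f):
    pos = f.tell()
    f.seek(0, io.SEEK_END)
    size = f.tell()
    f.seek(pos)  # back to where we were
    return size


# Hypothetical 1 MiB scratch file, created only for this comparison.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (1024 * 1024))
    path = tmp.name

with open(path, "rb") as f:
    size_seek = seek_size(f)
    size_stat = os.stat(f.name).st_size
    # Time repeated calls of each approach on the same open handle.
    t_seek = timeit.timeit(lambda: seek_size(f), number=2000)
    t_stat = timeit.timeit(lambda: os.stat(f.name), number=2000)

os.remove(path)

print("seek/tell: %.4f s, os.stat: %.4f s" % (t_seek, t_stat))
```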

Answered By: darda