How to count the number of files in a directory using Python

Question:

How do I count only the files in a directory? This counts the directory itself as a file:

len(glob.glob('*'))
Asked By: prosseek


Answers:

os.listdir() will be slightly more efficient than using glob.glob. To test if a filename is an ordinary file (and not a directory or other entity), use os.path.isfile():

import os

# simple version for working with CWD
print(len([name for name in os.listdir('.') if os.path.isfile(name)]))

# path joining version for other paths
DIR = '/tmp'
print(len([name for name in os.listdir(DIR) if os.path.isfile(os.path.join(DIR, name))]))
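
The path-joining step matters because os.listdir() returns bare names, and os.path.isfile() resolves a bare name against the current working directory. A minimal sketch of the difference, using a throwaway temporary directory (the function names and the entries "a.txt" and "sub" are made up for illustration):

```python
import os
import tempfile


def count_files(directory):
    """Count regular files directly inside `directory` (not recursive)."""
    return sum(
        1
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


def count_files_without_join(directory):
    """Buggy variant: tests bare names, which resolve against the CWD."""
    return sum(1 for name in os.listdir(directory) if os.path.isfile(name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # One regular file and one subdirectory inside the temp directory.
        with open(os.path.join(tmp, "a.txt"), "w") as handle:
            handle.write("hello")
        os.mkdir(os.path.join(tmp, "sub"))

        print(count_files(tmp))               # counts "a.txt", skips "sub"
        print(count_files_without_join(tmp))  # depends on what the CWD contains
```

The second function only gives the right answer when the directory being counted happens to be the current working directory.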
Answered By: Daniel Stutzbach

This uses os.listdir and works for any directory:

import os
directory = 'mydirpath'

number_of_files = len([item for item in os.listdir(directory) if os.path.isfile(os.path.join(directory, item))])

This can be simplified with a generator expression, which avoids building the intermediate list, and made a little faster by binding the functions to local names:

import os
isfile = os.path.isfile
join = os.path.join

directory = 'mydirpath'
number_of_files = sum(1 for item in os.listdir(directory) if isfile(join(directory, item)))
Answered By: joaquin
import os

def count_em(valid_path):
    # Walk the whole tree and count the files at every level (recursive)
    x = 0
    for root, dirs, files in os.walk(valid_path):
        for f in files:
            x = x + 1
    print("There are %d files in this directory." % x)
    return x

Taken from this post

Answered By: Kristian Damian
import os

def count_files(in_directory):
    joiner= (in_directory + os.path.sep).__add__
    return sum(
        os.path.isfile(filename)
        for filename
        in map(joiner, os.listdir(in_directory))
    )

>>> count_files("/usr/lib")
1797
>>> len(os.listdir("/usr/lib"))
2049
Answered By: tzot
import os

_, _, files = next(os.walk("/usr/lib"))
file_count = len(files)
Answered By: Luke

Luke's code, condensed into one line:

import os

print(len(next(os.walk('/usr/lib'))[2]))
Answered By: okobaka
import os

def directory(path, extension):
    # Count entries in `path` whose names end with `extension`, e.g. '.txt'
    count = 0
    for name in os.listdir(path):
        if name.endswith(extension):
            count += 1
    return count
Answered By: ninjrok

This is where fnmatch comes in very handy:

import fnmatch
import os

print(len(fnmatch.filter(os.listdir(dirpath), '*.txt')))

More details: http://docs.python.org/2/library/fnmatch.html
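
fnmatch.filter() only looks at names, so a subdirectory called something like "notes.txt" would be counted too. A minimal sketch that combines the pattern match with an os.path.isfile() check (the function name count_matching_files is made up for illustration):

```python
import fnmatch
import os


def count_matching_files(directory, pattern):
    """Count regular files in `directory` whose names match `pattern`."""
    names = fnmatch.filter(os.listdir(directory), pattern)
    return sum(
        1 for name in names if os.path.isfile(os.path.join(directory, name))
    )
```

For example, count_matching_files(dirpath, '*.txt') counts only the regular files ending in .txt.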

Answered By: ngeek
import os
# counts every entry in the current directory, subdirectories included
print(len(os.listdir(os.getcwd())))
Answered By: rash
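import os

directory = '<directory path>'
total_con = os.listdir(directory)

files = []

for f_n in total_con:
    # join with the directory, otherwise the name is resolved against the CWD
    if os.path.isfile(os.path.join(directory, f_n)):
        files.append(f_n)

print(len(files))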
Answered By: Mohit Dabas

If you can rely on the operating system's standard shell, you may get the result faster than with pure Python.

Example for Windows (the /S switch makes the count recursive, and /A-D excludes directories):

import os
import subprocess

def get_num_files(path):
    cmd = 'DIR "%s" /A-D /B /S | FIND /C /V ""' % path
    return int(subprocess.check_output(cmd, shell=True))
Answered By: styler

I found another answer that may work as well as the accepted one:

import os

datafiles = []
for root, dirs, files in os.walk(input_path):
    for name in files:
        # keep only text files, with either spelling of the extension
        if os.path.splitext(name)[1] in ('.TXT', '.txt'):
            datafiles.append(os.path.join(root, name))

print(len(datafiles))
Answered By: Ismail

To count all entries, subdirectories included:

import os

lst = os.listdir(directory) # your directory path
number_files = len(lst)
print(number_files)

Only files (excluding subdirectories):

import os

onlyfiles = next(os.walk(directory))[2] # directory is your directory path as a string
print(len(onlyfiles))
Answered By: Guillermo Pereira

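Here is a simple one-line command that I found useful (it needs a Unix-like shell and counts every entry in the current directory, subdirectories included):

import os
print(int(os.popen("ls | wc -l").read()))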
Answered By: Bojan Tunguz

I used glob.iglob for a directory structure similar to

data
└───train
│   └───subfolder1
│   |   │   file111.png
│   |   │   file112.png
│   |   │   ...
│   |
│   └───subfolder2
│       │   file121.png
│       │   file122.png
│       │   ...
└───test
    │   file221.png
    │   file222.png

Both of the following options return 4 (as expected; the subfolders themselves are not counted). Note that recursive=True has no effect here, because the pattern contains no **.

  • len(list(glob.iglob("data/train/*/*.png", recursive=True)))
  • sum(1 for i in glob.iglob("data/train/*/*.png"))
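
The recursive flag of the glob module only changes anything when the pattern contains "**". A minimal sketch of a count at any depth, assuming PNG files as in the tree above (the function name count_png_files is made up for illustration):

```python
import glob
import os


def count_png_files(root):
    """Count .png files under `root` at any depth."""
    # "**" matches zero or more directory levels when recursive=True.
    pattern = os.path.join(root, "**", "*.png")
    return sum(
        1 for path in glob.iglob(pattern, recursive=True) if os.path.isfile(path)
    )
```

Because "**" also matches zero directories, files sitting directly in root are included in the count.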
Answered By: user799188

I did this, and it returned the number of files in the folder (Attack_Data). It works fine.

import os
def fcount(path):
    #Counts the number of files in a directory
    count = 0
    for f in os.listdir(path):
        if os.path.isfile(os.path.join(path, f)):
            count += 1

    return count
path = r"C:\Users\EE EKORO\Desktop\Attack_Data" # folder whose files are counted
print(fcount(path))
Answered By: Sam Ekoro

I am surprised that nobody mentioned os.scandir:

import os

def count_files(path):
    return sum(1 for entry in os.scandir(path) if entry.is_file())
Answered By: qed

If you want to count all files in the directory, including files in subdirectories, a concise and Pythonic way is:

import os

file_count = sum(len(files) for _, _, files in os.walk(r'C:\Dropbox'))
print(file_count)

We use sum, which should be faster than adding the file counts in an explicit loop (timings pending).
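
The timing question can be checked locally with timeit. A minimal sketch, with made-up function names; the results depend heavily on the directory and filesystem, and the walk itself is likely to dominate, so no numbers are claimed here:

```python
import os
import timeit


def count_with_sum(root):
    """Recursive file count using sum() over os.walk()."""
    return sum(len(files) for _, _, files in os.walk(root))


def count_with_loop(root):
    """Same count with an explicit accumulator."""
    total = 0
    for _, _, files in os.walk(root):
        total += len(files)
    return total


if __name__ == "__main__":
    root = "."  # assumption: replace with the directory you care about
    for func in (count_with_sum, count_with_loop):
        seconds = timeit.timeit(lambda: func(root), number=5)
        print(f"{func.__name__}: {seconds:.4f}s for 5 runs")
```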

Answered By: Mr_and_Mrs_D

It is simple:

import os
print(len(list(os.scandir('PATH'))))

Note that this counts every entry in the directory, subdirectories included, not only files: os.scandir() yields one entry per item, list() collects them, and len() returns how many there are.

Answered By: Agha Saad

I agree with the answer provided by @DanielStutzbach: os.listdir() will be slightly more efficient than using glob.glob.

However, as an extra precision: if you want to count only the files in a folder that match a specific pattern, len(glob.glob()) is the more convenient tool. For instance, to count all the PDFs in a folder you can use:

pdfCounter = len(glob.glob(os.path.join(myPath, "*.pdf")))
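
A slightly more defensive variant, as a minimal sketch: it escapes glob metacharacters in the directory path with glob.escape() and skips directories whose names happen to match the pattern (the function name count_by_pattern is made up for illustration):

```python
import glob
import os


def count_by_pattern(directory, pattern="*.pdf"):
    """Count regular files in `directory` whose names match `pattern`."""
    # glob.escape() keeps characters like "[" in the path from being
    # interpreted as part of the pattern.
    matches = glob.glob(os.path.join(glob.escape(directory), pattern))
    return sum(1 for path in matches if os.path.isfile(path))
```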
Answered By: LBes

I solved this problem while counting the files in a Google Drive directory from Google Colab by first changing into the parent directory:

import os
%cd /content/drive/My Drive/
print(len(os.listdir('folder_name/')))

Outside a notebook, the equivalent is:

import os
os.chdir('Desktop/Maheep/')
print(len(os.listdir('folder_name/')))
Answered By: Maheep

A recursive one-liner:

import os

def count_files(path):
    return sum(len(files) for _, _, files in os.walk(path))

count_files('path/to/dir')
Answered By: juan Isaza

An answer with pathlib and without loading the whole list to memory:

from pathlib import Path

path = Path('.')

print(sum(1 for _ in path.glob('*')))  # Files and folders, not recursive
print(sum(1 for _ in path.glob('**/*')))  # Files and folders, recursive

print(sum(1 for x in path.glob('*') if x.is_file()))  # Only files, not recursive
print(sum(1 for x in path.glob('**/*') if x.is_file()))  # Only files, recursive
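
The file-only variants can be wrapped in one helper, as a minimal sketch (the function name count_files and its recursive parameter are made up for illustration). It uses Path.iterdir() for the flat case and Path.rglob("*"), which is equivalent to glob("**/*"), for the recursive one:

```python
from pathlib import Path


def count_files(directory, recursive=False):
    """Count regular files in `directory`, optionally descending into subfolders."""
    base = Path(directory)
    entries = base.rglob("*") if recursive else base.iterdir()
    # Both iterators are lazy, so the full listing is never held in memory.
    return sum(1 for entry in entries if entry.is_file())
```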
Answered By: Paul

Short and simple (note that this counts every entry, subdirectories included):

import os
directory_path = '/home/xyz/'
No_of_files = len(os.listdir(directory_path))

A simple utility function I wrote that makes use of os.scandir() instead of os.listdir().

import os 

def count_files_in_dir(path: str) -> int:
    file_entries = [entry for entry in os.scandir(path) if entry.is_file()]

    return len(file_entries)

The main benefit is that the call to os.path.isfile() is replaced by the is_file() method of each os.DirEntry instance, which also removes the need for os.path.join(DIR, file_name) shown in other answers.
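
Since Python 3.6, os.scandir() can also be used as a context manager, which closes the underlying directory handle as soon as the count is done. A minimal sketch along those lines (the function name and the follow_symlinks parameter of the wrapper are choices made for this example):

```python
import os


def count_files_scandir(path, follow_symlinks=True):
    """Count regular files directly inside `path` without building a list."""
    with os.scandir(path) as entries:
        return sum(
            1
            for entry in entries
            if entry.is_file(follow_symlinks=follow_symlinks)
        )
```

Passing follow_symlinks=False leaves symbolic links out of the count, even when they point to regular files.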

Answered By: Kinyugo

This is an easy solution that counts the number of files in a directory containing sub-folders. It may come in handy:

import os
from pathlib import Path

def count_files(rootdir):
    '''Print the number of files in each subfolder of a directory.'''
    for path in Path(rootdir).iterdir():
        if path.is_dir():
            file_count = len([name for name in os.listdir(path)
                              if os.path.isfile(os.path.join(path, name))])
            print("There are " + str(file_count) + " files in " + str(path.name))


count_files(data_dir) # data_dir is the directory you want files counted.

You should get an output similar to this (with the placeholders changed, of course):

There are {number of files} files in {name of sub-folder1}
There are {number of files} files in {name of sub-folder2}
Answered By: MLDev

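Convert the result to a list, then call len() on it (glob.glob() already returns a list, so list() is only needed for an iterator such as glob.iglob()):

len(list(glob.glob('*')))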

Answered By: Eslamspot

Simpler one:

import os
number_of_files = len(os.listdir(directory))
print(number_of_files)
Answered By: Mayur Gupta

I find that sometimes I don't know whether I will receive filenames or paths to the files, so I printed the output of the os.walk solution:

from pathlib import Path
from typing import Union

def count_number_of_raw_data_point_files(path: Union[str, Path], with_file_prefix: str) -> int:
    import os
    # Path.expanduser() stands in here for the original force_expanduser helper
    path = Path(path).expanduser()

    # only the top level of the directory is inspected
    _, _, files = next(os.walk(path))
    count: int = 0
    for filename in files:
        print(f'-->{filename=}')  # e.g. print -->filename='data_point_99.json'
        if with_file_prefix in filename:
            count += 1
    return count

out:

-->filename='data_point_780.json'
-->filename='data_point_781.json'
-->filename='data_point_782.json'
-->filename='data_point_783.json'
-->filename='data_point_784.json'
-->filename='data_point_785.json'
-->filename='data_point_786.json'
-->filename='data_point_787.json'
-->filename='data_point_788.json'
-->filename='data_point_789.json'
-->filename='data_point_79.json'
-->filename='data_point_790.json'
-->filename='data_point_791.json'
-->filename='data_point_792.json'
-->filename='data_point_793.json'
-->filename='data_point_794.json'
-->filename='data_point_795.json'
-->filename='data_point_796.json'
-->filename='data_point_797.json'
-->filename='data_point_798.json'
-->filename='data_point_799.json'
-->filename='data_point_8.json'
-->filename='data_point_80.json'
-->filename='data_point_800.json'
-->filename='data_point_801.json'
-->filename='data_point_802.json'
-->filename='data_point_803.json'
-->filename='data_point_804.json'
-->filename='data_point_805.json'
-->filename='data_point_806.json'
-->filename='data_point_807.json'
-->filename='data_point_808.json'
-->filename='data_point_809.json'
-->filename='data_point_81.json'
-->filename='data_point_810.json'
-->filename='data_point_811.json'
-->filename='data_point_812.json'
-->filename='data_point_813.json'
-->filename='data_point_814.json'
-->filename='data_point_815.json'
-->filename='data_point_816.json'
-->filename='data_point_817.json'
-->filename='data_point_818.json'
-->filename='data_point_819.json'
-->filename='data_point_82.json'
-->filename='data_point_820.json'
-->filename='data_point_821.json'
-->filename='data_point_822.json'
-->filename='data_point_823.json'
-->filename='data_point_824.json'
-->filename='data_point_825.json'
-->filename='data_point_826.json'
-->filename='data_point_827.json'
-->filename='data_point_828.json'
-->filename='data_point_829.json'
-->filename='data_point_83.json'
-->filename='data_point_830.json'
-->filename='data_point_831.json'
-->filename='data_point_832.json'
-->filename='data_point_833.json'
-->filename='data_point_834.json'
-->filename='data_point_835.json'
-->filename='data_point_836.json'
-->filename='data_point_837.json'
-->filename='data_point_838.json'
-->filename='data_point_839.json'
-->filename='data_point_84.json'
-->filename='data_point_840.json'
-->filename='data_point_841.json'
-->filename='data_point_842.json'
-->filename='data_point_843.json'
-->filename='data_point_844.json'
-->filename='data_point_845.json'
-->filename='data_point_846.json'
-->filename='data_point_847.json'
-->filename='data_point_848.json'
-->filename='data_point_849.json'
-->filename='data_point_85.json'
-->filename='data_point_850.json'
-->filename='data_point_851.json'
-->filename='data_point_852.json'
-->filename='data_point_853.json'
-->filename='data_point_86.json'
-->filename='data_point_87.json'
-->filename='data_point_88.json'
-->filename='data_point_89.json'
-->filename='data_point_9.json'
-->filename='data_point_90.json'
-->filename='data_point_91.json'
-->filename='data_point_92.json'
-->filename='data_point_93.json'
-->filename='data_point_94.json'
-->filename='data_point_95.json'
-->filename='data_point_96.json'
-->filename='data_point_97.json'
-->filename='data_point_98.json'
-->filename='data_point_99.json'
854

Note that you might have to sort the filenames, since os.walk() does not guarantee their order.
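
A plain sort orders these names lexicographically, as in the listing above, where data_point_79.json comes before data_point_790.json and data_point_8.json comes after both. A minimal sketch of sorting by the numeric part instead, assuming names that end in an integer followed by an extension (the helper numeric_suffix is made up for illustration):

```python
import re


def numeric_suffix(filename):
    """Return the trailing integer in names like 'data_point_79.json'."""
    match = re.search(r"(\d+)(?=\.[^.]+$)", filename)
    # Names without a trailing number sort first.
    return int(match.group(1)) if match else -1


names = ["data_point_790.json", "data_point_79.json", "data_point_8.json"]
print(sorted(names))                      # lexicographic order
print(sorted(names, key=numeric_suffix))  # numeric order: 8, 79, 790
```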

Answered By: Charlie Parker