Filter directory when using shutil.copytree?

Question:

Is there a way I can filter a directory by using the absolute path to it?

shutil.copytree(directory,
                target_dir,
                ignore = shutil.ignore_patterns("/Full/Path/To/aDir/Common")) 

This doesn’t seem to work when trying to filter the “Common” directory located under “aDir”. If I do this:

shutil.copytree(directory,
                target_dir,
                ignore = shutil.ignore_patterns("Common"))

It works, but every directory called Common will be filtered in that “tree”, which is not what I want.

Any suggestions?

Thanks.

Asked By: Goles


Answers:

You can make your own ignore function:

shutil.copytree('/Full/Path', 'target',
              ignore=lambda directory, contents: ['Common'] if directory == '/Full/Path/To/aDir' else [])

Or, if you want to be able to call copytree with a relative path:

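import os.path
import shutil

def ignorePath(path):
  path = os.path.abspath(path)
  def ignoref(directory, contents):
    # Return a list (not a generator) so copytree can test membership repeatedly.
    return [f for f in contents if os.path.abspath(os.path.join(directory, f)) == path]
  return ignoref

shutil.copytree('Path', 'target', ignore=ignorePath('/Full/Path/To/aDir/Common'))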

From the docs:

If ignore is given, it must be a callable that will receive as its
arguments the directory being visited by copytree(), and a list of its
contents, as returned by os.listdir(). Since copytree() is called
recursively, the ignore callable will be called once for each
directory that is copied. The callable must return a sequence of
directory and file names relative to the current directory (i.e. a
subset of the items in its second argument); these names will then be
ignored in the copy process. ignore_patterns() can be used to create
such a callable that ignores names based on glob-style patterns.
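
To make the quoted behavior concrete, here is a minimal sketch that records every call copytree() makes to the ignore callable. The names make_recording_ignore, calls, and demo are hypothetical and exist only for this illustration; the tree is built in a temporary directory.

```python
import os
import shutil
import tempfile


def make_recording_ignore(calls):
    """Return an ignore callable that records each (directory, names) call.

    Hypothetical helper for illustration; it never ignores anything.
    """
    def _ignore(directory, names):
        calls.append((directory, sorted(names)))
        return []  # ignore nothing, we only want to observe the calls
    return _ignore


def demo():
    """Build a tiny tree src/aDir/Common, copy it, and return the recorded calls."""
    root = tempfile.mkdtemp()
    src = os.path.join(root, "src")
    os.makedirs(os.path.join(src, "aDir", "Common"))
    calls = []
    shutil.copytree(src, os.path.join(root, "dst"),
                    ignore=make_recording_ignore(calls))
    shutil.rmtree(root)
    # Report directories relative to src so the result does not depend on the machine.
    return [(os.path.relpath(d, src), names) for d, names in calls]
```

The callable is invoked once per visited directory, and the names it receives are bare entry names, which is why the directory argument has to be consulted to filter by full path.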

Answered By: phihag

The API for shutil.ignore_patterns() doesn’t support absolute paths, but it is trivially easy to roll your own variant.
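
A short sketch of why the absolute-path pattern from the question never matches: the callable built by ignore_patterns() compares patterns against bare entry names only. The helper names_ignored is a hypothetical wrapper added for this example.

```python
import shutil

# ignore_patterns() builds a callable that matches bare names with fnmatch;
# the directory argument is not consulted at all.
by_name = shutil.ignore_patterns("Common")
by_abs_path = shutil.ignore_patterns("/Full/Path/To/aDir/Common")


def names_ignored(ignore, directory, names):
    """Hypothetical helper: call an ignore callable and return a sorted list."""
    return sorted(ignore(directory, names))
```

Calling names_ignored(by_name, ...) reports "Common" for any directory, while by_abs_path reports nothing, because no bare name ever equals a full path.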

As a starting point, look at the source code for ignore_patterns():

def ignore_patterns(*patterns):
    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns
    that are used to exclude files"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns

You can see that it returns a function that accepts a path and a list of names, and that this inner function returns the set of names to ignore. To support your use case, create your own similar function that takes advantage of the path argument, then pass it as the ignore parameter in the call to copytree().
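
One possible shape for such a function, as a sketch only: ignore_absolute_patterns is a hypothetical name (it is not part of shutil), and it assumes you want glob-style matching against each entry's absolute path. Note that fnmatch lets `*` match path separators too, so a pattern like /Full/Path/* matches entries at any depth below /Full/Path.

```python
import fnmatch
import os


def ignore_absolute_patterns(*patterns):
    """Like shutil.ignore_patterns, but match each entry's absolute path.

    Hypothetical helper for illustration, not a shutil API.
    """
    # Normalize the patterns once so both sides of the comparison are absolute.
    abs_patterns = [os.path.abspath(p) for p in patterns]

    def _ignore(path, names):
        ignored = set()
        for name in names:
            full = os.path.abspath(os.path.join(path, name))
            if any(fnmatch.fnmatch(full, pat) for pat in abs_patterns):
                ignored.add(name)
        return ignored
    return _ignore
```

It would be passed the same way as the built-in helper, for example ignore=ignore_absolute_patterns('/Full/Path/To/aDir/Common').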

Alternatively, don’t use shutil as-is. The source code is short and sweet, so it isn’t hard to cut, paste, and customize.

Answered By: Raymond Hettinger

You’ll want to make your own ignore function, which checks the current directory being processed and returns a list containing ‘Common’ only if the dir is ‘/Full/Path/To/aDir’.

def ignore_full_path_common(dir, files):
    if dir == '/Full/Path/To/aDir':
        return ['Common']
    return []

shutil.copytree(directory, target_dir, ignore=ignore_full_path_common)
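
The plain string comparison above only matches when copytree() is given the source in exactly that form (absolute, no trailing slash). A hedged variant that normalizes both sides first; ignore_child_of is a hypothetical name introduced for this sketch.

```python
import os


def ignore_child_of(parent, name):
    """Return an ignore callable that skips `name` only directly under `parent`.

    Hypothetical helper; both paths are normalized with os.path.abspath.
    """
    parent = os.path.abspath(parent)

    def _ignore(directory, names):
        if os.path.abspath(directory) == parent and name in names:
            return [name]
        return []
    return _ignore
```

This keeps the same idea but tolerates relative sources, trailing slashes, and ".." components in the directory that copytree() reports.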

Answered By: Ethan Furman

Many thanks for the answer. It helped me design my own ignore_patterns() function for a slightly different requirement. I am pasting the code here in case it helps someone.

Below is an ignore_patterns() variant that excludes multiple files/directories by their absolute paths.

myExclusionList: a list of the files/directories to exclude while copying. The list can contain wildcard patterns. Paths in the list are relative to the srcpath provided. For example:

[EXCLUSION LIST]

java/app/src/main/webapp/WEB-INF/lib/test
unittests
python-buildreqs/apps/abc.tar.gz
3rd-party/jdk*

The code is pasted below:

import glob
import os
from os.path import join
from shutil import copytree, Error

def copydir(srcpath, dstpath, myExclusionList, log):

    patternlist = []
    try:
        # Form the absolute paths of the files/directories to be excluded
        for pattern in myExclusionList:
            # abspath keeps the comparison in ignore_patterns_override valid for a relative srcpath
            tmpsrcpath = os.path.abspath(join(srcpath, pattern))
            patternlist.extend(glob.glob(tmpsrcpath)) # myExclusionList can contain wildcard patterns, hence glob
        copytree(srcpath, dstpath, ignore=ignore_patterns_override(*patternlist))
    except (IOError, os.error) as why:
        log.warning("Unable to copy %s to %s because %s", srcpath, dstpath, str(why))
        # catch the Error from the recursive copytree so that we can
        # continue with other files
    except Error as err:
        log.warning("Unable to copy %s to %s because %s", srcpath, dstpath, str(err))


# [START: Ignore Patterns]
# Modified Function to ignore patterns while copying.
# Default Python Implementation does not exclude absolute path
# given for files/directories

def ignore_patterns_override(*patterns):
    """Function that can be used as copytree() ignore parameter.
    Patterns is a sequence of glob-style patterns
    that are used to exclude files/directories"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for f in names:
            for pattern in patterns:
                if os.path.abspath(join(path, f)) == pattern:
                    ignored_names.append(f)
        return set(ignored_names)
    return _ignore_patterns

# [END: Ignore Patterns]

Answered By: Ameen Ali Shaikh

A platform-independent version. The paths are glob patterns, for example [".gitkeep", "app/build", "*.txt"]:

    import pathlib

    def callbackIgnore(paths):
        """ callback for shutil.copytree """
        def ignoref(directory, contents):
            arr = [] 
            for f in contents:
                for p in paths:
                    if (pathlib.PurePath(directory, f).match(p)):
                        arr.append(f)
            return arr
    
        return ignoref
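
A small sketch of how PurePath.match() treats the example patterns, since its semantics decide what gets ignored. The helper matches_any and the sample directory names are hypothetical. A relative pattern is matched from the right, so "app/build" matches any .../app/build in the tree, not only one at the top level.

```python
import pathlib


def matches_any(directory, name, patterns):
    """Hypothetical helper: True if directory/name matches any glob pattern."""
    candidate = pathlib.PurePath(directory, name)
    return any(candidate.match(p) for p in patterns)


# The example patterns from the answer above.
patterns = [".gitkeep", "app/build", "*.txt"]
```

If a directory should be excluded at one location only, an absolute pattern (which must match the whole path) or an explicit directory check is the safer choice.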

Answered By: Alexufo