How to copy files if their content contains a certain string (Python)?

Question:

I have a directory of text files. Here’s what I’m trying to do:

  1. Search for files with the word ‘head’ in their names.
  2. Search for a particular string in the content of the files found.
  3. If they have the string, copy the files into another directory.

I am new to Python. I tried the following code but received the error shown below.

import shutil
import os

input_path = r'source path'
output_path = r'target path'

for file in os.listdir(input_path):
    input_file = os.path.join(input_path, file)
    output_file = os.path.join(output_path, file)
    if 'head' in file.lower():
        with open (input_file, 'r') as f:
            text = f.read_text()
            if 'search_string' in text:
                shutil.copy2(input_file, output_file)

            else:
                print('None found!')

The error:

AttributeError: '_io.TextIOWrapper' object has no attribute 'read_text'

I looked in the docs and found read_text, so I wonder why I get this error.

UPDATE

I tried read(). It works for some files, but then I get the following error:

Traceback (most recent call last):
  File "path\copy files.py", line 12, in <module>
    text = f.read()
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 27: character maps to <undefined>

Any idea how I can fix that?

Asked By: Leila


Answers:

read_text() is a method of pathlib.Path objects. You have a file object, so use .read():

text = f.read()
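
For comparison, here is a minimal sketch of the Path-based route, where read_text() does exist. The helper name file_contains and the UTF-8 default are assumptions for illustration; they do not appear in the question.

```python
from pathlib import Path


def file_contains(path, needle, encoding="utf-8"):
    """Return True if the text file at `path` contains `needle`."""
    # Path.read_text() opens the file, reads it fully, and closes it.
    return needle in Path(path).read_text(encoding=encoding)
```

With this, the check inside the loop could become `if file_contains(input_file, 'search_string'):`.
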
Answered By: AKX

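You should read the file content with read(); read_text is not a method of file objects. Passing encoding='utf-8' also avoids the cp1252 decoding error from the update, assuming the files are actually UTF-8: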

with open(input_file, 'r', encoding='utf-8') as f:
    text = f.read()
    if 'search_string' in text:
        shutil.copy2(input_file, output_file)
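
Putting the three steps from the question together, a hedged sketch might look like the following. The function name copy_matching, its parameters, and the use of errors='replace' (which substitutes U+FFFD for undecodable bytes instead of raising) are assumptions for illustration. If the files are not UTF-8, pass their real encoding instead.

```python
import os
import shutil


def copy_matching(input_path, output_path, search_string,
                  name_part="head", encoding="utf-8"):
    """Copy files whose name contains `name_part` and whose text
    contains `search_string`. Return the names that were copied."""
    os.makedirs(output_path, exist_ok=True)
    copied = []
    for name in os.listdir(input_path):
        input_file = os.path.join(input_path, name)
        # Step 1: keep only regular files with `name_part` in the name.
        if name_part not in name.lower() or not os.path.isfile(input_file):
            continue
        # Step 2: read the text; undecodable bytes become U+FFFD
        # rather than raising UnicodeDecodeError.
        with open(input_file, "r", encoding=encoding, errors="replace") as f:
            text = f.read()
        # Step 3: copy the file (with metadata) if the string is present.
        if search_string in text:
            shutil.copy2(input_file, os.path.join(output_path, name))
            copied.append(name)
    return copied
```

A call such as `copy_matching(input_path, output_path, 'search_string')` returns the list of copied file names, so an empty list plays the role of the 'None found!' message.
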
Answered By: svfat