Replacing text in a file with Python

Question:

I’m new to Python. I want to open a file and replace every instance of certain words with a given replacement. For example, replace every ‘zero’ with ‘0’, ‘temp’ with ‘bob’, and ‘garbage’ with ‘nothing’.

I had first started to use this:

for line in fileinput.input(fin):
        fout.write(line.replace('zero', '0'))
        fout.write(line.replace('temp','bob'))
        fout.write(line.replace('garbage','nothing'))

but I don’t think this is even remotely the correct way to do it. I then thought about using if statements to check whether the line contains any of these items and, if it does, replacing whichever one it contains, but from what I know of Python this isn’t an ideal solution either. I would love to know what the best way to do this is. Thanks in advance!

Asked By: shadonar

Answers:

The essential approach is to

  • read() the data,
  • call data = data.replace(...) as often as you need, and then
  • write() the result.

Whether you read and write the whole file at once or in smaller parts is up to you; let the expected file size decide.

read() can be replaced by iterating over the file object.
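The three steps above can be sketched as a small function. This is only a minimal sketch that assumes the file fits comfortably in memory; the function name replace_in_file and its parameters are made up for illustration.

```python
def replace_in_file(src_path, dst_path, replacements):
    """Read src_path, apply every replacement, write the result to dst_path."""
    # Step 1: read the whole file into one string
    with open(src_path) as infile:
        data = infile.read()
    # Step 2: str.replace returns a new string, so reassign each time
    for old, new in replacements.items():
        data = data.replace(old, new)
    # Step 3: write the modified text out
    with open(dst_path, 'w') as outfile:
        outfile.write(data)
    return data
```

For very large files, the same loop over replacements could be applied line by line while iterating over the file object instead of calling read().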

Answered By: glglgl

This should do it

replacements = {'zero':'0', 'temp':'bob', 'garbage':'nothing'}

with open('path/to/input/file') as infile, open('path/to/output/file', 'w') as outfile:
    for line in infile:
        for src, target in replacements.items():
            line = line.replace(src, target)
        outfile.write(line)

EDIT: To address Eildosa’s comment, if you wanted to do this without writing to another file, one straightforward option is to read your entire source file into memory first:

lines = []
with open('path/to/input/file') as infile:
    for line in infile:
        for src, target in replacements.items():
            line = line.replace(src, target)
        lines.append(line)
with open('path/to/input/file', 'w') as outfile:
    for line in lines:
        outfile.write(line)

Edit: If you are using Python 2.x, use replacements.iteritems() instead of replacements.items()
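Since the question started from fileinput, it may be worth noting that the standard library's fileinput.input() accepts inplace=True. In that mode standard output is redirected into the file being read (the original is moved to a backup file while the loop runs), so the file can be rewritten line by line without holding all of it in memory. A minimal sketch, assuming Python 3; the helper name replace_inplace is hypothetical:

```python
import fileinput

def replace_inplace(path, replacements):
    """Rewrite the file at path line by line, applying every replacement."""
    # With inplace=True, anything printed inside the loop goes into the file
    with fileinput.input(files=[path], inplace=True) as f:
        for line in f:
            for src, target in replacements.items():
                line = line.replace(src, target)
            # line already ends with a newline, so suppress print's own
            print(line, end='')
```

It would be called as replace_inplace('path/to/input/file', replacements).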

Answered By: inspectorG4dget

I might consider using a dict and re.sub for something like this:

import re
repldict = {'zero':'0', 'one':'1' ,'temp':'bob','garbage':'nothing'}
def replfunc(match):
    return repldict[match.group(0)]

regex = re.compile('|'.join(re.escape(x) for x in repldict))
with open('file.txt') as fin, open('fout.txt','w') as fout:
    for line in fin:
        fout.write(regex.sub(replfunc,line))

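This has a slight advantage over chained replace calls in that it is a bit more robust to overlapping matches: the text is scanned once, so the output of one replacement is never fed into another.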
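To make the overlap point concrete, here is a small sketch comparing chained str.replace calls with a single-pass re.sub. The swap dictionary is a made-up example in which one replacement's output is another's search key, and the function names are illustrative only. It assumes Python 3.7+, where dicts keep insertion order.

```python
import re

swap = {'zero': '0', '0': 'zero'}

def chained_replace(text, repl):
    # Each pass sees the output of the previous pass
    for src, target in repl.items():
        text = text.replace(src, target)
    return text

def single_pass(text, repl):
    # Longest keys first so a shorter key cannot shadow a longer one
    keys = sorted(repl, key=len, reverse=True)
    regex = re.compile('|'.join(re.escape(k) for k in keys))
    return regex.sub(lambda m: repl[m.group(0)], text)

print(chained_replace('zero and 0', swap))  # zero and zero
print(single_pass('zero and 0', swap))      # 0 and zero
```

With chained calls, the '0' produced by the first replacement is turned back into 'zero' by the second; the regex version replaces each match exactly once.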

Answered By: mgilson

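A faster way of writing it would be:

replacements = {'zero':'0', 'temp':'bob', 'garbage':'nothing'}
# Read the whole file first ("in" is a reserved word, so use another name)
with open('path/to/input/file') as infile:
    data = infile.read()
for src, target in replacements.items():
    data = data.replace(src, target)
# Reopen the same path for writing; the with block closes the file
with open('path/to/input/file', 'w') as outfile:
    outfile.write(data)

This eliminates the per-line loop that the other answers use and may speed things up for longer files, at the cost of holding the whole file in memory.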

Answered By: Matt Olan

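To read from standard input, write ‘code.py’ as follows:

import sys

rep = {'zero':'0', 'temp':'bob', 'garbage':'nothing'}

for line in sys.stdin:
    # On Python 2.x, use rep.iteritems() instead
    for k, v in rep.items():
        line = line.replace(k, v)
    # line already ends with a newline, so write it as-is
    sys.stdout.write(line)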

Then, execute the script with redirection or piping (http://en.wikipedia.org/wiki/Redirection_(computing))

python code.py < infile > outfile

Answered By: satomacoto

If your file is short (or at least not extremely long), you can use the following snippet to replace text in place:

# Replace variables in file
with open('path/to/in-out-file', 'r+') as f:
    content = f.read()
    f.seek(0)
    f.truncate()
    f.write(content.replace('replace this', 'with this'))

Answered By: John Calcote

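This is a short and simple example I just used. Note that str.replace() returns a new string rather than modifying the original, so the result has to be used, and that it matches substrings, not whole words.

If:

fp = open("file.txt", "w")

Then:

fp.write(line.replace('is', 'now'))
# "This is me" becomes "Thnow now me" (the "is" inside "This" is replaced too)

Not:

line.replace('is', 'now')
fp.write(line)
# "This is me" is written unchanged, because the result of replace() was discarded

Answered By: AmazingDayToday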