How to remove two chars from the beginning of a line
Question:
I’m a complete Python noob. How can I remove two characters from the beginning of each line in a file? I was trying something like this:
#!/Python26/
import re
f = open('M:/file.txt')
lines=f.readlines()
i=0;
for line in lines:
line = line.strip()
#do something here
Answers:
line = line[2:]
String slicing will help you:
>>> a="Some very long string"
>>> a[2:]
'me very long string'
Just use line[2:]
You were off to a good start. Try this in your loop:
for line in lines:
line = line[2:]
# do something here
The [2:] is called “slice” syntax, it essentially says “give me the part of this sequence which begins at index 2 and continues to the end (since no end point was specified after the colon).
Instead of using a for loop, you might be happier with a a list comprehension:
[line[2:] for line in lines]
Just as a curiosity, do check the cut
unix tool.
$ cut -c2- filename
The slicing syntax for -c is quite similar to python’s.
Just as a tip, you can shorten your program to
for line in open('M:/file.txt'):
line = line[2:]
And if you need to carry the line number too, use
for i, line in enumerate(open('M:/file.txt.')):
line = line[2:]
for line in open("file"):
print line[2:]
If you want to modify a file’s contents, not just process the string, try fileinput
‘s inplace
parameter:
# strip_2_chars.py
import fileinput
for line in fileinput.input(inplace=1):
print line[2:]
Then, on the command line:
python strip_2_chars.py m:file.txt
You’ll find python has some great ways to deal with strings. Some other useful string methods you might want to check out are ones like split(), replace(), and startswith()/endswith().
It may be interesting to know that there is a subtle, yet important difference between:
file = open( filename )
lines = file.readlines()
for line in lines:
do something
and
file = open( filename )
for line in file:
do something
The first solution (with readlines
) will load the whole contents of the file in memory and return a python list (of strings). On the other hand, the second solution makes use of something that’s called a iterator
. This will in fact move the pointer in the file as needed and return a string. This has an important benefit: The file is not loaded into memory. For small files, both approaches are OK. But as long as you only work with the file line-by-line, I suggest to use the iterator behavior directly.
So my solution would be:
infile = open( filename )
outfile = open( "%s.new" % filename, "w" )
for line in infile:
outfile.write( line[2:] )
infile.close()
outfile.close()
Come to think of it: If it’s a non-ascii file (for example latin-1 encoded), consider using codecs.open. Otherwise you may have a nasty surprise as you may accidentally cut a multibyte character in half 😉
However, if you don’t need python, and the only thing you need to do is crop the first two characters from the file, then the most efficient way to do it is kch’s suggestion and use cut
:
cat filename | cut -d2- > newfile
For these kinds of quick-and-dirty file operations I always have cygwin installed on my non-linux boxen. But I believe there’s a set of windows binaries for these tools as well which would perform faster than in cygwin iirc.
Since you are learning Python, I would like to add that, given the tools offered by Python (slicing, split, replace and all the other which have been mention), you will find that for many tasks regex are overkill. So the
import re
at the beginning of your example may or may not be strictly needed.
Not very efficient but yes it does work. Looks pretty complex.
print line[-(len(line)-2):]
I’m a complete Python noob. How can I remove two characters from the beginning of each line in a file? I was trying something like this:
#!/Python26/
import re
f = open('M:/file.txt')
lines=f.readlines()
i=0;
for line in lines:
line = line.strip()
#do something here
line = line[2:]
String slicing will help you:
>>> a="Some very long string"
>>> a[2:]
'me very long string'
Just use line[2:]
You were off to a good start. Try this in your loop:
for line in lines:
line = line[2:]
# do something here
The [2:] is called “slice” syntax, it essentially says “give me the part of this sequence which begins at index 2 and continues to the end (since no end point was specified after the colon).
Instead of using a for loop, you might be happier with a a list comprehension:
[line[2:] for line in lines]
Just as a curiosity, do check the cut
unix tool.
$ cut -c2- filename
The slicing syntax for -c is quite similar to python’s.
Just as a tip, you can shorten your program to
for line in open('M:/file.txt'):
line = line[2:]
And if you need to carry the line number too, use
for i, line in enumerate(open('M:/file.txt.')):
line = line[2:]
for line in open("file"):
print line[2:]
If you want to modify a file’s contents, not just process the string, try fileinput
‘s inplace
parameter:
# strip_2_chars.py
import fileinput
for line in fileinput.input(inplace=1):
print line[2:]
Then, on the command line:
python strip_2_chars.py m:file.txt
You’ll find python has some great ways to deal with strings. Some other useful string methods you might want to check out are ones like split(), replace(), and startswith()/endswith().
It may be interesting to know that there is a subtle, yet important difference between:
file = open( filename )
lines = file.readlines()
for line in lines:
do something
and
file = open( filename )
for line in file:
do something
The first solution (with readlines
) will load the whole contents of the file in memory and return a python list (of strings). On the other hand, the second solution makes use of something that’s called a iterator
. This will in fact move the pointer in the file as needed and return a string. This has an important benefit: The file is not loaded into memory. For small files, both approaches are OK. But as long as you only work with the file line-by-line, I suggest to use the iterator behavior directly.
So my solution would be:
infile = open( filename )
outfile = open( "%s.new" % filename, "w" )
for line in infile:
outfile.write( line[2:] )
infile.close()
outfile.close()
Come to think of it: If it’s a non-ascii file (for example latin-1 encoded), consider using codecs.open. Otherwise you may have a nasty surprise as you may accidentally cut a multibyte character in half 😉
However, if you don’t need python, and the only thing you need to do is crop the first two characters from the file, then the most efficient way to do it is kch’s suggestion and use cut
:
cat filename | cut -d2- > newfile
For these kinds of quick-and-dirty file operations I always have cygwin installed on my non-linux boxen. But I believe there’s a set of windows binaries for these tools as well which would perform faster than in cygwin iirc.
Since you are learning Python, I would like to add that, given the tools offered by Python (slicing, split, replace and all the other which have been mention), you will find that for many tasks regex are overkill. So the
import re
at the beginning of your example may or may not be strictly needed.
Not very efficient but yes it does work. Looks pretty complex.
print line[-(len(line)-2):]