Pythonic way to ignore for loop control variable

Question:

A Python program I’m writing is to read a set number of lines from the top of a file, and the program needs to preserve this header for future use. Currently, I’m doing something similar to the following:

header = ''
header_len = 4
for i in range(1, header_len):
    header += file_handle.readline()

Pylint complains that I’m not using the variable i. What would be a more pythonic way to do this?

Edit: The purpose of the program is to intelligently split the original file into smaller files, each of which contains the original header and a subset of the data. So, I need to read and preserve just the header before reading the rest of the file.

Asked By: GreenMatt

||

Answers:

May be this:

header_len = 4
header = open("file.txt").readlines()[:header_len]

But, it will be troublesome for long files.

Answered By: mshsayem

I’m not sure what the Pylint rules are, but you could use the ‘_’ throwaway variable name.

header = ''
header_len = 4
for _ in range(1, header_len):
    header += file_handle.readline()
Answered By: David Claridge
import itertools

header_lines = list(itertools.islice(file_handle, header_len))
# or
header = "".join(itertools.islice(file_handle, header_len))

Note that with the first, the newline chars will still be present, to strip them:

header_lines = list(n.rstrip("n")
                    for n in itertools.islice(file_handle, header_len))
Answered By: Roger Pate

My best answer is as follows:

file test.dat:

This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6
This is line 7
This is line 8
This is line 9

Python script:

f = open('test.dat')
nlines = 4
header = "".join(f.readline() for _ in range(nlines))

Output:

>>> header
'This is line 1nThis is line 2nThis is line 3nThis is line 4n'

Notice that you don’t need to call any modules; also that you could use any dummy variable in place of _ (it works with i, or j, or ni, or whatever) but I recomend you don’t (to avoid confusion). You could strip the newline characters (though I don’t recommend you do – this way you can distinguish among lines) or do anything that you can do with strings in Python.

Notice that I did not provide a mode for opening the file, so it defaults to “read only” – this is not Pythonic; in Python “explicit is better than implicit”. Finally, nice people close their files; in this case it is automatic (because the script ends) but it is best practice to close them using f.close().

Happy Pythoning.

Edit: As pointed out by Roger Pate the square brackets are unnecessary in the list comprehension, thereby reducing the line by two characters. The original script has been edited to reflect this.

Answered By: Escualo
s=""
f=open("file")
for n,line in enumerate(f):
  if n<=3 : s=s+line
  else:
      # do something here to process the rest of the lines          
print s
f.close()
Answered By: ghostdog74

I do not see any thing wrong with your solution, may be just replace i with _, I also do not like invoking itertools everywhere where simpler solution will work, it is like people using jQuery for trivial javascript tasks. anyway just to have itertools revenge here is my solution

as you want to read whole file anyway line by line, why not just first read header and after that do whatever you want to do

header = ''
header_len = 4

for i, line in enumerate(file_handle):
    if i < header_len:
        header += line
    else:
        # output chunks to separate files
        pass

print header
Answered By: Anurag Uniyal

What about:

header = []
for i,l in enumerate(file_handle):
    if i <= 3: 
         header += l
         continue
    #proc rest of file here
Answered By: Claudiu
f = open('fname')
header = [next(f) for _ in range(header_len)]

Since you’re going to write header back to the new files, you don’t need to do anything with it. To write it back to the new file:

open('new', 'w').writelines(header + list_of_lines)

if you know the number of lines in the old file, list_of_lines would become:

list_of_lines = [next(f) for _ in range(chunk_len)]
Answered By: SilentGhost

One problem with using _ as a dummy variable is that it only solves the problem on one level, consider something like the following.

def f(n, m):
"""A function to run g() n times and run h() m times per g."""
    for _ in range(n):
        g()
        for _ in range(m):
            h()
    return 0

This function works fine but the _ iterator over m runs is problematic as it may conflict with the upper _. In any case PyCharm is complaining about this kind of syntax.

So I would argue that _ is not as “throwaway” as was suggested before.

Perhaps you might like to just create a function to do it!

def run(f, n, *args):
    """Runs f with the arguments from the args tuple n times."""
    for _ in range(n):
        f(*args)

e.g. you could use it like this:

>>> def ft(x, L):
...     L.append(x)

>>> a = 7
>>> nums = [4, 1]
>>> run(ft, 10, a, nums)
>>> nums
[4, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7]
Answered By: Gregory Fenn
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.