Reading .HEAD file in Python

Question:

I have a dataset of energy consumption for several households. The data is stored in .txt files, which I can read easily in Python, but the header is stored in a separate file with the extension .HEAD.

So for each building, I have something like this:

processed-H01-Accounts-3-31-power.HEAD
processed-H01-Accounts-3-31-power-CLEAN.txt

Inside the .HEAD file it looks like this:

# Created by Octave 3.8.0, Tue Jul 29 13:44:58 2014 BST
# name: h
# type: sq_string
# elements: 1
# length: 3051
timestamp   timestampWithDST    "Loughborough03,LBORO-SMART-020,00-0D-6F-00-00-F8-5C-A1,Freezer(Kitchen/utility room,Downstairs) / No of plugs(Landing,Upstairs)"   "Loughborough03,LBORO-SMART-032,00-0D-6F-00-00-F9-2C-9D,Fridge(Kitchen/utility room,Downstairs) / FridgeFreezer(Kitchen/utility room,Downstairs)"   "Loughborough03,LBORO-SMART-033,00-0D-6F-00-00-F9-2D-31,Battery Charger(Garage/Shed,Downstairs)"    "Loughborough03,LBORO-SMART-022,00-0D-6F-00-00-F9-2C-D5,Toaster(Kitchen/utility room,Downstairs)"   "Loughborough03,LBORO-SMART-027,00-0D-6F-00-00-F8-9F-32,Lamp 1(Bedroom 2,Upstairs)" "Loughborough03,LBORO-SMART-035,00-0D-6F-00-00-F8-5C-07,Computing Equipment(Bedroom 4,Upstairs)"    "Loughborough03,LBORO-SMART-021,00-0D-6F-00-00-F8-5B-FA,Microwave(Kitchen/utility room,Downstairs)" "Loughborough03,LBORO-SMART-029,00-0D-6F-00-00-F8-BE-1B,TV(Back Room,Downstairs) / Cable Decoder(Back Room,Downstairs) / Stereo(Back Room,Downstairs)"  "Loughborough03,LBORO-SMART-016,00-0D-6F-00-00-F9-2B-C6,Computing Equipment / Laptop(Front Room,Downstairs)"    "Loughborough03,LBORO-MET-010,00-0D-6F-00-00-C1-43-06,Small Power Down" "Loughborough03,LBORO-SMART-034,00-0D-6F-00-00-F8-BE-33,Dishwasher(Kitchen/utility room,Downstairs)"    "Loughborough03,LBORO-MET-008,00-0D-6F-00-00-C1-35-E1,Mains 1"

The last row holds the column names of my dataset. I need to read these files and put them together in Python to do my modelling. Is there a way to convert this file format to a list in Python?

Thanks

Asked By: Ashkan Lotfipoor


Answers:

Convert the .HEAD file to a list by splitting the last row on the tab character '\t':

with open('processed-H01-Accounts-3-31-power.HEAD', 'r') as f:
    lines = f.readlines()
# Strip the trailing newline, then split the last row on tabs
column_names = lines[-1].rstrip('\n').split('\t')
print(column_names)
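
If the quotation marks around the appliance names are unwanted, or the file ends with blank lines, a plain split may not be quite enough. The sketch below is one possible approach rather than a definitive one. It assumes the names are tab-separated and sit on the last non-blank line that does not start with '#'. The function names parse_head_lines and read_head_columns are made up for illustration; csv.reader is the standard-library parser.

```python
import csv


def parse_head_lines(lines):
    """Return the column names from the lines of an Octave-style .HEAD file.

    Assumption: the names sit on the last non-blank line that does not
    start with '#', separated by tabs, with some names in double quotes.
    """
    candidates = [ln.rstrip("\r\n") for ln in lines]
    candidates = [ln for ln in candidates if ln.strip() and not ln.startswith("#")]
    if not candidates:
        raise ValueError("no header line found")
    # csv.reader understands the double-quoted names and drops the quotes
    return next(csv.reader([candidates[-1]], delimiter="\t"))


def read_head_columns(path):
    """Open a .HEAD file and return its column names as a list."""
    with open(path, "r", newline="") as f:
        return parse_head_lines(f)
```

Calling read_head_columns('processed-H01-Accounts-3-31-power.HEAD') should then give a list of names without the surrounding quotes, provided the file really is tab-separated.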
Answered By: Petro

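In your case, because you are only interested in the last line of the file, you can go to the end of the file and read ONLY the last line, then split it on the tab character. Seeking to the end and calling readline() straight away returns an empty string, so step backwards to the previous newline first. The file has to be opened in binary mode for the backward seeks, and this assumes the header is the final line, with no blank lines after it:

with open('processed-H01-Accounts-3-31-power.HEAD', 'rb') as f:
  # Move the file pointer to just before the last byte of the file
  f.seek(-2, 2)

  # Step backwards until the newline that precedes the last line
  while f.read(1) != b'\n':
    f.seek(-2, 1)

  # Read the last line of the file
  last_line = f.readline().decode()

# Strip the line ending, then split the line by the tab character
columns = last_line.rstrip('\r\n').split('\t')

print(columns)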
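
To put the names together with the data, as the question asks, something along these lines might work. It is only a sketch: the layout of the -CLEAN.txt file is not shown in the question, so the code assumes one record per line, values separated by whitespace, and no header row in the data file. The name label_rows is hypothetical.

```python
def label_rows(column_names, data_lines):
    """Pair each whitespace-separated data row with the column names.

    Assumption: one record per line, values separated by whitespace,
    and no header row inside the data file itself.
    """
    rows = []
    for line in data_lines:
        values = line.split()
        if not values:
            continue  # skip blank lines
        if len(values) != len(column_names):
            raise ValueError(
                f"expected {len(column_names)} values, got {len(values)}"
            )
        rows.append(dict(zip(column_names, values)))
    return rows
```

An open file object can be passed as data_lines. The values are left as strings, so convert them to numbers as needed for the modelling.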
Answered By: Harith