Convert lines of values in text file to a CSV

Question:

Here is my data from a text file:

foo no 3.61 9.51
bar -26 67 2.35 0.50 70
baz dxo 6 2.37
quez -9.07 51
udx 9.66 9.66
scz -8
pdu hk 0.50 -5.34 -13 0.50
ytrs bl 1.27 -63 -0.66
xle 29 1.27 -8.19
drl -9.68 8.84 -29 -13 -86
kos -5.34 51 91
wen di 9.84 -2.35 88

and I need to output dataframe like this:
enter image description here

Asked By: mackuo

||

Answers:

CSVs aren’t much different from the format you already have. The only difference is that you need a header line and each value has to be separated by commas, not spaces.

Try this:

lines = [line.split(' ') for line in open('test.txt')]
with open('test.csv', 'w') as f:
    f.write(',' + ','.join(f'col_{i}' for i in range(1, len(max(lines, key=len)))) + 'n')
    f.writelines(','.join(line) for line in lines)

This code reads each line from the file test.txt, then opens a new CSV file test.csv. Then, it writes a header line with just enough columns for the longest line. Finally, it writes each line as comma separated.

Note that this won’t handle the lines that have values with spaces in them. For those edge cases, you could either change how convert it to a CSV, or just change those values to not have spaces to begin with. Changing the data is likely a better way, as having spaces can cause other issues later on.

Answered By: Michael M.

Michael M.’s answer is very good, however the question also allows for spaces in the first column. This is a slight modification to that to demonstrate the principle

# Get each line from the file and split it by spaces
lines = []
for line in open('test.txt'):
    # Get index of first character not in lowercase alphabet or a space
    # Store in variable i
    i = 0
    while line[i] in 'abcdefghijklmnopqrstuvwxyz ':
        i += 1
    # add list containing line up to i and
    # the numbers that have been split from the line after i
    lines.append([line[:i-1]] + line[i:].split(' '))

# Write the resulting data to a .csv file
# Copied from Michael's answer
with open('test.csv', 'w') as f:
    f.write(',' + ','.join(f'col_{i}' for i in range(1, len(max(lines, key=len)))) + 'n')
    f.writelines(','.join(line) for line in lines)
Answered By: not a tshirt
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.