Need to exchange two sets of tabular data in a TXT file using Python

Question:

I have a set of coordinate data (saved in a txt file), like this:

Number  N coordinates   E coordinates   Height  Code
1       5111945.980     444258.900      131.300 C
2       5265566.655     443665.554      110.311 BR  
...

It goes on for several dozen rows like that.
I need to swap the N coordinates column with the E coordinates column (including the header titles, of course), while everything else stays the same. For example, this would be the desired result in an output txt file:

Number  E coordinates   N coordinates   Height  Code
1       444258.900      5111945.980     131.300 C
2       443665.554      5265566.655     110.311 BR 

For now, I only have the bit for printing out the table data as it shows in txt:

# open the source file
myfile = open("sometext.txt", "r")

# print out the data, to check if output working
for line in myfile:
    print(line)

The data is tab-separated, as far as I can see.

I would like to solve this without external libraries, only default Python and its standard library.

Asked By: Wolf359


Answers:

One way to do this is with the pandas library together with StringIO from the standard io module. Note that pandas is a third-party library that you need to install with pip install pandas.

As you stated in the comments, your txt file is tab-delimited, which means \t is what separates your columns. Here is one way to do it:

Import libraries and load your file to a variable:

import pandas as pd 
from io import StringIO

# dir_path is the folder that contains the file; it is assumed to be defined already
txt_path = dir_path + "/sample_txt_file.txt"
with open(txt_path, 'r') as file:
    string = file.read()

With your dummy data, the variable string will look like this:

'Number\tN coordinates\tE coordinates\tHeight\tCode\n1 \t5111945.980\t444258.900\t131.300\tC\n2 \t5265566.655\t443665.554\t110.311\tBR'

Read it as a dataframe:

# dtype=str keeps values such as 131.300 as text, so trailing zeros are not dropped
df = pd.read_csv(StringIO(string), sep='\t', dtype=str)

Reorder the columns by listing them in the order you want:

df = df[['Number', 'E coordinates', 'N coordinates', 'Height', 'Code']]

Note that you can reorder any number of columns this way.

And save your data to a new txt file:

# new_txt_path is the path of the output file; it is assumed to be defined already
df.to_csv(new_txt_path, index=False, sep='\t')

The resulting data will look like this:

Number  E coordinates   N coordinates   Height  Code
1       444258.900      5111945.980     131.300 C
2       443665.554      5265566.655     110.311 BR 

Solution with no third-party libraries

rows_to_be_switched = ['N coordinates', 'E coordinates']

def flip(row_to_flip, ind1, ind2):
    # swap the items at positions ind1 and ind2 in place and return the list
    row_to_flip[ind1], row_to_flip[ind2] = row_to_flip[ind2], row_to_flip[ind1]
    return row_to_flip


with open('file1.txt', 'r') as file1, open('file2.txt', 'w') as file2:
    # strip the line breaks so that .index() can also match the last column title
    data_rows = [row.rstrip('\n') for row in file1.readlines()]
    titles_indices = [data_rows[0].split('\t').index(title) for title in rows_to_be_switched]

    for row in data_rows:
        # join the flipped items with tabs, then add the line break back
        file2.write('\t'.join(flip(row.split('\t'), *titles_indices)) + '\n')

Explanation:

  • rows_to_be_switched is a list of the two column titles that we swap.
  • The flip function swaps the items at two indices in a list and returns the list.
  • titles_indices = [data_rows[0].split('\t').index(title) for title in rows_to_be_switched] finds the indices of those titles in the header row, so we know which two items to swap in every row.
  • We then write each row by joining the items with a tab, since that was the delimiter.
  • Edit thanks to @tripleee: we need to strip the line breaks, because they are part of the string and .index() would otherwise not find the last column. We then add the \n back to the end of each line.

Input:

file1.txt

Number  N coordinates   E coordinates   Height  Code
1   5111945.980 444258.900  131.300 C
2   5265566.655 443665.554  110.311 BR

Output:

file2.txt

Number  E coordinates   N coordinates   Height  Code
1   444258.900  5111945.980 131.300 C
2   443665.554  5265566.655 110.311 BR
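
The same swap can also be sketched with the standard library's csv module, which handles the splitting and joining. This is only a minimal sketch, not part of the original answer: the helper name swap_columns is made up for illustration, and it assumes the file is tab-delimited and that the first row holds the column titles. In-memory text stands in for the two files here.

```python
import csv
import io


def swap_columns(src, dst, first, second, delimiter="\t"):
    """Copy delimited rows from src to dst, swapping the two named columns.

    swap_columns is a hypothetical helper; it assumes the first row is the header.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    header = next(reader)
    i, j = header.index(first), header.index(second)
    for row in [header, *reader]:
        if not row:
            continue  # skip blank lines
        row[i], row[j] = row[j], row[i]
        writer.writerow(row)


# Sample rows taken from the question, joined with tabs.
sample = (
    "Number\tN coordinates\tE coordinates\tHeight\tCode\n"
    "1\t5111945.980\t444258.900\t131.300\tC\n"
    "2\t5265566.655\t443665.554\t110.311\tBR\n"
)
out = io.StringIO()
swap_columns(io.StringIO(sample), out, "N coordinates", "E coordinates")
print(out.getvalue())
```

With real files, pass the objects returned by open('file1.txt', newline='') and open('file2.txt', 'w', newline='') instead of the StringIO objects.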
Answered By: no_hex

You can also use f-string formatting to swap the fields and print them in fixed-width columns. Note that this pads the columns with spaces instead of writing tabs:

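with open("sometext.txt", "r") as myfile:
    for line in myfile:
        # split on tabs and strip stray spaces and the newline from each field
        sl = [x.strip() for x in line.split('\t')]
        if len(sl) < 5:
            continue  # skip blank or incomplete lines
        # fields 1 and 2 trade places; the header titles are swapped the same way
        print(f'{sl[0]:8}{sl[2]:16}{sl[1]:14}{sl[3]:10}{sl[4]:4}')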
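
To make the width specifiers concrete, here is a small sketch of how a number after the colon pads a string field. The helper name format_row and the widths tuple are illustrative assumptions, chosen to mirror the widths used above. Strings are left-aligned by default, and a field longer than its width is not truncated.

```python
def format_row(fields, widths=(8, 16, 14, 10, 4)):
    """Pad each string field to its width and concatenate (hypothetical helper)."""
    return "".join(f"{field:{width}}" for field, width in zip(fields, widths))


# One data row from the question, with the two coordinates already swapped.
row = ["1", "444258.900", "5111945.980", "131.300", "C"]
line = format_row(row)
print(repr(line))
```

Because every field is padded with spaces, splitting the line on whitespace gives back the original values.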
Answered By: itprorh66