How to split and read raw data into different numpy arrays based on a delimiter

Question:

I have raw data in the following form:

#######
#######
#col1 #col2 #col3
1       10    100
2       11    150
3       14    155
#######
#######
#######
#######
#col1 #col2 #col3
1       14    100
2       17    180
3       14    155
#######
#######
#######
#######
#col1 #col2 #col3
1       19    156
2       27    130
3       24    152
#######
#######

I want to load this data into NumPy arrays. When I load it with numpy.loadtxt, all of the data ends up in a single array. Is there an easy way to split this data into separate chunks based on the ####### lines?
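
For reference, this is roughly what I am doing at the moment (the filename is just a placeholder):

import numpy as np

# The "#" lines are skipped as comments by default, so all nine rows
# end up in one (9, 3) array instead of three separate arrays.
data = np.loadtxt("data_file.txt")
print(data.shape)  # (9, 3)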

Asked By: brownfox


Answers:

A simple way to do it would be to read the file, split the resulting string at the separators, clean out the remaining unnecessary lines, and pass each list of strings to numpy.loadtxt. (As explained in the documentation, a list of strings passed to numpy.loadtxt is treated as a sequence of lines.)

import numpy as np
from typing import List

filename: str = "data_file.txt"  # Put your filename here instead

with open(filename, "r", encoding="utf-8") as file:
    content: str = file.read()

# Blocks are separated by four consecutive "#######" lines,
# so splitting on that string yields one chunk per block.
datas: List[str] = content.split(4 * "#######\n")

arrays: List[np.ndarray] = []
for data in datas:
    # Drop any leftover "#######" lines and split the chunk into
    # individual lines for numpy.loadtxt.
    data_list: List[str] = data.replace("#######\n", "").split("\n")
    arrays.append(np.loadtxt(data_list))
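
With the sample data from the question, this should give three (3, 3) arrays; a quick sanity check could look like:

for i, arr in enumerate(arrays):
    print(f"block {i}: shape {arr.shape}")
# Expected output:
# block 0: shape (3, 3)
# block 1: shape (3, 3)
# block 2: shape (3, 3)
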
Answered By: GregoirePelegrin