Restructure text file details in the required output format using Python

Question:

I am working on a text file and need to restructure it into the required output format using Python.

Input file:

[ADD BD]
'text1'

[ADD BD]
'text2' 

[ADD BD]
'text3' 

[ADD TLD]
'text4'

[ADD BRD]
'text5'

[ADD BRD]
'text6'

I want all the text entries that share a title grouped under a single title line of the form [ADD xxx].

Required output format:

[ADD BD]
'text1'
'text2'
'text3'

[ADD TLD]
'text4'

[ADD BRD]
'text5'
'text6'

I believe this can be done using regex. I have been trying, but I cannot get rid of the empty entries created in the list, and I cannot figure out how to add the title string back. Here is my progress:

import re

test0 = """
[ADD BD]
'text1'

[ADD BD]
'text2' 

[ADD BD]
'text3' 

[ADD TLD]
'text4'

[ADD BRD]
'text5'

[ADD BRD]
'text6
"""
#test1 = re.sub("[ADD BD]", "", test0)
#print(test1)

#print the lines below the line [ADD BD]
test2 = re.sub("[ADD TLD]", "", test0)
print(test2)
# list of test2 with name list_ADD_BD
list_ADD_BD = test2.split("\n")
print(list_ADD_BD)

#remove empty lines from list_ADD_BD
list_ADD_BD = [x for x in list_ADD_BD if x]
print(list_ADD_BD)

How can I obtain the required output?

Asked By: Y. Pat

Answers:

If all your tags start with [ and all texts start with ', you only need to check the first character of each line and store the results in a dictionary, grouped by tag.

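filename = 'input.txt'  # placeholder path; point this at your input file

data = {}
curr_tag = ''
with open(filename, 'r') as f:
    for line in f:
        if line.startswith('['):
            # A tag line starts a new group or resumes an existing one
            curr_tag = line.strip()
            continue
        if line.startswith("'"):
            # A quoted line belongs to the most recent tag
            if curr_tag not in data:
                data[curr_tag] = []
            data[curr_tag].append(line.strip())

# Print each tag once, followed by all of its texts
for key in data:
    print(key)
    for text in data[key]:
        print(text)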
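
The same idea can be sketched for the case where the data is already in a string (like test0 in the question) and the groups should be separated by a blank line, as in the required output. The function name group_by_tag and the shortened sample are illustrative assumptions, not part of the original code:

```python
def group_by_tag(text):
    """Group quoted lines under the most recent [TAG] line, in first-seen tag order."""
    groups = {}  # plain dicts keep insertion order in Python 3.7+
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Tag line: start a new group or resume an existing one
            current = line
            groups.setdefault(current, [])
        elif line.startswith("'") and current is not None:
            # Quoted line: attach it to the current tag
            groups[current].append(line)
    # One block per tag, blocks separated by a blank line
    return "\n\n".join("\n".join([tag, *items]) for tag, items in groups.items())


# Shortened version of the input from the question
sample = """
[ADD BD]
'text1'

[ADD BD]
'text2'

[ADD TLD]
'text4'

[ADD BRD]
'text5'
"""
print(group_by_tag(sample))
```

Blank lines are skipped because they start with neither [ nor ', so no separate step is needed to filter out empty entries.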
Answered By: K.Mat

If test0 holds your string, you can use defaultdict, which collects multiple values per key in a list.

    from collections import defaultdict

    test_lst = test0.split("\n")
    print(test_lst)
    d = defaultdict(list)
    # Assumes each [ADD ...] line is followed directly by its text line
    for i in range(len(test_lst) - 1):
        if "[" in test_lst[i]:
            d[test_lst[i]].append(test_lst[i + 1])

    # Join each tag with its texts; a blank line separates the groups
    res = ""
    for key, value in d.items():
        res += key + "\n" + "\n".join(value) + "\n\n"

    with open("myfile.txt", "w") as f:
        f.write(res)

Output:

[ADD BD]
'text1'
'text2' 
'text3' 

[ADD TLD]
'text4'

[ADD BRD]
'text5'
'text6
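
Since the question mentions regex, here is a rough sketch of a regex-based variant. It also shows why the original re.sub call misbehaves: an unescaped [ADD TLD] is a character class, not a literal string. The names PAIR and regroup are illustrative, and the sketch assumes, like the code above, that every tag line is immediately followed by exactly one quoted line:

```python
import re
from collections import defaultdict

# Unescaped, "[ADD TLD]" is a character class matching any single one of
# the characters A, D, space, T, L, so only the brackets survive:
print(re.sub("[ADD TLD]", "", "[ADD TLD]"))  # → []

# Escaped brackets match the tag literally. Group 1 is the tag line,
# group 2 is the quoted line right below it (trailing blanks dropped).
PAIR = re.compile(r"^(\[ADD [^\]]+\])[ \t]*\n[ \t]*('.*?)[ \t]*$", re.MULTILINE)


def regroup(text):
    """Collect (tag, text) pairs and emit one block per tag."""
    grouped = defaultdict(list)  # keeps first-seen tag order
    for tag, item in PAIR.findall(text):
        grouped[tag].append(item)
    return "\n\n".join(tag + "\n" + "\n".join(items) for tag, items in grouped.items())


demo = "[ADD BD]\n'text1'\n\n[ADD BD]\n'text2' \n\n[ADD TLD]\n'text4'\n"
print(regroup(demo))
```

For input where a tag can be followed by several quoted lines, a line-by-line loop is likely the simpler choice than extending the pattern.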
Answered By: Manjari