How to get a value by its key from a text file

Question:

I have a text file in the format below:

; Generated for TDD

[Document]
Mainline                = PQRS.N - Holdings Corp conference call, Jun. 22, 2006 / 10:00AM ET
DocDate                 = 20060622100000
CtbDocUID               = T234567
Action                  = ADD
CtbId                   = 4567
UserId                  = ftp_contribution
Password                = PASSWORD
Attachment.MainFile.CtbName     = T_1234.xml
SynopsisFile.CtbName            = T_1234.tdm
MainLanguage                = en
WorldReg[0].MxpCode         = NAM
Country[0].MxpCode          = USA
Currency[0].MxpCode         = USD
Distribution.GroupID[0]         = 3
Author[0].MxpCode           = 5GOW
[EndOfFile]

The left side is the key and the right side is the value.
We need a way to get a value based on its key.
For example, when we provide DocDate as the key, we should get 20060622100000 as the value.

I am not sure how to do this.
The spacing is not uniform or fixed; only the keys are always fixed.

Please suggest a way.
Java, Python, or even a regex is fine.

Asked By: Atharv Thakur

Answers:

This can be done like this:

import json

INPUT_FILE = 'test.txt'

data = {}
with open(INPUT_FILE) as f:
    for line in f:
        # Split on the first '=' only, so values that contain '=' stay intact
        key, sep, value = line.partition('=')
        if sep:  # skip lines without '=' (comments, section headers, blanks)
            data[key.strip()] = value.strip()

print(json.dumps(data, indent='  '))

Result:

{
  "Mainline": "PQRS.N - Holdings Corp conference call, Jun. 22, 2006 / 10:00AM ET",
  "DocDate": "20060622100000",
  "CtbDocUID": "T234567",
  "Action": "ADD",
  "CtbId": "4567",
  "UserId": "ftp_contribution",
  "Password": "PASSWORD",
  "Attachment.MainFile.CtbName": "T_1234.xml",
  "SynopsisFile.CtbName": "T_1234.tdm",
  "MainLanguage": "en",
  "WorldReg[0].MxpCode": "NAM",
  "Country[0].MxpCode": "USA",
  "Currency[0].MxpCode": "USD",
  "Distribution.GroupID[0]": "3",
  "Author[0].MxpCode": "5GOW"
}

Do you also need to infer data types?
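
If type inference is wanted, one possible approach is to try numeric conversions and fall back to the original string. This is only a sketch: `infer_type` is a hypothetical helper, and the `raw` dictionary stands in for the parsed result above. Whether a timestamp like DocDate should become an integer at all depends on how it is used later.

```python
def infer_type(value):
    """Return value as int or float when it parses cleanly, else the original string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


# A few entries taken from the parsed result (all values start as strings)
raw = {"DocDate": "20060622100000", "CtbId": "4567", "Action": "ADD", "MainLanguage": "en"}
typed = {key: infer_type(value) for key, value in raw.items()}
print(typed)
```

Note that `float` also accepts strings such as "nan", "inf" and "1e5", so a stricter check may be needed if values like that can appear in the file.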

Answered By: Michal Racko

Try this (Java):

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.Map;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;
    // Also needs JUnit's @Test and Hamcrest's assertThat / Is

    @Test
    public void t() throws Exception {
        Path path = Paths.get("/path/to/fileTest.txt");
        Map<String, String> result;
        // try-with-resources closes the stream even if collecting fails
        try (Stream<String> lines = Files.lines(path)) {
            result = lines
                    .filter(row -> row.contains("="))
                    .map(row -> row.split("=", 2))  // limit 2: split on the first '=' only
                    .collect(Collectors.toMap(
                            a -> a[0].trim(),  //key
                            a -> a[1].trim()   //value
                    ));
        }
        assertThat(result.get("DocDate"),  Is.is("20060622100000"));
        assertThat(result.get("Mainline"),  Is.is("PQRS.N - Holdings Corp conference call, Jun. 22, 2006 / 10:00AM ET"));
    }

Answered By: Zipora Tannboim

Please suggest a way(…)Python(..)fine

I suggest taking a look at configparser. Let the content of file.txt be

[Document]
Mainline                = PQRS.N - Holdings Corp conference call, Jun. 22, 2006 / 10:00AM ET
DocDate                 = 20060622100000
CtbDocUID               = T234567
Action                  = ADD
CtbId                   = 4567
UserId                  = ftp_contribution
Password                = PASSWORD
Attachment.MainFile.CtbName     = T_1234.xml
SynopsisFile.CtbName            = T_1234.tdm
MainLanguage                = en
WorldReg[0].MxpCode         = NAM
Country[0].MxpCode          = USA
Currency[0].MxpCode         = USD
Distribution.GroupID[0]         = 3
Author[0].MxpCode           = 5GOW
[EndOfFile]

then

import configparser
config = configparser.ConfigParser()
config.read("file.txt")
print(config["Document"]["DocDate"])

gives output

20060622100000

configparser is part of the standard library, so you do not have to install anything. If you want to know more, read the configparser documentation.
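
Two details may be worth knowing here. By default configparser lower-cases keys (lookups stay case-insensitive), and it treats `%` in values as interpolation syntax. The sketch below turns both off and shows a lookup with a fallback for missing keys. It parses a shortened copy of the file from a string (`SAMPLE` is just an illustrative name) so that it runs without any file on disk.

```python
import configparser

# Shortened copy of the file content, inlined so the example is self-contained
SAMPLE = """\
; Generated for TDD

[Document]
DocDate                 = 20060622100000
WorldReg[0].MxpCode         = NAM
[EndOfFile]
"""

# interpolation=None keeps any '%' in values literal
config = configparser.ConfigParser(interpolation=None)
config.optionxform = str  # keep key case as written; the default lower-cases keys
config.read_string(SAMPLE)

doc = config["Document"]
print(list(doc))                             # keys of the [Document] section
print(doc.get("DocDate"))                    # value for an existing key
print(doc.get("NoSuchKey", fallback="n/a"))  # fallback instead of KeyError
```

With `optionxform = str` lookups become case-sensitive, so this only makes sense if the keys are always written the same way, as the question states.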

Answered By: Daweo

Python: Build a dictionary so that you can look up any key. For example:

db = {}

with open('tdd.txt') as infile:
    for line in infile:
        # partition splits on the first '=' only, so values containing '=' are kept whole
        key, sep, value = line.partition('=')
        if sep:  # ignore lines that have no '='
            db[key.strip()] = value.strip()

print(db['DocDate'])

Output:

20060622100000
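
Since the question also mentions regex, here is a rough sketch of looking up a single key without building a dictionary first. `get_value` and `SAMPLE` are hypothetical names used only for illustration, and the text is inlined instead of being read from tdd.txt.

```python
import re

# Shortened copy of the file content, inlined so the example is self-contained
SAMPLE = """\
[Document]
Mainline                = PQRS.N - Holdings Corp conference call, Jun. 22, 2006 / 10:00AM ET
DocDate                 = 20060622100000
WorldReg[0].MxpCode         = NAM
[EndOfFile]
"""


def get_value(text, key):
    """Return the value for key, or None if the key is absent (hypothetical helper)."""
    # re.escape makes '[', ']' and '.' in keys such as WorldReg[0].MxpCode literal
    pattern = rf"^[ \t]*{re.escape(key)}[ \t]*=[ \t]*(.*?)[ \t]*$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1) if match else None


print(get_value(SAMPLE, "DocDate"))
print(get_value(SAMPLE, "WorldReg[0].MxpCode"))
print(get_value(SAMPLE, "Doc"))  # a partial key name does not match
```

For many lookups on the same file the dictionary is likely the better choice, because the regex scans the text again on every call.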

Answered By: Vlad