Find a value within a nested JSON dictionary in Python

Question:

From the following json, in python, I’d like to extract the value “TEXT”. All the keys are constant except for unknown. Unknown could be any string like “a6784t66” or “hobvp*nfe”. The value of unknown is not known, only that it will be in that position in each json response.

{
  "A": {
    "B": {
      "unknown": {
        "1": "F",
        "maindata": [
          {
            "Info": "TEXT"
          }
        ]
      }
    }
  }
}

one line json

'{"A":{"B":{"unknown":{"1":"F","maindata":[{"Info":"TEXT"}]}}}}'

How would you get the value of “TEXT”? I know how to load the json with json.loads, but I’m not sure how to get the value of “TEXT”. Thanks.

(I’m not sure what the best title is.)

Asked By: user1959942


Answers:

It is a bit lengthy, but for the example above:

In [1]: import json

In [2]: s = """
   ...: {
   ...:   "A": {
   ...:     "B": {
   ...:       "unknown": {
   ...:         "1": "F",
   ...:         "maindata": [
   ...:           {
   ...:             "Info": "TEXT"
   ...:           }
   ...:         ]
   ...:       }
   ...:     }
   ...:   }
   ...: }"""

In [3]: data = json.loads(s)

In [4]: data['A']['B']['unknown']['maindata'][0]['Info']
Out[4]: u'TEXT'

You basically treat it as a dictionary, passing the keys to get the values of each nested dictionary. The only different part is when you hit maindata, where the resulting value is a list. In order to handle that, we pull the first element [0] and then access the Info key to get the value TEXT.

In the case of unknown changing, you would replace it with a variable that represents the ‘known’ name it will take at that point in your code:

my_variable = 'some_name'
data['A']['B'][my_variable]['maindata'][0]['Info']

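And if I had actually read your question properly the first time: if you don’t know what unknown is at any point, you can do something like this:

list(data['A']['B'].values())[0]['maindata'][0]['Info']

Here values() returns the values of the dictionary under B. Wrapping the call in list() makes the result indexable on Python 3, where values() returns a view rather than a list. In this example the resulting list is (shown as Python 2 prints it):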

[{u'1': u'F', u'maindata': [{u'Info': u'TEXT'}]}]

A single-item list that can be accessed with [0] and then you can proceed as above. Note that this is dependent on there only being one item present in that dictionary – you would need to adjust a bit if there were more.
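
If more than one key could appear under B, one possible adjustment is to loop over the entries instead of taking the first one. This is only a minimal sketch: it assumes every entry under B has the same maindata/Info layout as in the question, and the second key with its "OTHER" value is made up for illustration.

```python
import json

# Hypothetical response with two unknown keys under "B" (the second entry is invented)
s = ('{"A":{"B":{"a6784t66":{"1":"F","maindata":[{"Info":"TEXT"}]},'
     '"hobvp*nfe":{"1":"G","maindata":[{"Info":"OTHER"}]}}}}')
data = json.loads(s)

# Map each unknown key to the Info value of its first maindata element
infos = {key: entry['maindata'][0]['Info'] for key, entry in data['A']['B'].items()}
print(infos)

# With exactly one key, next(iter(...)) fetches the single value without building a list
single = json.loads('{"A":{"B":{"unknown":{"1":"F","maindata":[{"Info":"TEXT"}]}}}}')
only_entry = next(iter(single['A']['B'].values()))
print(only_entry['maindata'][0]['Info'])
```

The dictionary comprehension keeps the unknown keys alongside their values, which may help if you later need to know which key a given value came from.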

Answered By: RocketDonkey

Since you said that unknown is always in a fixed position, you can do the following:

import json
s = json.loads('{"A":{"B":{"unknown":{"1":"F","maindata":[{"Info":"TEXT"}]}}}}')
keys = list(s["A"]["B"].keys())   # list() is needed on Python 3, where keys() returns a view
x = keys[0]   # stores the unknown key in x, whatever it is
print(s['A']['B'][x]['maindata'][0]['Info'])   # x is used as the index after B, since it holds the unknown key

This should do the job, since only the unknown key is really ‘unknown’.

Answered By: minocha

You can use a recursive function to dig through every layer of nested dictionaries and print each key with an indent:

import json

def recurse_keys(node, indent='  '):
    '''Print every key of a nested dict, indenting deeper levels further.

    node is the decoded JSON, e.g. from json.loads(s) or, with requests,
    r = requests.post(...) followed by r.json().
    '''
    for key, value in node.items():
        print(indent + str(key))
        if isinstance(value, dict):  # lists such as maindata are not descended into
            recurse_keys(value, indent + '   ')

data = json.loads('{"A":{"B":{"unknown":{"1":"F","maindata":[{"Info":"TEXT"}]}}}}')
recurse_keys(data)

Answered By: nagordon
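
The same recursive idea can be turned into a search that returns the value rather than printing keys. The sketch below is one possible variant, not something taken from the answers above: find_key is a hypothetical helper name, and it assumes you know the name of the final key ("Info") but not the path leading to it. Unlike the key printer, it also descends into lists, which is needed here because maindata is a list.

```python
import json

def find_key(node, target):
    """Yield every value stored under `target`, searching dicts and lists recursively."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == target:
                yield value
            # Keep searching below this key in case of deeper matches
            yield from find_key(value, target)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, target)

data = json.loads('{"A":{"B":{"unknown":{"1":"F","maindata":[{"Info":"TEXT"}]}}}}')
matches = list(find_key(data, 'Info'))
print(matches)
```

Because it is a generator, you can take just the first match with next(find_key(data, 'Info'), None) if you only expect one.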