Function to find top list of items for a given list in a JSON input

Question:

I have a DataFrame like this:

| json_col                                           |
| ---------------------------------------------------|
| {"category":"a","items":["a","b","c","d","e","f"]} |
| {"category":"b","items":["u","v","w","x","y"]}     |
| {"category":"c","items":["p","q"]}                 |
| {"category":"d","items":["m"]}                     |

I joined the rows into a single comma-separated string of dicts:

x = pd.Series(', '.join(df_list['json_col'].to_list()), name='text')

The result looks like this:

'{"category":"a","items":["a","b","c","d","e","f"]},
{"category":"b","items":["u","v","w","x","y"]},
{"category":"c","items":["p","q"]},
{"category":"d","items":["m"]}'

(EDIT: This was my original input when I posted the question, but it has been pointed out to me that this is not the right way to use JSON, so I am providing the DataFrame above.)

I need to write a Python function that takes an item as input and returns the top 3 items from the list it belongs to (excluding itself). The items are ordered by priority, so the top 3 are the first 3 in the list.

def item_list(above_json_input, item = "a"):
    return list

For example, the result should follow these rules:

  1. If the item is "a", look in category a, where item a is present, and return the top 3 items in sequence: ["b","c","d"]
  2. If the item is "w", look in category b, where item w is present, and return ["u","v","x"]
  3. If the item is "q", look in category c, where item q is present, and return ["p"], because there are fewer than 3 items other than q
  4. If the item is "m", look in category d, where item m is present, and return an empty list [], because there are no other items in that list.

The same goes for an item that doesn't exist, such as item = "r", which is not in any category: we can raise an error or return an empty list.

I am not sure how to read the JSON and get the list of top items. Is this even possible?

Asked By: trojan horse


Answers:

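I fixed your JSON, as it was badly formatted. The code below matches the input against both the category names and the items, so for input "c" it prints ['a', 'b', 'd'] (from the list that contains item c) and ['p', 'q'] (from category c):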

import json

# The rows from the question, wrapped in a single valid JSON document
data_string = """{
    "data" : [
        {"category":"a","items":["a","b","c","d","e","f"]},
        {"category":"b","items":["u","v","w","x","y"]},
        {"category":"c","items":["p","q"]},
        {"category":"d","items":["m"]}
    ]
}"""

data = json.loads(data_string)["data"]

user_input = input("Pick a letter: ")

found = False
for values in data:
    # Match on either the category name or any of its items
    if user_input in (values["category"], *values["items"]):
        found = True
        # Drop the input itself, then keep at most the first three
        temp = [item for item in values["items"] if item != user_input]
        print(temp[:3])

if not found:
    print([])
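
The loop above reads its input interactively and also matches on the category name. A function closer to the signature in the question might look like the sketch below. It assumes that only the `items` lists are searched, that the first list containing the item wins, and that an unknown item gives an empty list. The names `top_items`, `DATA_STRING` and the parameter `n` are illustrative and not part of the original post.

```python
import json

# Same data as above, wrapped in one valid JSON document
DATA_STRING = """{
    "data": [
        {"category": "a", "items": ["a", "b", "c", "d", "e", "f"]},
        {"category": "b", "items": ["u", "v", "w", "x", "y"]},
        {"category": "c", "items": ["p", "q"]},
        {"category": "d", "items": ["m"]}
    ]
}"""


def top_items(data_string, item="a", n=3):
    """Return up to n other items from the first list that contains item."""
    data = json.loads(data_string)["data"]
    for entry in data:
        items = entry["items"]
        if item in items:
            # Keep the original priority order, skipping the item itself
            return [other for other in items if other != item][:n]
    # Assumption: an item found in no category yields an empty list
    return []


print(top_items(DATA_STRING, "a"))  # ['b', 'c', 'd']
print(top_items(DATA_STRING, "w"))  # ['u', 'v', 'x']
print(top_items(DATA_STRING, "q"))  # ['p']
print(top_items(DATA_STRING, "m"))  # []
print(top_items(DATA_STRING, "r"))  # []
```

Raising a `ValueError` instead of returning `[]` for an unknown item would also fit the question, which allows either behaviour.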
Answered By: Jonathan Ciapetti

You could try this on your dataframe:

import pandas as pd

# Placeholder row; swap in your own DataFrame whose cells are dicts with an 'items' list
df = pd.DataFrame({'jsonCol':[{"g":[]}]})
h = df['jsonCol']


def search(inm):
    for item in h:
        items = item.get('items', [])  # tolerate rows without an 'items' key
        if inm in items:
            # Build a new list instead of popping, so the data is not mutated
            others = [i for i in items if i != inm]
            # Slicing covers every length, including exactly 3 items
            return others[:3]
    return []

print(search('r'))  # []

edit:

h = [
    {"category":"a","items":["a","b","c","d","e","f"]},
    {"category":"b","items":["u","v","w","x","y"]},
    {"category":"c","items":["p","q"]},
    {"category":"d","items":["m"]},
]

def search(inm):
    for item in h:
        if inm in item['items']:
            # Build a new list rather than popping, so h is left unchanged
            others = [i for i in item['items'] if i != inm]
            # Slicing handles every length, including exactly 3 items
            return others[:3]
    return []

print(search('b'))  # answer ['a', 'c', 'd']
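
The column in the question holds JSON strings rather than dicts, so each row would need to be parsed before the lookup. Here is a rough sketch of that, assuming every cell is one valid JSON object with an `items` list. `search_rows` and `rows` are made-up names; the plain list stands in for `df['json_col']`, which could be passed in the same way because iterating over a pandas Series yields its values.

```python
import json

# Stand-in for df['json_col']: one JSON string per row
rows = [
    '{"category":"a","items":["a","b","c","d","e","f"]}',
    '{"category":"b","items":["u","v","w","x","y"]}',
    '{"category":"c","items":["p","q"]}',
    '{"category":"d","items":["m"]}',
]


def search_rows(json_rows, inm, n=3):
    """Parse each row and return up to n neighbours of inm, in list order."""
    for raw in json_rows:
        items = json.loads(raw)["items"]
        if inm in items:
            pos = items.index(inm)
            # Join the parts before and after inm, then keep the first n
            return (items[:pos] + items[pos + 1:])[:n]
    # Assumption: an unknown item gives an empty list
    return []


print(search_rows(rows, "b"))  # ['a', 'c', 'd']
print(search_rows(rows, "q"))  # ['p']
print(search_rows(rows, "r"))  # []
```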
Answered By: omar