Remove duplicates and combine multiple lists into one?

Question:

How do I remove duplicates and combine multiple lists into one like so:

function([["hello","me.txt"],["good","me.txt"],["good","money.txt"], ["rep", "money.txt"]]) should return exactly:

[["good", ["me.txt", "money.txt"]], ["hello", ["me.txt"]], ["rep", ["money.txt"]]]
Asked By: user8618383

Answers:

Try this (no library needed):

your_input_data = [ ["hello","me.txt"], ["good","me.txt"], ["good","me.txt"], ["good","money.txt"], ["rep", "money.txt"] ]


my_dict = {}
for box in your_input_data:
    key, items = box[0], box[1:]
    # Merge the new items with any already stored for this key;
    # the set drops duplicates (but does not preserve order).
    my_dict[key] = list(set(my_dict.get(key, []) + items))

# Convert the dict back into a list of [key, values] pairs.
last_point = [[key, values] for key, values in my_dict.items()]

print(last_point)

Good Luck …

Answered By: DRPK

Create an empty array, push index 0 of each child array into it, then join the values into a single space-separated string.

var your_input_data = [ ["hello","hi", "jel"], ["good"], ["good2","lo"], ["good3","lt","ahhahah"], ["rep", "nice","gr8", "job"] ];

var myprint = []
for(var i in your_input_data){
   myprint.push(your_input_data[i][0]);
}
console.log(myprint.join(' '))
Answered By: user8556290

The easiest way would be to use defaultdict.

>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for i,j in l: 
        d[i].append(j)                   #append value to the key
>>> d
=> defaultdict(<class 'list'>, {'hello': ['me.txt'], 'good': ['me.txt', 'money.txt'], 
                                'rep': ['money.txt']})

    #to get it in a list
>>> out = [ [key,d[key]] for key in d]
>>> out
=> [['hello', ['me.txt']], ['good', ['me.txt', 'money.txt']], ['rep', ['money.txt']]]

#driver values :

IN : l = [["hello","me.txt"],["good","me.txt"],["good","money.txt"], ["rep", "money.txt"]]
Answered By: Kaushik NP

You can do it with traditional dictionaries too.

In [30]: l1 = [["hello","me.txt"],["good","me.txt"],["good","money.txt"], ["rep", "money.txt"]]

In [31]: d2 = {}
    ...: for i, j in l1:
    ...:     if i not in d2:
    ...:         d2[i] = [j]        # first value for this key: start a list
    ...:     else:
    ...:         d2[i].append(j)    # key already seen: append to its list
    ...:         

In [32]: d2
Out[32]: {'good': ['me.txt', 'money.txt'], 'hello': ['me.txt'], 'rep': ['money.txt']}

In [33]: out = [ [key,d2[key]] for key in d2]

In [34]: out
Out[34]: 
[['rep', ['money.txt']],
['hello', ['me.txt']],
['good', ['me.txt', 'money.txt']]]
Answered By: privatevoid

Let’s first understand the actual problem:

Example hint:

For these types of list problems there is a pattern.

Suppose you have a list:

a=[(2006,1),(2007,4),(2008,9),(2006,5)]

And you want to convert this to a dict, with the first element of each tuple as the key and the second element as the value, something like:

{2008: [9], 2006: [5], 2007: [4]}

But there is a catch: some tuples share a key but have different values, like (2006,1) and (2006,5). You want those values collected under a single key, so the expected output is:

{2008: [9], 2006: [1, 5], 2007: [4]}

For this type of problem, first create a new dict and then follow this pattern:

if item[0] not in new_dict:
    new_dict[item[0]]=[item[1]]
else:
    new_dict[item[0]].append(item[1])

So we first check whether the key is already in the new dict. If it is not, we create a list holding the value; if it is, we append the value to the existing list.

Full code:

a=[(2006,1),(2007,4),(2008,9),(2006,5)]

new_dict={}

for item in a:
    if item[0] not in new_dict:
        new_dict[item[0]]=[item[1]]
    else:
        new_dict[item[0]].append(item[1])

print(new_dict)
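
The check-then-append pattern above can also be written with dict.setdefault, which returns the list already stored for a key, or stores and returns a new empty one. A short sketch using the same example list (the function name group_pairs is an assumption for illustration, not part of the original answer):

```python
def group_pairs(pairs):
    """Group the second element of each pair under its first element."""
    grouped = {}
    for key, value in pairs:
        # setdefault returns the list already stored for key,
        # or inserts a new empty list and returns that.
        grouped.setdefault(key, []).append(value)
    return grouped


a = [(2006, 1), (2007, 4), (2008, 9), (2006, 5)]
print(group_pairs(a))
# {2006: [1, 5], 2007: [4], 2008: [9]}  (insertion order, Python 3.7+)
```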

Solution to your actual problem:

list_1=[["hello","me.txt"],["good","me.txt"],["good","money.txt"], ["rep", "money.txt"]]

no_duplicates={}

for item in list_1:
    if item[0] not in no_duplicates:
        # first time this key is seen: start its list of values
        no_duplicates[item[0]]=list(item[1:])
    else:
        # key already present: add the new values to its list
        no_duplicates[item[0]].extend(item[1:])

list_result=[]
for key,value in no_duplicates.items():
    list_result.append([key,value])
print(list_result)

output:

[['hello', ['me.txt']], ['rep', ['money.txt']], ['good', ['me.txt', 'money.txt']]]
Answered By: Aaditya Ura

yourList=[["hello","me.txt"],["good","me.txt"],["good","money.txt"], ["rep", "money.txt"]]
expectedList=[["good", ["me.txt", "money.txt"]], ["hello", ["me.txt"]], ["rep", ["money.txt"]]]

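def getall(allsec, listKey, uniqlist):
    # Return [key, all values for that key] the first time a key is seen, else None.
    if listKey not in uniqlist:
        uniqlist.append(listKey)
        return [listKey, [x[1] for x in allsec if x[0] == listKey]]

uniqlist=[]
# Drop the None results produced for repeated keys, then sort by key.
result=sorted(filter(lambda x: x is not None, [getall(yourList,elem[0],uniqlist) for elem in yourList]))
print(result)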

Hope this helps.

Using Python to create a function that gives you the exact required output can be done as follows:

from collections import defaultdict
    
def function(data):    
    entries = defaultdict(list)
    
    for k, v in data:
        entries[k].append(v)
        
    return sorted([k, v] for k, v in entries.items())

print(function([["hello","me.txt"],["good","me.txt"],["good","money.txt"], ["rep", "money.txt"]]))  

The output is sorted before being returned as per your requirement. This would display the return from the function as:

[['good', ['me.txt', 'money.txt']], ['hello', ['me.txt']], ['rep', ['money.txt']]]  

Sorting the [key, values] pairs orders the result by key. A dictionary handles the duplicate keys, since dictionary keys must be unique.
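
Note that grouping by key removes duplicate keys only: if the same pair occurs twice in the input, its value would appear twice in the grouped list. A minimal sketch, assuming repeated pairs should be collapsed while keeping first-seen order (the name combine_unique is hypothetical and not from the original answer):

```python
def combine_unique(data):
    """Group values by key, dropping repeated values per key.

    Hypothetical helper for illustration only.
    """
    entries = {}
    for key, value in data:
        # The inner dict acts as an ordered set
        # (dicts keep insertion order in Python 3.7+).
        entries.setdefault(key, {})[value] = None
    # Sort by key so the result matches the ordering asked for in the question.
    return sorted([key, list(values)] for key, values in entries.items())


pairs = [["hello", "me.txt"], ["good", "me.txt"], ["good", "me.txt"],
         ["good", "money.txt"], ["rep", "money.txt"]]
print(combine_unique(pairs))
# [['good', ['me.txt', 'money.txt']], ['hello', ['me.txt']], ['rep', ['money.txt']]]
```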

A defaultdict() is used to simplify the building of lists within the dictionary. The alternative would be to try and append a new value to an existing key, and if there is a KeyError exception, then add the new key instead as follows:

def function(data):    
    entries = {}
    
    for k, v in data:
        try:
            entries[k].append(v)
        except KeyError:
            entries[k] = [v]
        
    return sorted([k, v] for k, v in entries.items())
Answered By: Martin Evans

This can easily be solved using dict and sets.

def combine_duplicates(given_list):
    data = {}
    for element_1, element_2 in given_list:
        # set.add() returns None, so add in place instead of assigning its result
        data.setdefault(element_1, set()).add(element_2)
    return [[k, list(v)] for k, v in data.items()]
Answered By: N M