Remove entire sublists based on a single item in the list using regex

Question:

I have several lists of lists (obtained from a for loop). Each one looks like the following:

[['8.761,00', '67.512,00', '0,00', '', ''], [None, '', '', '', ''], ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito'], ['Soggettivo 15%', '1.314,15', '1.314,15', '0,00', '0,00'], ['Integrativo 4%', '2.700,48', '2.700,48', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '113,20', '113,20', '0,00', '0,00'], ['Totale Contributi', '4.127,83', '4.127,83', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.127,83', '4.127,83', '0,00', '0,00'], ['', '', '', '', ''], ['Anno 2016 Reddito Prof.le', 'Vol. Affari 4% Vo', 'l. Affari 2%', '', ''], ['9.263,00', '85.149,00', '0,00', '', ''], [None, '', '', '', ''], ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito'], ['Soggettivo 16%', '1.482,08', '1.482,08', '0,00', '0,00'], ['Integrativo 4%', '3.405,96', '3.405,96', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '110,29', '110,29', '0,00', '0,00'], ['Totale Contributi', '4.998,33', '4.998,33', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.998,33', '4.998,33', '0,00', '0,00'], ['', '', '', '', '']]

I want to keep only the sublists that contain the words "Soggettivo", "Integrativo", "Maternità", "Interessi", "Sanzioni", "Totale", "Totale Contributi", and remove all the other sublists entirely.

The positions of the sublists can differ and change, so I don't think I can remove them by index.
How can I do that? I tried using a regex, but it doesn't work.

To be clearer, I want to delete lists like:

['Anno 2016 Reddito Prof.le', 'Vol. Affari 4% Vo', 'l. Affari 2%', '', '']

['9.263,00', '85.149,00', '0,00', '', '']

['', '', '', '', '']

And keep only:

[['Soggettivo 15%', '1.314,15', '1.314,15', '0,00', '0,00'], ['Integrativo 4%', '2.700,48', '2.700,48', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '113,20', '113,20', '0,00', '0,00'], ['Totale Contributi', '4.127,83', '4.127,83', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.127,83', '4.127,83', '0,00', '0,00'], ['Soggettivo 16%', '1.482,08', '1.482,08', '0,00', '0,00'], ['Integrativo 4%', '3.405,96', '3.405,96', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '110,29', '110,29', '0,00', '0,00'], ['Totale Contributi', '4.998,33', '4.998,33', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.998,33', '4.998,33', '0,00', '0,00'], ]
Asked By: deidei

Answers:

You can try:

>>> a = [['8.761,00', '67.512,00', '0,00', '', ''], [None, '', '', '', ''], ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito'], ['Soggettivo 15%', '1.314,15', '1.314,15', '0,00', '0,00'], ['Integrativo 4%', '2.700,48', '2.700,48', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '113,20', '113,20', '0,00', '0,00'], ['Totale Contributi', '4.127,83', '4.127,83', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.127,83', '4.127,83', '0,00', '0,00'], ['', '', '', '', ''], ['Anno 2016 Reddito Prof.le', 'Vol. Affari 4% Vo', 'l. Affari 2%', '', ''], ['9.263,00', '85.149,00', '0,00', '', ''], [None, '', '', '', ''], ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito'], ['Soggettivo 16%', '1.482,08', '1.482,08', '0,00', '0,00'], ['Integrativo 4%', '3.405,96', '3.405,96', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '110,29', '110,29', '0,00', '0,00'], ['Totale Contributi', '4.998,33', '4.998,33', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.998,33', '4.998,33', '0,00', '0,00'], ['', '', '', '', '']]
>>> output = []
>>> for data in a:
...   # skip the sublists we know we do not want
...   if data in (['Anno 2016 Reddito Prof.le', 'Vol. Affari 4% Vo', 'l. Affari 2%', '', ''], ['9.263,00', '85.149,00', '0,00', '', ''], ['', '', '', '', ''], ['8.761,00', '67.512,00', '0,00', '', ''], [None, '', '', '', ''], ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito']):
...     continue
...   output.append(data)
...
>>> output
[['Soggettivo 15%', '1.314,15', '1.314,15', '0,00', '0,00'], ['Integrativo 4%', '2.700,48', '2.700,48', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '113,20', '113,20', '0,00', '0,00'], ['Totale Contributi', '4.127,83', '4.127,83', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.127,83', '4.127,83', '0,00', '0,00'], ['Soggettivo 16%', '1.482,08', '1.482,08', '0,00', '0,00'], ['Integrativo 4%', '3.405,96', '3.405,96', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '110,29', '110,29', '0,00', '0,00'], ['Totale Contributi', '4.998,33', '4.998,33', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.998,33', '4.998,33', '0,00', '0,00']]
Answered By: Harsha Biyani
original_list = [['8.761,00', '67.512,00', '0,00', '', ''], [None, '', '', '', ''], ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito'], ['Soggettivo 15%', '1.314,15', '1.314,15', '0,00', '0,00'], ['Integrativo 4%', '2.700,48', '2.700,48', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '113,20', '113,20', '0,00', '0,00'], ['Totale Contributi', '4.127,83', '4.127,83', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.127,83', '4.127,83', '0,00', '0,00'], ['', '', '', '', ''], ['Anno 2016 Reddito Prof.le', 'Vol. Affari 4% Vo', 'l. Affari 2%', '', ''], ['9.263,00', '85.149,00', '0,00', '', ''], [None, '', '', '', ''], ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito'], ['Soggettivo 16%', '1.482,08', '1.482,08', '0,00', '0,00'], ['Integrativo 4%', '3.405,96', '3.405,96', '0,00', '0,00'], ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00'], ['Maternità', '110,29', '110,29', '0,00', '0,00'], ['Totale Contributi', '4.998,33', '4.998,33', '0,00', '0,00'], ['Interessi', '0,00', '0,00', '0,00', '0,00'], ['Sanzioni', '0,00', '0,00', '0,00', '0,00'], ['Totale', '4.998,33', '4.998,33', '0,00', '0,00'], ['', '', '', '', '']]

words_to_find = ["Soggettivo", "Integrativo", "Maternità", "Interessi", "Sanzioni", "Totale", "Totale Contributi"]

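def list_contains_values(sublist, words_to_find):
    # True if any string element of the sublist contains one of the words
    # as a substring (an exact match would miss e.g. 'Soggettivo 15%').
    for element in sublist:
        # skip non-string elements such as None
        if isinstance(element, str) and any(word in element for word in words_to_find):
            return True
    return False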


b = list(
        filter(
            lambda li: list_contains_values(li, words_to_find), original_list 
        )
    )
print(b)
Answered By: Michael Matsaev

This approach keeps only the sublists in which at least one element contains one of your search words as a substring (lists is your original list of lists):

mandatory_words = ["Soggettivo", "Integrativo", "Maternità", "Interessi", "Sanzioni", "Totale", "Totale Contributi"]
filtered_lists = [l for l in lists
                  if l is not None
                  and any(mandatory_word in e
                          for e in l if isinstance(e, str)
                          for mandatory_word in mandatory_words)]

print(filtered_lists[:3])

[['Soggettivo 15%', '1.314,15', '1.314,15', '0,00', '0,00'],
 ['Integrativo 4%', '2.700,48', '2.700,48', '0,00', '0,00'],
 ['Integrativo 2%', '0,00', '0,00', '0,00', '0,00']]
Answered By: Erik Fubel
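
Since the question asks about a regex, the same filtering can also be sketched with the re module. This is only a minimal sketch, not taken from any of the answers above: the names KEEP_PATTERN and keep_contribution_rows are hypothetical, and it assumes the label you care about is always the first element of each sublist, as in the sample data.

```python
import re

# Hypothetical names, not from the thread. Assumes the label is always
# the first element of a sublist, as in the sample data.
KEEP_PATTERN = re.compile(
    r"(Soggettivo|Integrativo|Maternità|Interessi|Sanzioni|Totale)\b"
)

def keep_contribution_rows(rows):
    """Return only the rows whose first cell starts with one of the labels."""
    return [
        row for row in rows
        # row[0] can be None or '', so check the type before matching
        if row and isinstance(row[0], str) and KEEP_PATTERN.match(row[0])
    ]

# A shortened sample taken from the question's data
sample = [
    ['9.263,00', '85.149,00', '0,00', '', ''],
    [None, '', '', '', ''],
    ['', 'Dovuto', 'Pagato', 'Rimborsato', 'Debito/Credito'],
    ['Soggettivo 16%', '1.482,08', '1.482,08', '0,00', '0,00'],
    ['Maternità', '110,29', '110,29', '0,00', '0,00'],
    ['Totale Contributi', '4.998,33', '4.998,33', '0,00', '0,00'],
    ['', '', '', '', ''],
]

result = keep_contribution_rows(sample)
print(result)
```

Because re.match only matches at the start of the string, "Totale" also covers "Totale Contributi", and a label that merely appears in the middle of some other cell would not cause a row to be kept. If you have several such lists of lists from your loop, you could call keep_contribution_rows on each one in turn.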