Try to find a sublist that doesn't occur in the range of ANY of the sublists in another list

Question:

enhancerlist=[[5,8],[10,11]]
TFlist=[[6,7],[24,56]]

I have two lists of lists. I am trying to isolate the sublists in my 'TFlist' that don't fit in the range of ANY of the sublists of enhancerlist (by "fit in the range" I mean that the TFlist sublist's range lies inside the enhancerlist sublist's range).
So, for example, TFlist[1] does not occur in the range of any sublist in enhancerlist (whereas TFlist[0], which is [6,7], fits inside the range of [5,8]), so I want this as output:

TF_notinrange=[24,56]

The problem with a nested loop like this:

while TFlist:
   TF=TFlist.pop()
   for j in enhancerlist: 
       if ((TF[0]>= j[0]) and (TF[1]<= j[1])):
           continue
           
       else: 
           TF_notinrange.append(TF)
 

is that I get this as output:
[[24, 56], [3, 4]]

The if statement checks one sublist in enhancerlist at a time, so it appends TF even if a later sublist does contain it.
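A minimal sketch that replays this behaviour on the two lists shown at the top of the question. The function name `buggy_not_in_range` and the copied `pending` list are assumptions added for illustration; the loop body mirrors the one above. The exact output depends on the input lists, but the pattern is the same: a TF is appended once for every enhancer that does not contain it.

```python
# Replays the nested loop on the two lists from the top of the question.
enhancerlist = [[5, 8], [10, 11]]
TFlist = [[6, 7], [24, 56]]


def buggy_not_in_range(tf_list, enhancers):
    """Append a TF once per enhancer that does NOT contain it (the bug)."""
    pending = list(tf_list)       # copy, so the caller's list is not emptied
    result = []
    while pending:
        tf = pending.pop()
        for start, end in enhancers:
            if tf[0] >= start and tf[1] <= end:
                continue          # contained here, but the loop keeps going
            else:
                result.append(tf) # appended for each non-containing enhancer
    return result


print(buggy_not_in_range(TFlist, enhancerlist))
# [[24, 56], [24, 56], [6, 7]]
```

Here [24, 56] is appended twice (once per enhancer), and [6, 7] is appended because [10, 11] does not contain it, even though [5, 8] does.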

Could I somehow do a while loop with the condition? It seems like I would still have the issue of a nested loop appending things incorrectly.

Asked By: Jillian Ness


Answers:

Alternative

Use a list comprehension:

TF_notinrange = [tf for tf in TFlist 
                 if not any(istart <= tf[0] <= tf[1] <= iend 
                            for istart, iend in enhancerlist)]
print(TF_notinrange)
# Output: [[24, 56]]

Explanation

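Keep each range in TFlist that is not contained in any range of enhancerlist. any(...) is true as soon as one enhancer range satisfies istart <= tf[0] <= tf[1] <= iend, and not any(...) inverts that.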
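The same idea can be spelled out with a named containment test. This is only a sketch: the helper names `is_contained` and `not_in_any` are hypothetical, not part of the answer above, and the bounds are treated as inclusive, as in the comprehension.

```python
def is_contained(tf, enhancer):
    """True if the tf range lies inside the enhancer range (bounds inclusive)."""
    start, end = enhancer
    return start <= tf[0] <= tf[1] <= end


def not_in_any(tf_list, enhancers):
    """Keep the tf ranges that no enhancer range contains."""
    return [tf for tf in tf_list
            if not any(is_contained(tf, enhancer) for enhancer in enhancers)]


enhancerlist = [[5, 8], [10, 11]]
TFlist = [[6, 7], [24, 56]]
print(not_in_any(TFlist, enhancerlist))
# [[24, 56]]
```

Because any() stops at the first enhancer that contains the range, each TF is decided once, after all enhancers have had a chance to match.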

Additional constraints

Add further constraints by writing a multiline conditional (TFlist[0]==enhancerlist[0] below stands in for any extra constraint), using either of the following:

  • A backslash at the end of a line for line continuation (optional inside the brackets of a comprehension, where lines already continue implicitly)
TF_notinrange = [tf for tf in TFlist
                 if TFlist[0] == enhancerlist[0] and \
                    not any(istart <= tf[0] <= tf[1] <= iend
                            for istart, iend in enhancerlist)]
  • Enclosing parentheses, which also allow line continuation
TF_notinrange = [tf for tf in TFlist
                 if (TFlist[0] == enhancerlist[0] and
                     not any(istart <= tf[0] <= tf[1] <= iend
                             for istart, iend in enhancerlist))]

For Loop vs. List Comprehension

As ddejohn commented, list comprehensions suit simple code. For complex constraints, a for loop is preferable for readability.

TF_notinrange = []
for tf in TFlist:
    if (TFlist[0] == enhancerlist[0] and              # place multiple constraints here
        not any(istart <= tf[0] <= tf[1] <= iend
                for istart, iend in enhancerlist)):
        TF_notinrange.append(tf)
Answered By: DarrylG

You can achieve this with chained comparisons and the less common for-else block, where the else clause runs only if the for loop finished without hitting a break:

non_overlapping = []

for tf_a, tf_b in TFlist:
    for enhancer_a, enhancer_b in enhancerlist:
        if enhancer_a <= tf_a < tf_b <= enhancer_b:
            break
    else:
        non_overlapping.append([tf_a, tf_b])

Note that this assumes that all pairs are already sorted and that no pair comprises a range of length zero (e.g., (2, 2)).
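If those assumptions do not hold for your data, one way to relax them is sketched below. The function name `non_overlapping_relaxed` is hypothetical; it sorts each pair before comparing and uses `<=` throughout, so a zero-length range such as [6, 6] counts as contained. Whether that is the behaviour you want depends on your data.

```python
def non_overlapping_relaxed(tf_list, enhancers):
    """Like the loop above, but tolerates unsorted pairs and zero-length ranges."""
    result = []
    for tf in tf_list:
        tf_a, tf_b = sorted(tf)                      # normalise to (low, high)
        for enhancer in enhancers:
            enhancer_a, enhancer_b = sorted(enhancer)
            if enhancer_a <= tf_a <= tf_b <= enhancer_b:
                break                                # contained: stop checking
        else:
            result.append(list(tf))                  # no enhancer contained it
    return result


print(non_overlapping_relaxed([[7, 6], [6, 6], [24, 56]], [[8, 5], [10, 11]]))
# [[24, 56]]
```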

EDIT: OP, you are making some mistake somewhere in your testing.

In [1]: def non_overlapping(TFlist, enhancerlist):
   ...:     result = []
   ...:
   ...:     for tf_a, tf_b in TFlist:
   ...:         for enhancer_a, enhancer_b in enhancerlist:
   ...:             if enhancer_a <= tf_a < tf_b <= enhancer_b:
   ...:                 break
   ...:         else:
   ...:             result.append([tf_a, tf_b])
   ...:
   ...:     return result
   ...:

In [2]: enhancerlist=[[5,8],[10,15]]
   ...: TFlist=[[6,7],[11, 14], [54,56], [55,56]]

In [3]: non_overlapping(TFlist, enhancerlist)
Out[3]: [[54, 56], [55, 56]]  # [11, 14] is not present, contrary to what you claim
Answered By: ddejohn