Try to find a sublist that doesn't occur in the range of ANY of the sublists in another list
Question:
enhancerlist=[[5,8],[10,11]]
TFlist=[[6,7],[24,56]]
I have two lists of lists. I am trying to isolate the sublists in my ‘TFlist’ that don’t fit in the range of ANY of the sublists of enhancerlist (by "in the range" I mean that the TFlist sublist's range fits inside an enhancerlist sublist's range).
So, for example, TFlist[1] does not fall in the range of any sublist in enhancerlist (whereas TFlist[0], [6,7], fits inside the range of [5,8]), so I want this as output:
TF_notinrange=[24,56]
The problem with a nested loop like this:
while TFlist:
    TF=TFlist.pop()
    for j in enhancerlist:
        if ((TF[0]>= j[0]) and (TF[1]<= j[1])):
            continue
        else:
            TF_notinrange.append(TF)
is that I get this as output:
[[24, 56], [3, 4]]
The if statement checks one sublist in enhancerlist at a time, so it appends TF even if a later sublist does contain it.
Could I somehow use a while loop with a condition instead? It seems like I would still have the problem of a nested loop appending things incorrectly.
Answers:
Alternative
Use a list comprehension:
TF_notinrange = [tf for tf in TFlist
                 if not any(istart <= tf[0] <= tf[1] <= iend
                            for istart, iend in enhancerlist)]
>>> TF_notinrange
[[24, 56]]
Explanation
Keep the ranges in TFlist that are not contained in any range of enhancerlist.
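To make the any(...) test concrete, the sketch below pulls the containment check into a helper and prints the result for each range from the question. The helper name is_contained is made up for illustration and is not part of the original answer.

```python
enhancerlist = [[5, 8], [10, 11]]
TFlist = [[6, 7], [24, 56]]


def is_contained(tf, enhancers):
    """Return True if tf lies fully inside at least one enhancer range."""
    tf_start, tf_end = tf
    return any(e_start <= tf_start <= tf_end <= e_end
               for e_start, e_end in enhancers)


# Show the verdict for each TF range individually
for tf in TFlist:
    print(tf, "contained:", is_contained(tf, enhancerlist))

# Keep only the ranges that no enhancer contains
TF_notinrange = [tf for tf in TFlist if not is_contained(tf, enhancerlist)]
print(TF_notinrange)  # [[24, 56]]
```

Because any() stops at the first enhancer that contains the range, a TF range is dropped as soon as one match is found and kept only when every enhancer has been ruled out.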
Additional constraints
Add further constraints by making the condition span several lines, using either of the following (here TFlist[0]==enhancerlist[0] just stands in for whatever extra constraint you need):
- A backslash at the end of the line for line continuation
TF_notinrange = [tf for tf in TFlist
                 if TFlist[0] == enhancerlist[0] and \
                 not any(istart <= tf[0] <= tf[1] <= iend
                         for istart, iend in enhancerlist)]
- Enclosing parentheses, which allow line continuation
TF_notinrange = [tf for tf in TFlist
                 if (TFlist[0] == enhancerlist[0] and
                     not any(istart <= tf[0] <= tf[1] <= iend
                             for istart, iend in enhancerlist))]
For Loop vs. List Comprehension
As commented by doejohn, list comprehensions are best for simple code. For complex constraints, a for loop is preferable for readability.
TF_notinrange = []
for tf in TFlist:
    if (TFlist[0] == enhancerlist[0] and  # place multiple constraints here
            not any(istart <= tf[0] <= tf[1] <= iend
                    for istart, iend in enhancerlist)):
        TF_notinrange.append(tf)
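One way to keep several constraints readable, whichever loop style is used, is to give each constraint its own small function. The following is only a sketch: long_enough and its min_length threshold are invented here as a stand-in for whatever extra condition is needed, and the extra [30, 30] entry in TFlist is added just to exercise it.

```python
enhancerlist = [[5, 8], [10, 11]]
TFlist = [[6, 7], [24, 56], [30, 30]]  # [30, 30] added for illustration


def inside_any(tf, enhancers):
    """True if tf fits inside at least one enhancer range."""
    return any(start <= tf[0] <= tf[1] <= end for start, end in enhancers)


def long_enough(tf, min_length=1):
    """Hypothetical extra constraint: keep ranges spanning at least min_length."""
    return tf[1] - tf[0] >= min_length


TF_notinrange = []
for tf in TFlist:
    # Each named predicate reads as one constraint
    if long_enough(tf) and not inside_any(tf, enhancerlist):
        TF_notinrange.append(tf)

print(TF_notinrange)  # [[24, 56]]
```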
You can use chained comparisons along with the less-common for-else block, where the else clause triggers only if the for loop was not broken out of prematurely, to achieve this:
non_overlapping = []
for tf_a, tf_b in TFlist:
    for enhancer_a, enhancer_b in enhancerlist:
        if enhancer_a <= tf_a < tf_b <= enhancer_b:
            break
    else:
        non_overlapping.append([tf_a, tf_b])
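Note that this assumes that each pair is already sorted (start before end) and that no pair describes a range of length zero (e.g., (2, 2)); because of the strict tf_a < tf_b comparison, a zero-length range is never treated as contained.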
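If the input might contain reversed pairs or zero-length ranges, one possible adjustment is to sort each pair first and use <= throughout. This is a sketch under that assumption, not part of the original answer; the function name not_in_any_range and the sample data are made up.

```python
def not_in_any_range(tf_list, enhancer_list):
    """Return the TF ranges not fully inside any enhancer range.

    Pairs are normalised to (low, high) first, and zero-length
    ranges such as [2, 2] are allowed.
    """
    enhancers = [sorted(pair) for pair in enhancer_list]
    result = []
    for pair in tf_list:
        low, high = sorted(pair)
        for e_low, e_high in enhancers:
            if e_low <= low and high <= e_high:
                break  # contained in this enhancer, so skip it
        else:
            # no break happened: no enhancer contains this range
            result.append([low, high])
    return result


print(not_in_any_range([[7, 6], [6, 6], [24, 56]], [[5, 8], [11, 10]]))
# [[24, 56]]
```

Whether a zero-length range should count as contained depends on what the ranges represent, so treat the <= choice as one option rather than the required behaviour.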
EDIT: OP, you are making some mistake somewhere in your testing.
In [1]: def non_overlapping(TFlist, enhancerlist):
   ...:     result = []
   ...:
   ...:     for tf_a, tf_b in TFlist:
   ...:         for enhancer_a, enhancer_b in enhancerlist:
   ...:             if enhancer_a <= tf_a < tf_b <= enhancer_b:
   ...:                 break
   ...:         else:
   ...:             result.append([tf_a, tf_b])
   ...:
   ...:     return result
   ...:
In [2]: enhancerlist=[[5,8],[10,15]]
   ...: TFlist=[[6,7],[11, 14], [54,56], [55,56]]
In [3]: non_overlapping(TFlist, enhancerlist)
Out[3]: [[54, 56], [55, 56]]  # [11, 14] is not present, as you claim
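For readers who find for-else unfamiliar, the same logic can be spelled out with an explicit flag. The sketch below (the name non_overlapping_flag is made up) reuses the data from the session above and should produce the same result.

```python
def non_overlapping_flag(TFlist, enhancerlist):
    """Same logic as the for-else version, using an explicit flag."""
    result = []
    for tf_a, tf_b in TFlist:
        contained = False
        for enhancer_a, enhancer_b in enhancerlist:
            if enhancer_a <= tf_a < tf_b <= enhancer_b:
                contained = True
                break  # one containing enhancer is enough
        if not contained:
            result.append([tf_a, tf_b])
    return result


enhancerlist = [[5, 8], [10, 15]]
TFlist = [[6, 7], [11, 14], [54, 56], [55, 56]]
print(non_overlapping_flag(TFlist, enhancerlist))  # [[54, 56], [55, 56]]
```

The key difference from the loop in the question is that the append happens once per TF range, after every enhancer has been checked, rather than once per enhancer.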