# Find overlapping numeric ranges between 2 columns pandas and subtract difference of another column

## Question:

This is a toy dataset:

```
df = pd.DataFrame({'ID': ['A','A','A','A'],
'target': ['B','B','B','B'],
'length':[208,315,1987,3775],
'start':[139403,140668,141726,143705],
'end':[139609,140982,143711,147467]})
ID target length start end
0 A B 208 139403 139609
1 A B 315 140668 140982
2 A B 1987 141726 143711
3 A B 3775 143705 147467
```

I need to sum the length column taking overlapping ranges in start and end into consideration.

In row 3, the start value 143705 is between the ranges of row 2’s 141726-143711.

143711 – 143705 = 6 so I need to subtract 6 from the length in row 3: 3775 – 6 = 3769.

Which would change the df to:

```
ID target length start end
0 A B 208 139403 139609
1 A B 315 140668 140982
2 A B 1987 141726 143711
3 A B 3769 143705 147467
```

Then grouping by ID, target and summing length:

```
df.groupby(['ID','target'])['length'].sum()
ID target
A B 6279
Name: length, dtype: int64
```

Does anyone know how I can do this in a pythonic way? Thanks so much for any help!

## Answers:

A possible solution:

```
(df.assign(length=
df['start'].lt(df['end'].shift())
.mul(df['start']-df['end'].shift(fill_value=0))
.add(df['length'])))
```

Output:

```
ID target length start end
0 A B 208 139403 139609
1 A B 315 140668 140982
2 A B 1987 141726 143711
3 A B 3769 143705 147467
```