# Select multiple sections of rows by index in pandas

## Question:

I have a large DataFrame with a GPS path and some attributes. Only a few sections of the path need to be analysed, and I would like to subset just those sections into a new DataFrame. I can subset one section at a time, but the idea is to get them all at once and to keep the original index.

The problem is similar to:

```
import pandas as pd
df = pd.DataFrame({'A':[0,1,2,3,4,5,6,7,8,9],'B':['a','b','c','d','e','f','g','h','i','j']},
index=range(10,20,))
```

I want to get something like:

```
cdf = df.loc[[11:13] & [17:20]] # SyntaxError: invalid syntax
```

Desired outcome:

```
A B
11 1 b
12 2 c
13 3 d
17 7 h
18 8 i
19 9 j
```

I know the example is easy with `cdf = df.loc[[11,12,13,17,18,19],:]`, but in the original problem I have thousands of rows and some entries have already been removed, so listing every label is not really an option.

## Answers:

One possible solution with `concat`:

```
cdf = pd.concat([df.loc[11:13], df.loc[17:20]])
print (cdf)
A B
11 1 b
12 2 c
13 3 d
17 7 h
18 8 i
19 9 j
```
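With many sections, the same `concat` idea can be driven by a list of endpoints. This is only a sketch: the `sections` list of `(first_label, last_label)` pairs is a hypothetical name, not something from the question, and both endpoints are treated as inclusive because `.loc` slices by label.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": list(range(10)), "B": list("abcdefghij")},
    index=range(10, 20),
)

# Hypothetical list of (first_label, last_label) pairs, both ends inclusive.
sections = [(11, 13), (17, 20)]

# Slice each section by label and stack the pieces; the original index is kept.
cdf = pd.concat([df.loc[start:stop] for start, stop in sections])
print(cdf.index.tolist())  # [11, 12, 13, 17, 18, 19]
```

Because each piece is a label slice, endpoints that fall outside the index (such as 20 here) do no harm as long as the index is sorted.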

Another solution with `range`:

```
cdf = df.loc[list(range(11,14)) + list(range(17,20))]
print (cdf)
A B
11 1 b
12 2 c
13 3 d
17 7 h
18 8 i
19 9 j
```

You could use `np.r_` (after `import numpy as np`) to concatenate the slices:

```
In [16]: df.loc[np.r_[11:13, 17:20]]
Out[16]:
A B
11 1 b
12 2 c
17 7 h
18 8 i
19 9 j
```

Note, however, that `df.loc[A:B]` selects labels `A` through `B` with `B` included. `np.r_[A:B]` returns an array of `A` through `B` with `B` excluded. To include `B` you would need to use `np.r_[A:B+1]`.
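A small check of the endpoint difference, using a cut-down version of the question's frame (the variable names are just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": list(range(10))}, index=range(10, 20))

# Label slice with .loc: the right endpoint 13 is included.
by_slice = df.loc[11:13].index.tolist()

# np.r_ follows Python range semantics: the right endpoint is excluded.
by_r = np.r_[11:13].tolist()

# Bumping the stop value by one makes np.r_ cover the same labels.
by_r_adjusted = np.r_[11:14].tolist()

print(by_slice)       # [11, 12, 13]
print(by_r)           # [11, 12]
print(by_r_adjusted)  # [11, 12, 13]
```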

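When passed a slice, such as `df.loc[A:B]`, `df.loc` ignores labels that are not in `df.index` (this relies on the index being sorted). In contrast, when passed an array, such as `df.loc[np.r_[A:B]]`, `df.loc` treats every value as a label to look up: older pandas versions added a new row filled with NaNs for each value in the array which is not in `df.index`, while pandas 1.0 and later raise a `KeyError` instead.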
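To see the difference on data with a gap, the sketch below drops label 12 to mimic an entry that was removed earlier. What the array lookup does depends on the installed pandas version, so the code only records which outcome occurred rather than assuming one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": list(range(10))}, index=range(10, 20))
df = df.drop(index=12)  # simulate an entry that was removed earlier

# Slice: the missing label 12 is simply skipped.
sliced = df.loc[11:13]

# Array of labels: 12 is not in the index, so the lookup either fails
# or (on old pandas versions) yields a NaN row.
try:
    picked = df.loc[np.r_[11:14]]
    outcome = "NaN row" if picked["A"].isna().any() else "no gap detected"
except KeyError:
    outcome = "KeyError"

# isin mask: only labels that actually exist are selected.
masked = df.loc[df.index.isin(np.r_[11:14])]

print(sliced.index.tolist())  # [11, 13]
print(outcome)
print(masked.index.tolist())  # [11, 13]
```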

Thus to produce the desired result, you would need to adjust the right endpoint of the slices and use `isin` to test for membership in `df.index`:

```
In [26]: df.loc[df.index.isin(np.r_[11:14, 17:21])]
Out[26]:
A B
11 1 b
12 2 c
13 3 d
17 7 h
18 8 i
19 9 j
```

One option is pyjanitor's `select_rows`. Note that the selection is based on the label, not the integer position:

```
# pip install pyjanitor
import pandas as pd
import janitor  # importing janitor registers select_rows on DataFrame

df.select_rows(slice(11,13), slice(17,20))
A B
11 1 b
12 2 c
13 3 d
17 7 h
18 8 i
19 9 j
```