pandas column-slices with mypy

Question:

Lately I’ve run into a strange situation that I cannot solve on my own:

Consider this MWE:

import pandas
import numpy as np

data = pandas.DataFrame(np.random.rand(10, 5), columns=list("abcde"))

observations = data.loc[:, :"c"]
features = data.loc[:, "c":]

print(data)
print(observations)
print(features)

According to this answer, the slicing itself is done correctly, and it works in the sense that the correct results are printed.
But when I run mypy over it, I get these errors:

mypy.exe .\t.py
t.py:1: error: Skipping analyzing "pandas": module is installed, but missing library stubs or py.typed marker
t.py:1: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
t.py:6: error: Slice index must be an integer or None
t.py:7: error: Slice index must be an integer or None
Found 3 errors in 1 file (checked 1 source file)

The complaint is accurate in the sense that the slice bounds are strings, not integers.
How can I either satisfy or disable the "Slice index must be an integer or None" error?

Of course I could use iloc[:, :3] to get around this, but that feels like bad practice, since with iloc we depend on the order of the columns. (In this example loc also depends on the ordering, but that is only to keep the MWE short.)

Asked By: Someone2


Answers:

That’s an open issue (#GH2410).

As a workaround, you can look up the column position with get_loc and slice with iloc:

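# get_loc returns the integer position of "c" (when the label is unique)
col_idx = data.columns.get_loc("c")

observations = data.iloc[:, :col_idx + 1]  # columns up to and including "c"
features = data.iloc[:, col_idx:]          # columns from "c" onward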

Output:

           a         b         c # <- observations
0   0.269605  0.497063  0.676928
1   0.526765  0.204216  0.748203
2   0.919330  0.059722  0.422413
..       ...       ...       ...
7   0.056050  0.521702  0.727323
8   0.635477  0.145401  0.258166
9   0.041886  0.812769  0.839979

[10 rows x 3 columns]

           c         d         e  # <- features
0   0.676928  0.672298  0.177933
1   0.748203  0.995165  0.136659
2   0.422413  0.222377  0.395179
..       ...       ...       ...
7   0.727323  0.291441  0.056998
8   0.258166  0.219025  0.405838
9   0.839979  0.923173  0.431298

[10 rows x 3 columns]
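
If you would rather keep label-based slicing with loc, another option that may work is to build the slice object explicitly with the built-in slice(). At runtime this is the same object that the :"c" syntax produces, but because it is an ordinary call rather than slice syntax, mypy's slice-index check should not apply to it. This is a minimal sketch that reuses the data frame from the question; whether mypy stays quiet is an assumption that can depend on your mypy version and on whether pandas stubs are installed. The last line shows the other route asked about, silencing the error on a single line with a "# type: ignore" comment.

```python
import numpy as np
import pandas

data = pandas.DataFrame(np.random.rand(10, 5), columns=list("abcde"))

# slice(None, "c") is equivalent to :"c", and slice("c", None) to "c":
# Label slices in loc include both endpoints.
observations = data.loc[:, slice(None, "c")]
features = data.loc[:, slice("c", None)]

print(list(observations.columns))  # ['a', 'b', 'c']
print(list(features.columns))      # ['c', 'd', 'e']

# Alternative: keep the original syntax and suppress the check on this line only.
observations_ignored = data.loc[:, :"c"]  # type: ignore
```

Both variants select the same columns as the get_loc version above, without depending on integer positions in your own code.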
Answered By: Timeless