Variable combinations of column designations in pandas

Question

I can best explain my problem by starting with an example:

import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3, 4],
                   "age": [46, 48, 55, 55],
                   "gender": ['female', 'female', 'male', 'male'],
                   "overweight": ['y', 'n', 'y', 'y']},
                  index=[0, 1, 2, 3])

Now I want to build a function that receives a DataFrame (= df) and an integer (= m).
The function should combine the column names into groups of m. The output should be a list containing those groups. For example, with m = 2 and the df above:
[[ID, age],[ID, gender],[ID, overweight],[age, gender], [age, overweight], [gender, overweight]]

Does anyone know how I can achieve that?
My problem is that both m and the number of columns are variable.

Asked By: peter.bucher

Answer 1

You can pass the DataFrame directly to itertools.combinations, because iterating over a DataFrame yields its column names:

from itertools import combinations

m = 2
out = list(combinations(df, m))

output:

[('ID', 'age'),
 ('ID', 'gender'),
 ('ID', 'overweight'),
 ('age', 'gender'),
 ('age', 'overweight'),
 ('gender', 'overweight')]

Answered By: mozway
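
The question asks for a function that takes df and m and returns a list of lists rather than tuples. A minimal sketch of such a wrapper could look like the following; the function name column_combinations is a hypothetical one chosen for illustration and does not appear in the answers.

```python
from itertools import combinations

import pandas as pd


def column_combinations(df, m):
    """Return every m-sized combination of df's column names as a list of lists.

    Hypothetical helper name, used here only for illustration.
    """
    # combinations() yields tuples; convert each one to a list to match
    # the list-of-lists format requested in the question.
    return [list(combo) for combo in combinations(df.columns, m)]


df = pd.DataFrame({"ID": [1, 2, 3, 4],
                   "age": [46, 48, 55, 55],
                   "gender": ["female", "female", "male", "male"],
                   "overweight": ["y", "n", "y", "y"]})

pairs = column_combinations(df, 2)    # 6 pairs of column names
triples = column_combinations(df, 3)  # 4 triples of column names
```

If m is larger than the number of columns, itertools.combinations yields nothing, so the function returns an empty list instead of raising an error.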

Answer 2

from itertools import combinations

n = 2

# Select one sub-DataFrame for each combination of n column names
[df[list(i)] for i in combinations(df.columns, n)]

Output:

[   ID  age
 0   1   46
 1   2   48
 2   3   55
 3   4   55,
    ID  gender
 0   1  female
 1   2  female
 2   3    male
 3   4    male,
    ID overweight
 0   1          y
 1   2          n
 2   3          y
 3   4          y,
    age  gender
 0   46  female
 1   48  female
 2   55    male
 3   55    male,
    age overweight
 0   46          y
 1   48          n
 2   55          y
 3   55          y,
    gender overweight
 0  female          y
 1  female          n
 2    male          y
 3    male          y]

Answered By: G.G
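
If the sub-DataFrames from Answer 2 need to be looked up by their column names later, one possible variation is to store them in a dict keyed by the column tuple instead of a plain list. This is only a sketch; the names subframes and age_gender are assumptions made for this example.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3, 4],
                   "age": [46, 48, 55, 55],
                   "gender": ["female", "female", "male", "male"],
                   "overweight": ["y", "n", "y", "y"]})

n = 2

# Map each combination of column names (a tuple) to the matching sub-DataFrame.
subframes = {cols: df[list(cols)] for cols in combinations(df.columns, n)}

# Look up one particular pair by its column names.
age_gender = subframes[("age", "gender")]
```

Tuples work as dict keys because they are hashable, which is why the combinations are not converted to lists here.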
