How to cross join (cartesian product) two Series?
Question:
Consider the following two series.
Let x be:
x
a 10
b 20
c 30
Name: x_value
And let y be:
y
d 100
e 200
Name: y_value
Ideally, the result would have a MultiIndex built from the two original indices, with the cartesian product of the two Series’ values as rows:
     x_value  y_value
x y
a d       10      100
  e       10      200
b d       20      100
  e       20      200
c d       30      100
  e       30      200
I have seen similar questions about cross merge (e.g. cartesian product in pandas), but so far I haven’t found anything that deals with Series, let alone an approach that builds a MultiIndex from the original indices.
The part that seems troublesome to me is how to get this to work with Series instead of DataFrames.
Answers:
pd.merge() works on Series, but it doesn’t keep the index.
df = pd.merge(x, y, how='cross')
df
   x_value  y_value
0       10      100
1       10      200
2       20      100
3       20      200
4       30      100
5       30      200
You can just make the MultiIndex yourself:
df.index = pd.MultiIndex.from_product([x.index, y.index], names=['x', 'y'])
df
     x_value  y_value
x y
a d       10      100
  e       10      200
b d       20      100
  e       20      200
c d       30      100
  e       30      200
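For anyone who wants to run the steps above end to end, here is a minimal sketch. It rebuilds x and y by hand; the index names "x" and "y" are assumed from the printed output in the question, and how='cross' assumes pandas 1.2 or later.

```python
import pandas as pd

# Rebuild the two Series from the question.
# The index names "x" and "y" are an assumption read off the printed output.
x = pd.Series([10, 20, 30], index=pd.Index(["a", "b", "c"], name="x"), name="x_value")
y = pd.Series([100, 200], index=pd.Index(["d", "e"], name="y"), name="y_value")

# Cross merge pairs every row of x with every row of y.
# The result gets a fresh integer index; the original labels are dropped.
df = pd.merge(x, y, how="cross")

# from_product varies the last level fastest, which matches the row order
# produced by the cross merge (all of y for "a", then all of y for "b", ...).
df.index = pd.MultiIndex.from_product([x.index, y.index], names=["x", "y"])

print(df)
```

The index assignment only lines up because the merge and from_product enumerate the pairs in the same order, so it is worth keeping the two calls next to each other.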
Another solution is to use reset_index and merge, then set_index to build the MultiIndex:
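import pandas as pd

s1 = pd.Series([10, 20, 30], index=[*'abc'], name='x_values')
s2 = pd.Series([100, 200], index=[*'de'], name='y_values')
# Both indices are unnamed, so reset_index() yields two 'index' columns,
# which the merge disambiguates as 'index_x' and 'index_y'.
s1.reset_index().merge(s2.reset_index(), how='cross').set_index(['index_x', 'index_y'])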
Output:
                 x_values  y_values
index_x index_y
a       d              10       100
        e              10       200
b       d              20       100
        e              20       200
c       d              30       100
        e              30       200
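If the index_x / index_y level names are not wanted, one possible variation (not part of the answer above) is to name each index before resetting it, so that the resulting columns are already called x and y. This is a sketch; rename_axis is used here only as one way to set the index name, and the variable name result is made up for illustration.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=[*"abc"], name="x_values")
s2 = pd.Series([100, 200], index=[*"de"], name="y_values")

# Name each index first, so reset_index() produces columns "x" and "y"
# instead of two "index" columns that the merge would have to suffix.
left = s1.rename_axis("x").reset_index()
right = s2.rename_axis("y").reset_index()

# Cross merge the two frames, then move the label columns back into the index.
result = left.merge(right, how="cross").set_index(["x", "y"])

print(result)
```

Because the labels travel through the merge as ordinary columns, this version does not rely on the row order of the cross merge matching anything else.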