Create a matrix with football players
Question:
I have two DataFrames.
The first is called player and contains the names of football players:
player = ["David Gonzalez","Agustin Martinez","Jibrail Al-Hindi","Edward Cahill","Simon Becker","Paolo Imperiali","Amir Bahari","Guilherme Souza"]
player = pd.DataFrame(player)
The second is called football:
id | scorer |
---|---|
1 | David Gonzalez, Edward Cahill |
2 | Agustin Martinez,Brian McNamara |
3 | Agustin Martinez, Jibrail Al-Hindi |
4 | Edward Cahill,Guilherme Souza |
5 | Paolo Imperiali, Yannick Wagner |
6 | Simon Becker,Amir Bahari |
7 | Paolo Imperiali,Yannick Wagner |
8 | Amir Bahari,Guilherme Souza,David Gonzalez |
9 | Edward Cahill,Amir Bahari |
10 | Simon Becker |
11 | Amir Bahari |
12 | Paolo Imperiali,Simon Becker |
13 | Edward Cahill,Guilherme Souza |
14 | Edward Cahill,Amir Bahari |
15 | Simon Becker |
16 | Simon Becker |
The second DataFrame, football, shows which players scored in which game.
I would now like to create a matrix whose rows and columns are all the players from the player DataFrame, with 1 if there is a game id in which both players scored, and 0 if there is no game in which they both scored.
I tried this:
np.zeros((player,scorer)
But I think I am on the wrong path, because I want a matrix whose rows and columns are labelled with the names in player and whose values are 1 or 0.
Answers:
You can split/explode and join the players for a crosstab:
s = football['scorer'].str.split(r',\s*').explode().loc[lambda s: s.isin(player[0])]
df2 = s.rename('row').to_frame().join(s.rename('col'))
out = pd.crosstab(df2['row'], df2['col']).rename_axis(index=None, columns=None)
NB. This gives the number of games in which both players scored (the diagonal is the number of games in which each player scored). If you just want 0/1, add .clip(upper=1).
Output:
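| | Agustin Martinez | Amir Bahari | David Gonzalez | Edward Cahill | Guilherme Souza | Jibrail Al-Hindi | Paolo Imperiali | Simon Becker |
|---|---|---|---|---|---|---|---|---|
| Agustin Martinez | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| Amir Bahari | 0 | 5 | 1 | 2 | 1 | 0 | 0 | 1 |
| David Gonzalez | 0 | 1 | 2 | 1 | 1 | 0 | 0 | 0 |
| Edward Cahill | 0 | 2 | 1 | 5 | 2 | 0 | 0 | 0 |
| Guilherme Souza | 0 | 1 | 1 | 2 | 3 | 0 | 0 | 0 |
| Jibrail Al-Hindi | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| Paolo Imperiali | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 1 |
| Simon Becker | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 5 |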
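For reference, here is a self-contained sketch of the same split/explode/join/crosstab approach that ends with the 0/1 matrix asked for in the question. The sample data is copied from the question; the names players, pairs, counts and names are only illustrative, and the final reindex step is an addition (not part of the answer above) that assumes you want the rows and columns in the order of player, including any listed player who never scored.

```python
import pandas as pd

players = ["David Gonzalez", "Agustin Martinez", "Jibrail Al-Hindi", "Edward Cahill",
           "Simon Becker", "Paolo Imperiali", "Amir Bahari", "Guilherme Souza"]
player = pd.DataFrame(players)

football = pd.DataFrame({
    "id": range(1, 17),
    "scorer": [
        "David Gonzalez, Edward Cahill",
        "Agustin Martinez,Brian McNamara",
        "Agustin Martinez, Jibrail Al-Hindi",
        "Edward Cahill,Guilherme Souza",
        "Paolo Imperiali, Yannick Wagner",
        "Simon Becker,Amir Bahari",
        "Paolo Imperiali,Yannick Wagner",
        "Amir Bahari,Guilherme Souza,David Gonzalez",
        "Edward Cahill,Amir Bahari",
        "Simon Becker",
        "Amir Bahari",
        "Paolo Imperiali,Simon Becker",
        "Edward Cahill,Guilherme Souza",
        "Edward Cahill,Amir Bahari",
        "Simon Becker",
        "Simon Becker",
    ],
})

# One row per (game, scorer); keep only the players listed in `player`.
s = (
    football["scorer"]
    .str.split(r",\s*")          # comma followed by optional whitespace
    .explode()
    .loc[lambda x: x.isin(player[0])]
)

# Self-join on the (duplicated) game index: all pairs of listed players
# who scored in the same game, including each player paired with themselves.
pairs = s.rename("row").to_frame().join(s.rename("col"))

# Number of games in which each pair scored together.
counts = pd.crosstab(pairs["row"], pairs["col"]).rename_axis(index=None, columns=None)

# 0/1 matrix, reordered to follow `player`; a listed player who never
# scored would still get a row and column filled with 0.
names = player[0].tolist()
out = counts.clip(upper=1).reindex(index=names, columns=names, fill_value=0)

print(out)
```

Note that the diagonal of out is 1 for every player who scored in at least one game, because each scorer is also paired with themselves in the join. Whether that is what you want depends on how you plan to use the matrix.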