NetworkX graph with some specifications based on two dataframes
Question:
I have two dataframes. The first, called df_student, shows the names of the people in a program.
Student-ID | Name |
---|---|
20202456 | Luke De Paul |
20202713 | Emil Smith |
20202456 | Alexander Müller |
20202713 | Paul Bernard |
20202456 | Zoe Michailidis |
20202713 | Joanna Grimaldi |
20202456 | Kepler Santos |
20202713 | Dominic Borg |
20202456 | Jessica Murphy |
20202713 | Danielle Dominguez |
The other, called df_course, shows for each course the people who reached the best grade. Each course includes at least one person from df_student.
Course-ID | Name | Grade |
---|---|---|
UNI44 | Luke De Paul, Benjamin Harper | 17 |
UNI45 | Dominic Borg | 20 |
UNI61 | Luke De Paul, Jonathan MacAllister | 20 |
UNI62 | Alexander Müller, Kepler Santos | 17 |
UNI63 | Joanna Grimaldi | 19 |
UNI65 | Emil Smith, Filippo Visconti | 18 |
UNI71 | Moshe Azerad, Emil Smith | 18 |
UNI72 | Luke De Paul, Jessica Murphy | 18 |
UNI73 | Luke De Paul, Filippo Visconti | 17 |
UNI74 | Matthias Noem, Kepler Santos | 19 |
UNI75 | Luke De Paul, Kepler Santos | 16 |
UNI76 | Kepler Santos | 17 |
UNI77 | Kepler Santos, Benjamin Harper | 17 |
UNI78 | Dominic Borg, Kepler Santos | 18 |
UNI80 | Luke De Paul, Gabriel Martin | 18 |
UNI81 | Dominic Borg, Alexander Müller | 19 |
UNI82 | Luke De Paul, Giancarlo Di Lorenzo | 20 |
UNI83 | Emil Smith,Joanna Grimaldi | 20 |
I would like to create a NetworkX graph with a vertex for each student from df_student and also for each student from df_course. There should be an unweighted edge between two vertices only if the two students received the best grade in the same course.
This is what I tried:
import networkx as nx
G = nx.Graph()
G.add_edge(student, course)
But when I run it, I get an error saying the argument is not right, so I don't know how to continue.
Answers:
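Try:
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
# Copy each table to the clipboard before running the matching line
df_students = pd.read_clipboard()
df_course = pd.read_clipboard()
# One column per student; strip the space left after each comma
df_s_t = df_course['Name'].str.split(',', expand=True).apply(lambda col: col.str.strip())
# Rows with two students become the edges (df_net must exist before the graph is built)
df_net = df_s_t[df_s_t.notna().all(axis=1)]
G = nx.from_pandas_edgelist(df_net, 0, 1)
# Add every student, plus sole top scorers, as nodes so isolated ones still appear
G.add_nodes_from(pd.concat([df_students['Name'],
                            df_s_t.loc[~df_s_t.notna().all(axis=1), 0]]))
fig, ax = plt.subplots(1, 1, figsize=(15, 15))
nx.draw_networkx(G, ax=ax)
plt.show()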
Output: [figure: the graph drawn by nx.draw_networkx]
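If you would rather not depend on the clipboard, the same idea can be sketched with a plain loop. This is only an illustration under some assumptions: the two small dataframes below are hand-typed samples taken from the tables in the question, and the variable names `top`, `a` and `b` are made up for the example. It also shows what `add_edge` expects, namely two hashable node labels (here two name strings), which is probably why passing whole columns or undefined variables failed.

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Small hand-typed samples from the question's tables (not the full data)
df_student = pd.DataFrame({
    "Name": ["Luke De Paul", "Emil Smith", "Paul Bernard", "Joanna Grimaldi"],
})
df_course = pd.DataFrame({
    "Course-ID": ["UNI44", "UNI45", "UNI83"],
    "Name": [
        "Luke De Paul, Benjamin Harper",
        "Dominic Borg",
        "Emil Smith,Joanna Grimaldi",
    ],
    "Grade": [17, 20, 20],
})

G = nx.Graph()
# Every student in the program is a vertex, even without any edge
G.add_nodes_from(df_student["Name"])

for names in df_course["Name"]:
    # Split on commas and trim whitespace so " Benjamin Harper"
    # and "Benjamin Harper" end up as the same node
    top = [n.strip() for n in names.split(",") if n.strip()]
    # Also covers courses with a single top student
    G.add_nodes_from(top)
    # add_edge takes two hashable node labels; link every pair in the course
    for a, b in combinations(top, 2):
        G.add_edge(a, b)

print(G.number_of_nodes(), G.number_of_edges())
```

Using `combinations` means a course with three or more top students would still link every pair, which the two-column split in the answer above does not handle.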