How to represent the data of an Excel file as a directed graph in Python?

Question:

I have downloaded the California road network dataset from the Stanford Network Analysis Project. The data is a text file that can be converted to an Excel file with two columns: the first holds the start nodes and the second holds the end nodes.

# FromNodeId    ToNodeId
       0           1
       0           2
       0          469
       1           0
       1           6
       1          385
       2           0
       2           3
      469          0
      469         380
      469        37415
       6           1
       6           5
      385          1
      385         384
      385         386
       3           2
       3           4
       3          419
       3          422

Now, how can I convert this data into an adjacency matrix or any other object for analysing the graph? For example, for calculating the degree distribution, clustering coefficients, etc.
I will be grateful for any help on how to represent this data as a graph using Python and related libraries.

Asked By: Nariman Masjedi


Answers:

It seems like you're looking for degree_histogram and average_clustering in networkx:

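# pip install networkx
import networkx as nx

G = nx.read_edgelist("roadNet-CA.txt", nodetype=int, create_using=nx.DiGraph())

# Number of nodes with degree 0, 1, 2, ... (in-degree plus out-degree for a DiGraph)
degree_distribution = nx.degree_histogram(G)
# [0, 8, 0, 1, 3, 1, 2]  (for the 20 sample edges shown in the question)
clustering_coefficient = nx.average_clustering(G)
# 0.0  (the sample contains no triangles)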
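Since the graph is directed, it can also be useful to look at in-degrees and out-degrees separately. The following is a minimal sketch, assuming the 20 sample edges from the question are typed in by hand rather than read from the file; the variable names are only illustrative.

```python
from collections import Counter

import networkx as nx

# The 20 sample edges from the question, as (FromNodeId, ToNodeId) pairs.
edges = [
    (0, 1), (0, 2), (0, 469),
    (1, 0), (1, 6), (1, 385),
    (2, 0), (2, 3),
    (469, 0), (469, 380), (469, 37415),
    (6, 1), (6, 5),
    (385, 1), (385, 384), (385, 386),
    (3, 2), (3, 4), (3, 419), (3, 422),
]
G = nx.DiGraph(edges)

# Map each degree value to the number of nodes that have it.
in_degree_counts = Counter(degree for _, degree in G.in_degree())
out_degree_counts = Counter(degree for _, degree in G.out_degree())

print("in-degree  :", sorted(in_degree_counts.items()))
print("out-degree :", sorted(out_degree_counts.items()))
```

With the real file, G would come from nx.read_edgelist as above and the two Counter lines would stay the same.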


Answered By: Timeless

To represent the data of an Excel file as a directed graph in Python, you can use the pandas and networkx libraries. Here are the steps:

Read the Excel file using pandas:

import pandas as pd

df = pd.read_excel('filename.xlsx')

Create a directed graph using networkx:

import networkx as nx

G = nx.DiGraph()

Iterate over the rows of the dataframe and add nodes and edges to the graph:

for index, row in df.iterrows():
    source_node = row['Source']
    target_node = row['Target']
    weight = row['Weight']
    G.add_edge(source_node, target_node, weight=weight)

In this example, we assume that the Excel file has three columns: Source, Target, and Weight. Source and Target are the names of the two nodes, and Weight is the weight (edge cost) of the edge between them.
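For a two-column edge list such as the one in the question, there is no weight, and the loop can be replaced by nx.from_pandas_edgelist. The sketch below is only an illustration: it reads a few sample rows from an in-memory buffer instead of a real file, and the column names FromNodeId and ToNodeId are assumed from the header comment in the data. It also builds a dense adjacency matrix, which is reasonable only for small graphs; for the full road network a sparse form such as nx.to_scipy_sparse_array would likely be more practical.

```python
import io

import networkx as nx
import pandas as pd

# Stand-in for the downloaded text file. With the real data, pass the file
# path (for example "roadNet-CA.txt") to pd.read_csv instead of this buffer.
sample = io.StringIO(
    "# FromNodeId\tToNodeId\n"
    "0\t1\n"
    "0\t2\n"
    "0\t469\n"
    "1\t0\n"
    "1\t6\n"
    "2\t0\n"
)

# Lines starting with "#" are skipped as comments; column names are assumed.
df = pd.read_csv(
    sample,
    sep=r"\s+",
    comment="#",
    header=None,
    names=["FromNodeId", "ToNodeId"],
)

# Build the directed graph straight from the two columns (no weights).
G = nx.from_pandas_edgelist(
    df, source="FromNodeId", target="ToNodeId", create_using=nx.DiGraph
)

# Dense adjacency matrix; row i / column j follow the order of `nodes`.
nodes = sorted(G.nodes)
A = nx.to_numpy_array(G, nodelist=nodes, dtype=int)

print(nodes)
print(A)
```

Entry A[i, j] is 1 when there is an edge from nodes[i] to nodes[j], and 0 otherwise.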

You can then visualize the graph using matplotlib or any other library of your choice:

import matplotlib.pyplot as plt

pos = nx.spring_layout(G)

nx.draw_networkx_nodes(G, pos, node_size=500)
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_labels(G, pos)

plt.show()

This code will create a directed graph using the data in the Excel file and visualize it on the screen.
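Drawing works well for small graphs, but the California road network has on the order of two million nodes, so plotting all of it at once is unlikely to be readable. One possible approach, sketched here under the assumption that a small neighbourhood is enough, is to extract the subgraph around a single node with nx.ego_graph and pass that to the drawing calls above. The centre node 0 and radius 1 are arbitrary choices for illustration.

```python
import networkx as nx

# A few of the sample edges from the question.
edges = [
    (0, 1), (0, 2), (0, 469),
    (1, 0), (1, 6), (1, 385),
    (2, 0), (2, 3),
    (469, 0), (469, 380), (469, 37415),
]
G = nx.DiGraph(edges)

# Nodes reachable from node 0 within one step along outgoing edges,
# together with all edges of G that run between those nodes.
H = nx.ego_graph(G, 0, radius=1)

print(sorted(H.nodes))
print(sorted(H.edges))
```

H is an ordinary DiGraph, so it can be used in place of G in nx.spring_layout and the nx.draw_networkx_* calls.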

Answered By: Mojtaba Farmani