How to add new edges to the stellargraph dataset?

Question:

I need to add some extra edges to the Cora dataset using stellargraph. Is there any way to add edges to the current dataset in the stellargraph library?

import stellargraph as sg
dataset = sg.datasets.Cora()

For example in NetworkX, we can add some edges to the existing graph using add_edges_from(edgelist).

Asked By: csperson

||

Answers:

You can’t do it directly in stellargraph since version 0.9.
You’ll have to use .to_networkx() to convert the graph to NetworkX format, add your edges, and then convert it back to StellarGraph.

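from stellargraph import StellarGraph

# Cora() is only a dataset descriptor; load() returns the graph and the node labels
graph, node_subjects = dataset.load()

g = graph.to_networkx()  # node features end up under the "feature" attribute by default
# give new edges the same "label" (edge type) as the existing ones; assumed to be "cites" for Cora
g.add_edges_from(edgelist, label="cites")
# pass node_features so the features survive the round trip
new_graph = StellarGraph.from_networkx(g, node_features="feature")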
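
One thing to watch in this round trip: to_networkx() returns a NetworkX MultiGraph (or MultiDiGraph), so add_edges_from creates a parallel edge for any pair that is already connected, and it silently creates endpoints that do not exist yet (such nodes would carry no features). A minimal sketch of filtering the edge list first, using only NetworkX; the helper name add_new_edges is hypothetical and the small graph stands in for the converted dataset:

```python
import networkx as nx


def add_new_edges(g, edgelist, **attrs):
    """Add only edges whose endpoints exist and that are not already present.

    Extra keyword arguments are stored as edge attributes.
    Returns the list of edges that were actually added.
    """
    added = []
    for u, v in edgelist:
        if u not in g or v not in g:
            continue  # skip edges that would create feature-less nodes
        if g.has_edge(u, v):
            continue  # skip pairs that are already connected
        g.add_edge(u, v, **attrs)
        added.append((u, v))
    return added


# Stand-in for the MultiGraph produced by to_networkx()
g = nx.MultiGraph()
g.add_edges_from([(0, 1), (1, 2), (2, 3)])

added = add_new_edges(g, [(0, 1), (0, 2), (3, 99)])
```

Here only (0, 2) is added: (0, 1) already exists and node 99 is not in the graph. Whether skipping is the right policy depends on the use case; parallel edges may be exactly what is wanted.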
Answered By: Reine Baudache

I recently ran into a similar scenario where converting back and forth to NetworkX format was not possible. StellarGraph is supposed to be capable of storing much larger graphs than NetworkX, so beyond a certain graph size the conversion is simply not feasible.

To get around this, I used the numpy loading capabilities of StellarGraph 1.2.1.

With StellarGraph, you can dump the edge array with edge_arrays() into pandas, then concatenate any desired edges onto that. It is much lighter memory-wise, since pandas and StellarGraph both perform better than networkx.

Here is a short example:

import pandas as pd
from stellargraph import IndexedArray, StellarGraph

#### original data / graph

# node IDs; the source/target values of the edges must refer to these IDs
nodes = IndexedArray(index=['a', 'b', 'c', 'd'])
original_edges = pd.DataFrame(
    {
        'source' : ['a', 'b', 'c', 'd', 'a'],
        'target' : ['b', 'c', 'd', 'a', 'c']
    }
)
original_graph = StellarGraph(
    nodes,
    original_edges
)

#### new data

new_edges = pd.DataFrame(
    {
        'source' : ['d', 'd'],
        'target' : ['b', 'c']
    }
)

#### new graph

new_graph = StellarGraph(
    nodes, 
    pd.concat(
        [
            original_edges,
            new_edges
        ],
        ignore_index=True
    )
)
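
The example above starts from a DataFrame of edges. When only an existing StellarGraph is available (for instance the loaded Cora graph), the same idea can be sketched with edge_arrays(). The helper names below are hypothetical, and the StellarGraph calls assume the 1.2.x API (edge_arrays() returning sources, targets, types and weights, plus nodes(), node_features() and is_directed()) on a graph with a single node type; edge types and weights are ignored for brevity:

```python
import pandas as pd


def extend_edges(sources, targets, new_edges):
    """Combine existing source/target arrays with a DataFrame of new edges."""
    existing = pd.DataFrame({"source": sources, "target": targets})
    combined = pd.concat(
        [existing, new_edges[["source", "target"]]], ignore_index=True
    )
    # Drop exact (source, target) repeats; reversed pairs are NOT detected,
    # and any intentional parallel edges are removed as well.
    return combined.drop_duplicates(ignore_index=True)


def rebuild_with_edges(graph, new_edges):
    """Return a new StellarGraph with new_edges appended (assumed 1.2.x API)."""
    from stellargraph import IndexedArray, StellarGraph  # imported lazily

    sources, targets, _, _ = graph.edge_arrays()
    nodes = IndexedArray(graph.node_features(), index=graph.nodes())
    edges = extend_edges(sources, targets, new_edges)
    return StellarGraph(nodes, edges, is_directed=graph.is_directed())


# The edge-merging step on its own, without needing StellarGraph installed
edges = extend_edges(
    ["a", "b", "c"],
    ["b", "c", "d"],
    pd.DataFrame({"source": ["d", "a"], "target": ["a", "b"]}),
)
```

With a loaded graph, the call would look like new_graph = rebuild_with_edges(graph, new_edges). No NetworkX object is built at any point, which is the reason for taking this route.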
Answered By: Jaime Hernandez

No, but this is ridiculous… adding and removing nodes are operations that anyone working with graphs is expected to perform. NetworkX uses RAM extremely inefficiently…