Pickle only dumps one value in loop

Question:

I am trying to pickle certain data so I have an easier time retrieving it. My code looks like this:

import pickle

import networkx as nx
import pandas as pd
import numpy as np

import load_data as load


# load the graph
g = load.local_data()

for node in g.nodes():

    # get node degree
    pickle.dump(g.degree(node), open("./pickles/degree.pickle", "wb"))
    # get in-degree of node
    pickle.dump(g.in_degree(node), open("./pickles/indegrees.pickle", "wb"))
    # get out-degree of node
    pickle.dump(g.out_degree(node), open("./pickles/outdegrees.pickle", "wb"))
    # get clustering coefficients of node
    pickle.dump(nx.clustering(g, node), open("./pickles/clustering.pickle", "wb"))

When I print the same expressions instead of pickling them, I get the values for every node. However, when I open a pickled file, it contains only a single integer. Does anyone know why?

Asked By: luthien aerendell

||

Answers:

In the current code, every pass through the loop reopens each file with mode "wb", and opening a file in write mode truncates it. Each iteration therefore overwrites what the previous one stored, so the file ends up holding only the value for the last node.
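The overwriting can be reproduced without a graph. The sketch below is only an illustration: the `values` list is a hypothetical stand-in for the per-node results, and the file is written to a temporary directory rather than `./pickles/`.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for per-node values such as g.degree(node).
values = [3, 1, 4]

path = os.path.join(tempfile.mkdtemp(), "degree.pickle")

for value in values:
    # "wb" truncates the file every time it is opened,
    # so each dump replaces the previous one.
    with open(path, "wb") as f:
        pickle.dump(value, f)

with open(path, "rb") as f:
    stored = pickle.load(f)

print(stored)  # only the value written last is still in the file
```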

What you might want to do instead is open the file once, collect the values for all nodes, and dump them in a single call:

with open("./pickles/degree.pickle", "wb") as f:
    obj = [g.degree(node) for node in g]
    pickle.dump(obj, f)

...

Depending on your ultimate objective, it might be better to store the results as CSV or in another format that can be shared safely between computers, perhaps one of the formats that pandas supports.
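As a rough sketch of the CSV route, the example below uses the standard-library `csv` module rather than pandas so that it runs without extra dependencies. The `degrees` mapping is a hypothetical stand-in for the per-node results, and the output path is a temporary file chosen for illustration.

```python
import csv
import os
import tempfile

# Hypothetical node -> degree mapping, standing in for values
# computed from the graph.
degrees = {"a": 2, "b": 1, "c": 3}

path = os.path.join(tempfile.mkdtemp(), "degree.csv")

# Write one header row followed by one row per node.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["node", "degree"])
    writer.writerows(degrees.items())

# Read the rows back; csv returns every field as a string.
with open(path, newline="") as f:
    rows = list(csv.DictReader(f))

print(rows)
```

Note that CSV does not preserve types, so numeric columns need to be converted back after reading.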

Edit: to store a dictionary that maps each node to its degree, use a dictionary comprehension:

with open("./pickles/degree.pickle", "wb") as f:
    obj = {node: g.degree(node) for node in g}
    pickle.dump(obj, f)
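
To check that the whole dictionary was stored, it can be loaded back with `pickle.load`. This is a minimal round-trip sketch: the `degrees` mapping is a hypothetical stand-in for the node-to-degree dictionary, and a temporary file is used in place of `./pickles/degree.pickle`.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a node -> degree dictionary.
degrees = {"a": 2, "b": 1, "c": 3}

path = os.path.join(tempfile.mkdtemp(), "degree.pickle")

# Open the file once and dump the whole dictionary in one call.
with open(path, "wb") as f:
    pickle.dump(degrees, f)

# Read it back in binary mode; the result is an ordinary dict again.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded == degrees)
```

Only unpickle files you created yourself or otherwise trust, since loading a pickle can execute arbitrary code.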
Answered By: SultanOrazbayev