Retrieve or store results of Gremlin queries within AWS Neptune ML, in IPython Notebook

Question:

I have a graph database stored on AWS Neptune, which I need to query with Gremlin from a Jupyter (IPython) notebook. I am using the graph neural network features offered by Neptune ML for a link prediction task. Specifically, I want to predict which nodes of "TYPE_X" are related to the ones saved in my variable "id_variable".

My query looks like this:

%%gremlin
g.with("Neptune#ml.endpoint","${endpoint}").
    V(${id_variable}).
    project('name', 'related to').
        by('name').
        by( out('RELATED_TO').with("Neptune#ml.prediction").
            hasLabel('TYPE_X').values('name') ).
    order(local).by(keys, desc)

which returns the following output:

{'name': 'AANAT', 'related to': 'WDR7'}
{'name': 'ACACA', 'related to': 'BTN1A1'}
{'name': 'ACTA1', 'related to': 'MDH'}
{'name': 'ALAS1', 'related to': 'WDR7'}
{'name': 'ALAS2', 'related to': 'TAC3'}
{'name': 'ALDH2', 'related to': 'SOCS2'}
{'name': 'ALDOA', 'related to': 'PRKAB2'}
{'name': 'AKR1B1', 'related to': 'ODF2L'}
{'name': 'ALOX15', 'related to': 'BMP15'}

My problem is that this output is only displayed in the output of the notebook cell; I would like either to assign it to a variable or to store it in a file, for instance as JSON. However, I cannot do variable assignment with the %%gremlin cell magic, and so far I have not found any way to write the output to a file.

Please note that I was not able to run this query in a normal .py script by means of the gremlin_python library, as it does not seem to support the ML functionalities of Neptune (specifically, it throws an error on the .with("Neptune#ml.endpoint","${endpoint}") syntax).

Any suggestion is more than welcome!

Thank you in advance.

Asked By: Luca P.

||

Answers:

Have you tried the --store-to (or -s) parameter? It specifies the name of a variable in which to store the query results.

%%gremlin --store-to results
g.with("Neptune#ml.endpoint","${endpoint}").
    V(${id_variable}).
    project('name', 'related to').
        by('name').
        by( out('RELATED_TO').with("Neptune#ml.prediction").
            hasLabel('TYPE_X').values('name') ).
    order(local).by(keys, desc)

Then check the results variable in the next cell.
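
Since the question also asks about saving to a file, here is a minimal sketch of writing the stored variable out as JSON with the standard library. It assumes the variable holds a list of dicts shaped like the output shown in the question; the helper name save_results_as_json, the file name, and the sample data are illustrative, not part of the notebook tooling. In the notebook you would pass the actual results variable instead of sample_results.

```python
import json
import tempfile
from pathlib import Path


def save_results_as_json(results, path):
    """Write query results to `path` as JSON and return the path.

    `default=str` is a fallback that stringifies any value the json
    module cannot serialize directly (an assumption about what the
    stored results might contain).
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as fh:
        json.dump(results, fh, indent=2, default=str)
    return path


# Stand-in for the variable populated by `%%gremlin --store-to results`;
# the rows are copied from the output shown in the question.
sample_results = [
    {"name": "AANAT", "related to": "WDR7"},
    {"name": "ACACA", "related to": "BTN1A1"},
]

# Written to a temporary directory here; any writable path works.
out_path = save_results_as_json(
    sample_results, Path(tempfile.gettempdir()) / "gremlin_results.json"
)

# Read the file back to confirm the round trip.
reloaded = json.loads(out_path.read_text(encoding="utf-8"))
```

If the stored variable turns out to have a different shape, inspecting it with type(results) in the next cell before saving is a reasonable first step.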

Answered By: Rohit Kumar