Printing the response of a RethinkDB query in a reasonable way

Question:

I am participating in the Yelp Dataset Challenge and I’m using RethinkDB to store the JSON documents for each of the different datasets.

I have the following script:

import rethinkdb as r
import json, os

RDB_HOST = os.environ.get('RDB_HOST') or 'localhost'
RDB_PORT = int(os.environ.get('RDB_PORT') or 28015)
DB = 'test'

connection = r.connect(host=RDB_HOST, port=RDB_PORT, db=DB)

query = r.table('yelp_user').filter({"name":"Arthur"}).run(connection)
print(query)

But when I run it at the terminal in a virtualenv I get this as an example response:

<rethinkdb.net.DefaultCursor object at 0x102c22250> (streaming):
[{'yelping_since': '2014-03', 'votes': {'cool': 1, 'useful': 2, 'funny': 1}, 'review_count': 5, 'id': '08eb0b0d-2633-4ec4-93fe-817a496d4b52', 'user_id': 'ZuDUSyT4bE6sx-1MzYd2Kg', 'compliments': {}, 'friends': [], 'average_stars': 5, 'type': 'user', 'elite': [], 'name': 'Arthur', 'fans': 0}, ...]

I know I can use pprint to pretty-print the output, but the bigger issue, which I don’t know how to resolve, is printing the results in full rather than having the output cut off with “…” at the end.

Any suggestions?

Asked By: Arthur Collé

Answers:

run() returns an iterable cursor, and printing the cursor itself only shows a truncated preview of its contents. Iterate over it to get all the rows:

query = r.table('yelp_user').filter({"name":"Arthur"})
for row in query.run(connection):
    print(row)
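
If each row should also be readable on its own, one option is to format it as indented JSON while iterating. The following is only a sketch: format_rows and its limit parameter are hypothetical names that are not part of the RethinkDB driver, and a plain iterator stands in for the cursor so the snippet runs without a database. The default=str argument is there on the assumption that some rows may contain values such as datetime objects that json cannot serialize directly.

```python
import json
from itertools import islice


def format_rows(rows, limit=None):
    """Return one pretty-printed JSON string per row.

    `rows` is any iterable of dicts, e.g. the cursor returned by run().
    `limit` (hypothetical parameter) caps how many rows are consumed.
    """
    if limit is not None:
        rows = islice(rows, limit)
    # default=str falls back to str() for values json cannot serialize
    return [json.dumps(row, indent=2, sort_keys=True, default=str) for row in rows]


# Stand-in for query.run(connection); an iterator mimics a cursor
sample_cursor = iter([
    {"name": "Arthur", "review_count": 5, "votes": {"cool": 1, "useful": 2, "funny": 1}},
    {"name": "Arthur", "review_count": 12, "votes": {"cool": 0, "useful": 3, "funny": 0}},
])

for text in format_rows(sample_cursor, limit=10):
    print(text)
```

With a real connection, format_rows(query.run(connection), limit=10) should behave the same way, since the function only relies on the cursor being iterable.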
Answered By: Etienne Laurin

Another way is to convert the rethinkdb.net.DefaultCursor (or Cursor) into a pandas DataFrame.

As shown in the documentation (https://rethinkdb.com/api/python/to_array), the cursor can be converted into a list and then into a DataFrame by simply calling:

import pandas as pd
pd.DataFrame(list(r.db('YOUR-DB').table('YOUR-TABLE').run(connection)))

Although this goes against some of the NoSQL logic, since pandas is based on structured data, it is still a good way to visualize the data.
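
Documents like the one in the question contain nested fields such as votes, which end up as whole dicts inside a single DataFrame column. A possible refinement, sketched here with a hard-coded list standing in for list(cursor) and with illustrative values, is to flatten them with pd.json_normalize and print with to_string() so that pandas does not abbreviate the table.

```python
import pandas as pd

# Stand-in for list(r.table('yelp_user').run(connection)); values are illustrative
rows = [
    {"name": "Arthur", "review_count": 5, "votes": {"cool": 1, "useful": 2, "funny": 1}},
    {"name": "Arthur", "review_count": 12, "votes": {"cool": 0, "useful": 3, "funny": 0}},
]

# json_normalize expands nested dicts into dotted column names such as "votes.cool"
df = pd.json_normalize(rows)

# to_string() renders every row and column instead of pandas' truncated default view
print(df.to_string(index=False))
```

This loads all rows into memory at once, so it is probably best suited to filtered queries or modest table sizes.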

Answered By: guilistocco