Can Memgraph use disk as a storage?

Question:

Memgraph is an in-memory graph database. I have a large database with nodes and relationships with metadata that doesn’t need to be used in any graph algorithms that need to be carried out in Memgraph. Is there a way to store that data on the disk rather than in memory?

Asked By: KWriter

Answers:

Memgraph stores its data only in memory, but the GQLAlchemy library provides an on-disk storage solution for large properties that are not used in graph algorithms.

First, import everything you need, connect to the running Memgraph instance, and attach an SQLite database to it:

from gqlalchemy import Memgraph, SQLitePropertyDatabase, Node, Field
from typing import Optional

graphdb = Memgraph()
SQLitePropertyDatabase('path-to-my-db.db', graphdb)

Here graphdb holds the connection to the in-memory graph database, and SQLitePropertyDatabase attaches itself to graphdb in its constructor.

Next, you need to define a schema.

For example, you can create the class User, which maps to a node in the graph database.

class User(Node):
    id: int = Field(unique=True, exists=True, index=True, db=graphdb)
    huge_string: Optional[str] = Field(on_disk=True)

Here the property id is a required int with uniqueness and existence constraints created inside Memgraph. The id property is also indexed on the label User. The huge_string property is optional, and because the on_disk argument is set to True, it is saved into the SQLite database instead of Memgraph.

In the last step, you can create a huge string that is saved into the SQLite database rather than the graph database, and then load it back.

my_secret = "I LOVE DUCKS" * 1000
john = User(id=5, huge_string=my_secret).save(graphdb)
john2 = User(id=5).load(graphdb)
print(john2.huge_string)  # prints "I LOVE DUCKS" repeated 1000 times
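
To get a feel for what the save and load calls above do with an on-disk property, here is a minimal sketch of the idea using only Python's built-in sqlite3 module. It is not GQLAlchemy's actual implementation: the OnDiskPropertyStore class, the node_properties table, and its column names are all hypothetical and chosen just for illustration. The small properties stay in an in-memory object, while the large value is keyed by node id and property name in SQLite.

```python
import sqlite3
from typing import Optional


class OnDiskPropertyStore:
    """Hypothetical helper (not part of GQLAlchemy) that keeps large
    node properties in SQLite, keyed by node id and property name."""

    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        # Table and column names are assumptions made for this sketch.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS node_properties ("
            "node_id INTEGER NOT NULL, "
            "name TEXT NOT NULL, "
            "value TEXT, "
            "PRIMARY KEY (node_id, name))"
        )

    def save(self, node_id: int, name: str, value: str) -> None:
        # The connection context manager commits on success.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO node_properties VALUES (?, ?, ?)",
                (node_id, name, value),
            )

    def load(self, node_id: int, name: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM node_properties WHERE node_id = ? AND name = ?",
            (node_id, name),
        ).fetchone()
        return row[0] if row else None


# ":memory:" keeps the sketch self-contained; pass a file path to persist to disk.
store = OnDiskPropertyStore(":memory:")

# Only the small property lives in the "graph" side of this sketch.
in_memory_node = {"id": 5}
store.save(in_memory_node["id"], "huge_string", "I LOVE DUCKS" * 1000)

restored = store.load(5, "huge_string")
print(len(restored))  # 12000
```

The point of the split is that graph algorithms only ever touch in_memory_node, so the large value costs no RAM until it is explicitly loaded.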

More detail can be found in the blog post "Using on disk storage with an in-memory Graph Database".

Answered By: KWriter