Why is SQLite faster than Redis in this simple benchmark?

Question:

I ran a simple performance test on my local machine. This is the Python script:

import redis
import sqlite3
import time

data = {}
N = 100000

for i in xrange(N):
    key = "key-"+str(i)
    value = "value-"+str(i)
    data[key] = value

r = redis.Redis("localhost", db=1)
s = sqlite3.connect("testDB")
cs = s.cursor()

try:
    cs.execute("CREATE TABLE testTable(key VARCHAR(256), value TEXT)")
except Exception as excp:
    # Table already exists from a previous run; recreate it.
    print str(excp)
    cs.execute("DROP TABLE testTable")
    cs.execute("CREATE TABLE testTable(key VARCHAR(256), value TEXT)")

print "[---Testing SQLITE---]"
sts = time.time()
for key in data:
    cs.execute("INSERT INTO testTable VALUES(?,?)", (key, data[key]))
    #s.commit()
s.commit()
ste = time.time()
print "[Total time of sql: %s]"%str(ste-sts)

print "[---Testing REDIS---]"
rts = time.time()
r.flushdb()  # start from an empty db
for key in data:
    r.set(key, data[key])
rte = time.time()
print "[Total time of redis: %s]"%str(rte-rts)

I expected Redis to be faster, but the results show that it is much slower:

[---Testing SQLITE---]
[Total time of sql: 0.615846157074]
[---Testing REDIS---]
[Total time of redis: 10.9668009281]

So Redis is memory-based, but what about SQLite? Why is Redis so slow? When should I use Redis, and when should I use SQLite?

Asked By: torayeff


Answers:

From the Redis documentation:

Redis is a server: all commands involve network or IPC roundtrips. It is meaningless to compare it to embedded data stores such as SQLite, Berkeley DB, Tokyo/Kyoto Cabinet, etc … because the cost of most operations is precisely dominated by network/protocol management.

That does make sense, though it is an acknowledgement that Redis is slower in certain cases. Redis might perform far better than SQLite under heavy parallel access, for instance.

The right tool for the right job: sometimes it will be Redis, other times SQLite, other times something totally different. If this speed test is a realistic picture of what your app will do, then SQLite will serve you better, and it's good that you ran this benchmark.

Answered By: Harald Brinkhof

SQLite is very fast, and your test only requires one IO action (at the commit). Redis performs significantly more IO, since every command crosses the network. A more apples-to-apples comparison would involve a relational database accessed over a network (like MySQL or PostgreSQL).

You should also keep in mind that SQLite has been around for a long time and is very highly optimized. It’s limited by ACID compliance, but you can actually turn that off (as some NoSQL solutions do), and get it even faster.
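To see that effect, a couple of SQLite pragmas relax the durability guarantees. This is only a sketch (it reuses the cs cursor from the question and trades crash-safety for speed), so treat it as illustrative:

# Relax SQLite's durability guarantees before running the inserts.
# WARNING (sketch): a crash mid-run can corrupt or lose data.
cs.execute("PRAGMA synchronous = OFF")      # don't fsync on every commit
cs.execute("PRAGMA journal_mode = MEMORY")  # keep the rollback journal in RAM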

Answered By: Brendan Long

Just noticed that you did not pipeline the commands for Redis. Using a pipeline, the time drops:

[---Testing SQLITE---]
[Total time of sql: 0.669369935989]
[---Testing REDIS---]
[Total time of redis: 2.39369487762]
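For reference, the pipelined version of the Redis loop might look like this (a minimal sketch using redis-py's pipeline(), which queues commands client-side and sends them in batches):

# Queue all SET commands client-side and send them in one batch,
# avoiding a network round-trip per key.
pipe = r.pipeline()
for key in data:
    pipe.set(key, data[key])
pipe.execute()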

Answered By: basti

The current answers explain why Redis loses this particular benchmark: the network overhead generated by every command executed against the server. However, none of them refactors the benchmark code to let Redis perform well.

The problem with your code lies here:

for key in data:
    r.set(key, data[key])

You incur 100,000 round-trips to the Redis server, resulting in significant I/O overhead.

This is unnecessary, as Redis provides batch-style variants of certain commands. For SET there is MSET, so you can refactor the loop above to:

r.mset(data)

That takes you from 100,000 server trips down to one. You simply pass the Python dictionary as a single argument, and Redis applies the whole update atomically on the server.

This makes all the difference in your particular benchmark; you should see Redis perform at least on par with SQLite.
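Plugged into the original (Python 2) script, the Redis section of the benchmark might look like this:

print "[---Testing REDIS---]"
rts = time.time()
r.flushdb()  # start from an empty db
r.mset(data)  # one round-trip: all 100,000 pairs in a single command
rte = time.time()
print "[Total time of redis: %s]" % str(rte - rts)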

Answered By: user2014979