Django: save() vs update() to update the database?

Question:

I’m writing a Django app, and I need a function to update a field in the database. Is there any reason to do one of these methods rather than the other?

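def save_db_field(name, field, value):
    obj = MyModel.objects.get(name=name)
    setattr(obj, field, value)  # set the attribute whose name is in `field`
    obj.save()

def update_db_field(name, field, value):
    # update() is a QuerySet method, so use filter() rather than get()
    MyModel.objects.filter(name=name).update(**{field: value})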

It seems like the second is better because it does it in one DB call instead of two. Is there a reason why fetching, then updating is any better?

Asked By: zoidberg

Answers:

save() can both insert a new record and update an existing one. It is generally used to save a single model instance (a single row, in MySQL for example) to the database.

update() cannot insert records. It only updates existing ones, and it can update multiple records (rows) in one call.

Answered By: Roy

update() only works on querysets. If you want to update several fields of a single object instance at once, say from a dict, you can do something like:

obj.__dict__.update(your_dict)
obj.save()

Bear in mind that the dictionary must contain the correct mapping: the keys must be your field names, and the values must be the values you want to store.
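
A slightly more defensive variant of the same idea is to assign through setattr(), which goes through properties and descriptors instead of writing into __dict__ directly, and to reject unknown keys. The sketch below uses a plain Python class as a stand-in for a model instance; Profile and apply_values are hypothetical names, not part of Django.

```python
class Profile:
    """Plain-Python stand-in for a model instance (hypothetical)."""

    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.saved = False

    def save(self):
        # A real model would write to the database here.
        self.saved = True


def apply_values(obj, values, allowed):
    """Set only whitelisted attributes on obj and return the names changed."""
    unknown = set(values) - set(allowed)
    if unknown:
        raise KeyError("unknown fields: %s" % ", ".join(sorted(unknown)))
    for field, value in values.items():
        setattr(obj, field, value)
    return sorted(values)


profile = Profile(name="old", email="old@example.com")
changed = apply_values(profile, {"name": "new"}, allowed=("name", "email"))
profile.save()
```

With a real Django model, the returned list could be passed along as save(update_fields=changed) so that only those columns are written.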

Answered By: chaos

Both look similar, but there are some key points:

  1. save() will trigger any overridden Model.save() method, but update() will not; it performs the update directly at the database level. So if you have models with overridden save() methods, you must either avoid update() or find another way to do whatever those overridden save() methods do.

  2. obj.save() may have side effects if you are not careful. You retrieve the object with get(...), and all model field values are loaded into obj. When you call obj.save(), Django writes the current object state to the record. So if another process changes the record between get() and save(), those changes will be lost. Use save(update_fields=[...]) to avoid such problems.

  3. In Django 1.5 and earlier, save() executed a SELECT before the INSERT or UPDATE, so it cost two queries. Since Django 1.6, save() tries the UPDATE directly when the instance already has a primary key.
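
The lost-update problem from point 2 can be reproduced at the SQL level. The sketch below uses Python's built-in sqlite3 module rather than Django's ORM, and the person table with its visits column is made up for illustration. It contrasts an UPDATE that writes every column back from a stale copy (what a plain save() does) with an UPDATE that touches only the changed column (what save(update_fields=[...]) or update() does).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, visits INTEGER)")
conn.execute("INSERT INTO person (id, name, visits) VALUES (1, 'Ann', 0)")

# Process A loads the whole row, like get().
stale = conn.execute("SELECT name, visits FROM person WHERE id = 1").fetchone()

# Meanwhile, process B increments the counter.
conn.execute("UPDATE person SET visits = visits + 1 WHERE id = 1")

# A full save writes every column back from A's stale copy,
# so B's increment is overwritten.
conn.execute(
    "UPDATE person SET name = ?, visits = ? WHERE id = 1", ("Tom", stale[1])
)
after_full_save = conn.execute(
    "SELECT name, visits FROM person WHERE id = 1"
).fetchone()

# B increments again, and this time A writes only the column it changed.
conn.execute("UPDATE person SET visits = visits + 1 WHERE id = 1")
conn.execute("UPDATE person SET name = ? WHERE id = 1", ("Tim",))
after_partial_save = conn.execute(
    "SELECT name, visits FROM person WHERE id = 1"
).fetchone()
```

After the full write the counter is back at its stale value; after the single-column write, B's increment survives.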

Here is a good guide on the save() and update() methods and how they are executed.

Answered By: FallenAngel

There are several key differences.

update() is called on a queryset, so it can update multiple objects at once.

As @FallenAngel pointed out, there are differences in how a custom save() method is triggered, but it is also important to keep signals and model managers in mind. I have built a small testing app to show some valuable differences. I am using Python 2.7.5, Django==1.7.7 and SQLite; note that the final SQL may vary across Django versions and database engines.

OK, here's the example code.

models.py:

from __future__ import print_function
from django.db import models
from django.db.models import signals
from django.db.models.signals import pre_save, post_save
from django.dispatch import receiver

__author__ = 'sobolevn'

class CustomManager(models.Manager):
    def get_queryset(self):
        super_query = super(CustomManager, self).get_queryset()
        print('Manager is called', super_query)
        return super_query


class ExtraObject(models.Model):
    name = models.CharField(max_length=30)

    def __unicode__(self):
        return self.name


class TestModel(models.Model):

    name = models.CharField(max_length=30)
    key = models.ForeignKey('ExtraObject')
    many = models.ManyToManyField('ExtraObject', related_name='extras')

    objects = CustomManager()

    def save(self, *args, **kwargs):
        print('save() is called.')
        super(TestModel, self).save(*args, **kwargs)

    def __unicode__(self):
        # Never do such things (access through a foreign key) in real life,
        # because it hits the database.
        return u'{} {} {}'.format(self.name, self.key.name, self.many.count())


@receiver(pre_save, sender=TestModel)
@receiver(post_save, sender=TestModel)
def signal_handler(*args, **kwargs):
    print('signal dispatched')

views.py:

def index(request):
    if request and request.method == 'GET':

        from .models import ExtraObject, TestModel

        # Create example data if the table is empty:
        if TestModel.objects.count() == 0:
            for i in range(15):
                extra = ExtraObject.objects.create(name=str(i))
                test = TestModel.objects.create(key=extra, name='test_%d' % i)
                test.many.add(extra)  # `many` relates to ExtraObject
                print(test)

        to_edit = TestModel.objects.get(id=1)
        to_edit.name = 'edited_test'
        to_edit.key = ExtraObject.objects.create(name='new_for')
        to_edit.save()

        new_key = ExtraObject.objects.create(name='new_for_update')
        to_update = TestModel.objects.filter(id=2).update(name='updated_name', key=new_key)
        # return any kind of HttpResponse

That resulted in these SQL queries:

# to_edit = TestModel.objects.get(id=1):
QUERY = u'SELECT "main_testmodel"."id", "main_testmodel"."name", "main_testmodel"."key_id" 
FROM "main_testmodel" 
WHERE "main_testmodel"."id" = %s LIMIT 21' 
- PARAMS = (u'1',)

# to_edit.save():
QUERY = u'UPDATE "main_testmodel" SET "name" = %s, "key_id" = %s 
WHERE "main_testmodel"."id" = %s' 
- PARAMS = (u"'edited_test'", u'2', u'1')

# to_update = TestModel.objects.filter(id=2).update(name='updated_name', key=new_key):
QUERY = u'UPDATE "main_testmodel" SET "name" = %s, "key_id" = %s 
WHERE "main_testmodel"."id" = %s' 
- PARAMS = (u"'updated_name'", u'3', u'2')

We have just one query for update() and two for save().
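
The same count can be reproduced outside Django. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module with a made-up testmodel table and records every statement SQLite runs through set_trace_callback(), first for the fetch-then-save pattern and then for the direct update pattern.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so no implicit
# BEGIN statements show up in the trace.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE testmodel (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO testmodel (id, name) VALUES (?, ?)",
    [(1, "test_0"), (2, "test_1")],
)

statements = []
conn.set_trace_callback(statements.append)  # record each executed statement

# Fetch-then-save pattern: one SELECT to load the row, one UPDATE to write it.
row = conn.execute("SELECT id, name FROM testmodel WHERE id = ?", (1,)).fetchone()
conn.execute("UPDATE testmodel SET name = ? WHERE id = ?", ("edited_test", row[0]))
fetch_then_save = len(statements)

# Direct update pattern: a single UPDATE, nothing is loaded into Python.
statements.clear()
conn.execute("UPDATE testmodel SET name = ? WHERE id = ?", ("updated_name", 2))
update_only = len(statements)

print(fetch_then_save, update_only)
```

The first counter covers the SELECT plus the UPDATE, and the second covers the lone UPDATE.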

Next, let's talk about overriding the save() method. Obviously, it is called exactly once per save() call. It is worth mentioning that .objects.create() also calls save().

But update() does not call save() on models. And since save() is never called for update(), the signals are not triggered either. Output:

Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

# TestModel.objects.get(id=1):
Manager is called [<TestModel: edited_test new_for 0>]
Manager is called [<TestModel: edited_test new_for 0>]
save() is called.
signal dispatched
signal dispatched

# to_update = TestModel.objects.filter(id=2).update(name='updated_name', key=new_key):
Manager is called [<TestModel: edited_test new_for 0>]

As you can see, the save() path triggers the Manager's get_queryset() twice, while update() triggers it only once.

Resolution: if you need to "silently" update your values, without save() being called, use update(). A typical use case is a user's last_seen field. When you need to update your model properly, use save().
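
The behaviour described above can be imitated in a few lines of plain Python. This is a toy model, not Django: ToyModel, its table dict and bulk_update() are hypothetical names, chosen only to show that a per-object save() runs its hooks while a bulk write that goes straight to storage does not.

```python
class ToyModel:
    """Toy stand-in for a model whose save() fires pre/post hooks."""

    table = {}  # pk -> dict of column values (plays the role of the database)
    log = []    # records which hooks ran

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name

    def save(self):
        ToyModel.log.append("pre_save")
        ToyModel.table[self.pk] = {"name": self.name}
        ToyModel.log.append("post_save")


def bulk_update(pks, **values):
    """Write straight to the table: no save(), no hooks.

    Returns the number of rows that were changed.
    """
    changed = 0
    for pk in pks:
        if pk in ToyModel.table:
            ToyModel.table[pk].update(values)
            changed += 1
    return changed


ToyModel(1, "first").save()
ToyModel(2, "second").save()
hooks_after_saves = list(ToyModel.log)

ToyModel.log.clear()
rows = bulk_update([1, 2, 3], name="renamed")
hooks_after_update = list(ToyModel.log)
```

Each save() leaves a pre_save/post_save pair in the log, whereas the bulk write changes the stored rows and leaves the log empty.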

Answered By: sobolevn

Update will give you better performance with a queryset of more than one object, as it will make one database call per queryset.

However, save() is useful because it is easy to override the save method in your model and add extra logic there. In my own application, for example, I update dates when other fields are changed.

import datetime

from django.db import models


class myModel(models.Model):
    name = models.CharField(max_length=30)  # max_length value is arbitrary
    date_created = models.DateField()

    def save(self, *args, **kwargs):
        if not self.pk:
            # Newly created object: the database id is not set yet.
            self.date_created = datetime.date.today()
        super(myModel, self).save(*args, **kwargs)

Answered By: wobbily_col

Using update directly is more efficient and could also prevent integrity problems.

From the official documentation https://docs.djangoproject.com/en/3.0/ref/models/querysets/#django.db.models.query.QuerySet.update

If you’re just updating a record and don’t need to do anything with
the model object, the most efficient approach is to call update(),
rather than loading the model object into memory. For example, instead
of doing this:

e = Entry.objects.get(id=10)
e.comments_on = False
e.save()

…do this:

Entry.objects.filter(id=10).update(comments_on=False)

Using update() also prevents a race condition wherein something might
change in your database in the short period of time between loading
the object and calling save().

Answered By: villamejia

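Use _state.adding to differentiate an update from a create; see https://docs.djangoproject.com/en/3.2/ref/models/instances/. Note that _loaded_values is not a built-in attribute: in the documentation example it is set by a custom from_db() classmethod on the model.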

def save(self, *args, **kwargs):
    # Check how the current values differ from ._loaded_values. For example,
    # prevent changing the creator_id of the model. (This example doesn't
    # support cases where 'creator_id' is deferred).
    if not self._state.adding and (
            self.creator_id != self._loaded_values['creator_id']):
        raise ValueError("Updating the value of creator isn't allowed")
    super().save(*args, **kwargs)

Answered By: David

save():

  • can be used with a model object, but not with QuerySet or Manager objects.
  • combined with select_for_update(), runs a SELECT FOR UPDATE query.

update():

  • can be used with QuerySet or Manager objects, but not with a model object.
  • combined with select_for_update(), does not run a SELECT FOR UPDATE query.

For example, I have a Person model as shown below:

# "store/models.py"

from django.db import models

class Person(models.Model):
    name = models.CharField(max_length=30)

    def __str__(self):
        return self.name

Then, you can use save() with Person model object as shown below:

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects.get(id=1)
    person.name = 'Tom'
    person.save() # Here

    return HttpResponse("Test")

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects.filter(pk=1).first()
    person.name = 'Tom'
    person.save() # Here

    return HttpResponse("Test")

But, you cannot use save() with QuerySet object as shown below:

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects.filter(pk=1)
    person.name = 'Tom'
    person.save() # Here

    return HttpResponse("Test")

Then, the error below occurs:

AttributeError: 'QuerySet' object has no attribute 'save'

And, you cannot use save() with Manager object as shown below:

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects
    person.name = 'Tom'
    person.save() # Here

    return HttpResponse("Test")

Then, the error below occurs:

AttributeError: 'Manager' object has no attribute 'save'

Then, you can use update() with QuerySet object as shown below:

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects.filter(pk=1)
    person.update(name="Tom") # Here

    return HttpResponse("Test")

And, you can use update() with Manager object as shown below:

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects
    person.update(name="Tom") # Here

    return HttpResponse("Test")

But, you cannot use update() with Person model object as shown below:

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects.get(id=1)
    person.update(name="Tom") # Here

    return HttpResponse("Test")

# "store/views.py"

from .models import Person
from django.http import HttpResponse

def test(request):
    person = Person.objects.filter(pk=1).first()    
    person.update(name="Tom") # Here

    return HttpResponse("Test")

Then, the error below occurs:

AttributeError: 'Person' object has no attribute 'update'

Next, select_for_update() is used to prevent race conditions (lost update or write skew) when updating data in Django.

I have a test view that uses save() and select_for_update().filter(pk=1).first() as shown below:

# "store/views.py"

from django.db import transaction
from .models import Person
from django.http import HttpResponse

@transaction.atomic
def test(request):
    person = Person.objects.select_for_update().filter(pk=1).first() # Here
    person.name = 'Tom'
    person.save() # Here

    return HttpResponse("Test")

Then, when I run the test view, both a SELECT FOR UPDATE query and an UPDATE query are run. *I used PostgreSQL and checked this in its query logs.

Now, I remove first() to use update() as shown below:

# "store/views.py"

from django.db import transaction
from .models import Person
from django.http import HttpResponse

@transaction.atomic
def test(request):
    person = Person.objects.select_for_update().filter(pk=1) # Here
    person.update(name="Tom") # Here

    return HttpResponse("Test")

Then, when I run the test view, the SELECT FOR UPDATE query is not run and only the UPDATE query is run.

And, I have test view with save() and select_for_update().get(pk=1) as shown below:

# "store/views.py"

from django.db import transaction
from .models import Person
from django.http import HttpResponse

@transaction.atomic
def test(request):
    person = Person.objects.select_for_update().get(pk=1) # Here
    person.name = 'Tom'
    person.save() # Here

    return HttpResponse("Test")

Then, when I run the test view, both a SELECT FOR UPDATE query and an UPDATE query are run.

Now, I remove get() to use update() as shown below:

# "store/views.py"

from django.db import transaction
from .models import Person
from django.http import HttpResponse

@transaction.atomic
def test(request):
    person = Person.objects.select_for_update() # Here
    person.update(name="Tom") # Here

    return HttpResponse("Test")

Then, when I run the test view, the SELECT FOR UPDATE query is not run and only the UPDATE query is run.

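So, save() with select_for_update() runs a SELECT FOR UPDATE query, while update() with select_for_update() does not: the lock is only taken when the queryset is evaluated, and update() issues a bare UPDATE without evaluating it.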

Answered By: Kai – Kazuya Ito

One difference that can cause a lot of headaches is that save() updates columns of type DateTimeField(auto_now=True) or ModificationDateTimeField, but update() does NOT. These fields are supposed to set their date automatically whenever an object is saved to the database.
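
One possible workaround is to set the timestamp column yourself in the same UPDATE. The sketch below shows the idea with Python's built-in sqlite3 module instead of Django; the person table, its modified column and the rename_and_touch() helper are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def rename_and_touch(conn, pk, new_name, now=None):
    """Bulk-style UPDATE that also refreshes the 'modified' column by hand."""
    now = now or datetime.now(timezone.utc)
    cursor = conn.execute(
        "UPDATE person SET name = ?, modified = ? WHERE id = ?",
        (new_name, now.isoformat(), pk),
    )
    return cursor.rowcount  # number of rows changed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, modified TEXT)"
)
conn.execute(
    "INSERT INTO person (id, name, modified) "
    "VALUES (1, 'Ann', '2020-01-01T00:00:00+00:00')"
)

# A fixed timestamp keeps the example deterministic.
stamp = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
updated_rows = rename_and_touch(conn, 1, "Tom", now=stamp)
row = conn.execute("SELECT name, modified FROM person WHERE id = 1").fetchone()
```

In Django terms, the equivalent would be to pass the timestamp field explicitly, for example update(name=..., modified=timezone.now()), where modified is an assumed field name.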

Answered By: Sinisa Rudan