What is the best approach to change primary keys in an existing Django app?

Question:

I have an application which is in BETA mode. The model of this app has some classes with an explicit primary_key. As a consequence, Django uses those fields as keys and doesn’t create an id automatically.

class Something(models.Model):
    name = models.CharField(max_length=64, primary_key=True)

I think this was a bad idea (see “unicode error when saving an object in django admin”) and I would like to go back to having an id for every class of my model.

class Something(models.Model):
    name = models.CharField(max_length=64, db_index=True)

I’ve made the changes to my model (replacing every primary_key=True with db_index=True) and I want to migrate the database with South.

Unfortunately, the migration fails with the following message:
ValueError: You cannot add a null=False column without a default value.

I am evaluating the different workarounds for this problem. Any suggestions?

Thanks for your help

Asked By: luc


Answers:

Currently you are failing because you are adding a pk column that breaks the NOT NULL and UNIQUE requirements.

You should split the migration into several steps, separating schema migrations and data migrations:

  • add the new column, indexed but not primary key, with a default value (ddl migration)
  • migrate the data: fill the new column with the correct value (data migration)
  • mark the new column primary key, and remove the former pk column if it has become unnecessary (ddl migration)
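
The three steps above can be sketched outside of South with plain SQL. This is only an illustration using the standard-library sqlite3 module, not the South API: the table name something and the temporary column new_id are assumptions, and because SQLite cannot change a primary key in place, the last step rebuilds the table instead of altering it.

```python
import sqlite3


def migrate_to_surrogate_key(conn):
    cur = conn.cursor()

    # Step 1 (schema): add the new column, nullable so existing rows stay valid.
    cur.execute("ALTER TABLE something ADD COLUMN new_id INTEGER")

    # Step 2 (data): fill the new column with unique, non-null values.
    names = [row[0] for row in cur.execute("SELECT name FROM something ORDER BY name")]
    for new_id, name in enumerate(names, start=1):
        cur.execute("UPDATE something SET new_id = ? WHERE name = ?", (new_id, name))

    # Step 3 (schema): promote the new column to primary key.
    # SQLite has no "ALTER TABLE ... ADD PRIMARY KEY", so the table is rebuilt.
    cur.execute(
        "CREATE TABLE something_new ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name VARCHAR(64) NOT NULL)"
    )
    cur.execute("INSERT INTO something_new (id, name) SELECT new_id, name FROM something")
    cur.execute("DROP TABLE something")
    cur.execute("ALTER TABLE something_new RENAME TO something")
    cur.execute("CREATE INDEX something_name_idx ON something (name)")
    conn.commit()


# Small demonstration on an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE something (name VARCHAR(64) PRIMARY KEY)")
conn.executemany("INSERT INTO something (name) VALUES (?)", [("beta",), ("alpha",)])
migrate_to_surrogate_key(conn)

rows = conn.execute("SELECT id, name FROM something ORDER BY id").fetchall()
# Rows inserted after the migration receive the next id automatically.
new_id = conn.execute("INSERT INTO something (name) VALUES ('gamma')").lastrowid
```

On databases that support altering constraints directly, step 3 would normally be an ALTER TABLE rather than a rebuild.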
Answered By: Tobu

Agreed, your model is probably wrong.

The formal primary key should always be a surrogate key. Never anything else. [Strong words. Been a database designer since the 1980’s. An important lesson learned is this: everything is changeable, even when the users swear on their mothers’ graves that the value cannot be changed and is truly a natural key that can be taken as primary. It isn’t primary. Only surrogates can be primary.]

You’re doing open-heart surgery. Don’t mess with schema migration. You’re replacing the schema.

  1. Unload your data into JSON files. Use Django’s own internal django-admin.py tools for this. You should create one unload file for each table that will be changing and each table that depends on a key which is being created. Separate files make this slightly easier to do.

  2. Drop the tables which you are going to change from the old schema.

    Tables which depend on these tables will have their FK’s changed; you can either
    update the rows in place or (it might be simpler) delete and reinsert
    these rows, also.

  3. Create the new schema. This will only create the tables which are changing.

  4. Write scripts to read and reload the data with the new keys. These are short and very similar. Each script will use json.load() to read objects from the source file; you will then create your schema objects from the JSON tuple-line objects that were built for you. You can then insert them into the database.

    You have two cases.

    • Tables whose PK’s changed will be inserted and will get new PK’s. These must be “cascaded” to other tables to assure that the other table’s FK’s get changed also.

    • Tables with FK’s that change will have to locate the row in the foreign table and update their FK reference.
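
A reload script along these lines might look like the sketch below. It assumes the unload files use the layout that dumpdata writes (a list of objects with "model", "pk" and "fields" keys); the model labels myapp.something and myapp.other, the field names, and the rekey helper are hypothetical. It rewrites the JSON rather than inserting through the ORM, so the result could be loaded back with loaddata.

```python
import json


def rekey(parents, children, old_pk_field, fk_field):
    """Give parents integer PKs and repoint the children's FKs at them."""
    mapping = {}  # old natural key -> new surrogate key
    new_parents = []
    for new_pk, obj in enumerate(parents, start=1):
        mapping[obj["pk"]] = new_pk
        fields = dict(obj["fields"])
        fields[old_pk_field] = obj["pk"]  # the old key survives as an ordinary field
        new_parents.append({"model": obj["model"], "pk": new_pk, "fields": fields})

    new_children = []
    for obj in children:
        fields = dict(obj["fields"])
        if fields.get(fk_field) is not None:
            # "Cascade" the key change: look up the parent's new PK.
            fields[fk_field] = mapping[fields[fk_field]]
        new_children.append({"model": obj["model"], "pk": obj["pk"], "fields": fields})
    return new_parents, new_children


# In a real script these would come from json.load() on the unload files.
parents = json.loads(
    '[{"model": "myapp.something", "pk": "foo", "fields": {}},'
    ' {"model": "myapp.something", "pk": "bar", "fields": {}}]'
)
children = json.loads(
    '[{"model": "myapp.other", "pk": 1, "fields": {"something": "bar"}},'
    ' {"model": "myapp.other", "pk": 2, "fields": {"something": null}}]'
)

new_parents, new_children = rekey(parents, children, "name", "something")
print(json.dumps(new_parents + new_children, indent=2))
```

The mapping dictionary is the piece that covers both cases: it is built while the parents get their new keys and consulted while the dependent rows are rewritten.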

Alternative.

  1. Rename all your old tables.

  2. Create the entire new schema.

  3. Write SQL to migrate all the data from old schema to new schema. This will have to cleverly reassign keys as it goes.

  4. Drop the renamed old tables.
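
As a rough illustration of this alternative, the sketch below runs the four steps against an in-memory SQLite database with the standard-library sqlite3 module. The tables something and child and their columns are made up for the example; a real schema would need one INSERT ... SELECT per table, with the dependent tables joined against the already migrated parents to pick up the new keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Old schema: the parent table uses a natural key, and the child points at it.
conn.executescript("""
    CREATE TABLE something (name VARCHAR(64) PRIMARY KEY);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        something_id VARCHAR(64) REFERENCES something (name)
    );
    INSERT INTO something (name) VALUES ('north'), ('south');
    INSERT INTO child (something_id) VALUES ('south'), ('north'), ('south');
""")

conn.executescript("""
    -- 1. Rename the old tables.
    ALTER TABLE something RENAME TO old_something;
    ALTER TABLE child RENAME TO old_child;

    -- 2. Create the new schema, with a surrogate key on the parent.
    CREATE TABLE something (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name VARCHAR(64) NOT NULL
    );
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        something_id INTEGER REFERENCES something (id)
    );

    -- 3. Migrate the data; the join reassigns the child's keys.
    INSERT INTO something (name) SELECT name FROM old_something ORDER BY name;
    INSERT INTO child (id, something_id)
        SELECT c.id, s.id
        FROM old_child AS c
        JOIN something AS s ON s.name = c.something_id;

    -- 4. Drop the renamed old tables.
    DROP TABLE old_child;
    DROP TABLE old_something;
""")

# Each child row should still point at the same parent, now through an integer key.
pairs = conn.execute(
    "SELECT c.id, s.name FROM child AS c "
    "JOIN something AS s ON s.id = c.something_id ORDER BY c.id"
).fetchall()
tables = sorted(
    row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    )
)
```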

 

Answered By: S.Lott

To change the primary key with South you can use the south.db.create_primary_key command in a datamigration.
To change your custom CharField pk to a standard AutoField you should do:

1) create new field in your model

class MyModel(Model):
    id = models.AutoField(null=True)

1.1) if you have a foreign key in some other model to this model, create a new fake fk field on those models too (use IntegerField, it will be converted later)

class MyRelatedModel(Model):
    fake_fk = models.IntegerField(null=True)

2) create automatic south migration and migrate:

./manage.py schemamigration --auto
./manage.py migrate

3) create new datamigration

./manage.py datamigration <your_appname> fill_id

in this datamigration fill these new id and fk fields with numbers (just enumerate them)

    for n, obj in enumerate(orm.MyModel.objects.all()):
        obj.id = n
        # update objects with foreign keys
        obj.myrelatedmodel_set.all().update(fake_fk = n)
        obj.save()

    db.delete_primary_key('my_app_mymodel')
    db.create_primary_key('my_app_mymodel', ['id'])

4) in your models set primary_key=True on your new pk field

id = models.AutoField(primary_key=True)

5) delete the old primary key field (if it is not needed), create an auto migration and migrate.

5.1) if you have foreign keys – delete old foreign key fields too (migrate)

6) Last step: restore the foreign key relations. Create the real fk field again and delete your fake_fk field, then create an auto migration BUT DO NOT MIGRATE(!). You need to modify the generated auto migration: instead of creating a new fk and deleting fake_fk, rename the fake_fk column

# in your models
class MyRelatedModel(Model):
    # delete fake_fk
    # fake_fk = models.IntegerField(null=True)
    # create real fk
    mymodel = models.ForeignKey('MyModel', null=True)

# in migration
    def forwards(self, orm):
        # leave this unchanged: create the fk field
        db.add_column('my_app_myrelatedmodel', 'mymodel',
                  self.gf('django.db.models.fields.related.ForeignKey')(default=1, related_name='lots', to=orm['my_app.MyModel']),keep_default=False)

        # remove fk column and rename fake_fk
        db.delete_column('my_app_myrelatedmodel', 'mymodel_id')
        db.rename_column('my_app_myrelatedmodel', 'fake_fk', 'mymodel_id')

so the previously filled fake_fk becomes the column that contains the actual relation data, and that data does not get lost during the steps above.

Answered By: user920391

I had the same problem today and came to a solution inspired by the answers above.

My model has a “Location” table. It has a CharField called “unique_id” and I foolishly made it a primary key, last year. Of course they didn’t turn out to be as unique as expected at the time. There is also a “ScheduledMeasurement” model that has a foreign key to “Location”.

Now I want to correct that mistake and give Location an ordinary auto-incrementing primary key.

Steps taken:

  1. Create a CharField ScheduledMeasurement.temp_location_unique_id and a model TempLocation, and migrations to create them. TempLocation has the structure I want Location to have.

  2. Create a data migration that sets all the temp_location_unique_id’s using the foreign key, and that copies over all the data from Location to TempLocation

  3. Remove the foreign key and the Location table with a migration

  4. Re-create the Location model the way I want it to be, re-create the foreign key with null=True. Renamed ‘unique_id’ to ‘location_code’…

  5. Create a data migration that fills in the data in Location using TempLocation, and fills in the foreign keys in ScheduledMeasurement using temp_location

  6. Remove temp_location, TempLocation and null=True in the foreign key

And edit all the code that assumed unique_id was unique (all the objects.get(unique_id=…) stuff), and that used unique_id otherwise…

Answered By: RemcoGerlich

I managed to do this with django 1.10.4 migrations and mysql 5.5, but it wasn’t easy.

I had a varchar primary key with several foreign keys. I added an id field, migrated data and foreign keys. This is how:

  1. Adding future primary key field. I added an id = models.IntegerField(default=0) field to my main model and generated an auto migration.
  2. Simple data migration to generate new primary keys:

    def fill_ids(apps, schema_editor):
       Model = apps.get_model('<module>', '<model>')
       for id, code in enumerate(Model.objects.all()):
           code.id = id + 1
           code.save()
    
    class Migration(migrations.Migration):
        dependencies = […]
        operations = [migrations.RunPython(fill_ids)]
    
  3. Migrating existing foreign keys. I wrote a combined migration:

    def change_model_fks(apps, schema_editor):
        Model = apps.get_model('<module>', '<model>')  # Our model we want to change primary key for
        FkModel = apps.get_model('<module>', '<fk_model>')  # Other model that references first one via foreign key
    
        mapping = {}
        for model in Model.objects.all():
            mapping[model.old_pk_field] = model.id  # map old primary keys to new
    
        for fk_model in FkModel.objects.all():
            if fk_model.model_id:
                fk_model.model_id = mapping[fk_model.model_id]  # change the reference
                fk_model.save()
    
    class Migration(migrations.Migration):
        dependencies = […]
        operations = [
            # drop foreign key constraint
            migrations.AlterField(
                model_name='<FkModel>',
                name='model',
                field=models.ForeignKey('<Model>', blank=True, null=True, db_constraint=False)
            ),
    
            # change references
            migrations.RunPython(change_model_fks),
    
            # change field from varchar to integer, drop index
            migrations.AlterField(
                model_name='<FkModel>',
                name='model',
                field=models.IntegerField('<Model>', blank=True, null=True)
            ),
        ]
    
  4. Swapping primary keys and restoring foreign keys. Again, a custom migration. I auto-generated the base for this migration when I a) removed primary_key=True from the old primary key and b) removed id field

    class Migration(migrations.Migration):
        dependencies = […]
        operations = [
            # Drop old primary key
            migrations.AlterField(
                model_name='<Model>',
                name='<old_pk_field>',
                field=models.CharField(max_length=100),
            ),
    
            # Create new primary key
            migrations.RunSQL(
                ['ALTER TABLE <table> CHANGE id id INT (11) NOT NULL PRIMARY KEY AUTO_INCREMENT'],
                ['ALTER TABLE <table> CHANGE id id INT (11) NULL',
                 'ALTER TABLE <table> DROP PRIMARY KEY'],
                state_operations=[migrations.AlterField(
                    model_name='<Model>',
                    name='id',
                    field=models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID'),
                )]
            ),
    
            # Recreate foreign key constraints
            migrations.AlterField(
                model_name='<FkModel>',
                name='model',
                field=models.ForeignKey(blank=True, null=True, to='<module>.<Model>'),
            ),
        ]
    
Answered By: Nikolay Markov

I had to migrate some keys in my Django 1.11 application; the old keys were deterministic, based on an external model. Later, though, it turned out that this external model might change, so I needed my own UUIDs.

For reference, I was changing a table of POS-specific wine bottles, as well as a sales table for those wine bottles.

  • I created an extra field on all the relevant tables. In the first step, I needed to introduce fields that could be None, then I generated UUIDs for all of them. Next I applied a change through Django where the new UUID field was marked as unique. I could start migrating all the views etc to use this UUID field as a lookup, so that less would need to be changed during the upcoming, scarier phase of the migration.
  • I updated the foreign keys using a join. (in PostgreSQL, not Django)
  • I replaced all mention of the old keys with the new keys and tested it out in unit tests, since they use their own separate, testing database. This step is optional for cowboys.
  • Going to your PostgreSQL tables, you’ll notice that the foreign key constraints have codenames with numbers. You need to drop those constraints and make new ones:

    alter table pos_winesale drop constraint pos_winesale_pos_item_id_57022832_fk;
    alter table pos_winesale rename column pos_item_id to old_pos_item_id;
    alter table pos_winesale rename column placeholder_fk to pos_item_id;
    alter table pos_winesale add foreign key (pos_item_id) references pos_poswinebottle (id);
    alter table pos_winesale drop column old_pos_item_id;
    
  • With the new foreign keys in place, you can then change the primary key, since nothing references it anymore:

    alter table pos_poswinebottle drop constraint pos_poswinebottle_pkey;
    alter table pos_poswinebottle add primary key (id);
    alter table pos_poswinebottle drop column older_key;
    
  • Fake the migration history.
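
The first two bullets (filling a nullable UUID column, then repointing the foreign keys with a join) could look roughly like the sketch below. It runs against in-memory SQLite via the standard-library sqlite3 module instead of PostgreSQL, and the column types and sample rows are assumptions; only the table and column names are taken from the statements above.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pos_poswinebottle (older_key TEXT PRIMARY KEY, id TEXT);
    CREATE TABLE pos_winesale (
        sale_no INTEGER PRIMARY KEY,
        pos_item_id TEXT REFERENCES pos_poswinebottle (older_key),
        placeholder_fk TEXT
    );
    INSERT INTO pos_poswinebottle (older_key) VALUES ('merlot-1'), ('syrah-7');
    INSERT INTO pos_winesale (pos_item_id) VALUES ('syrah-7'), ('merlot-1'), ('syrah-7');
""")

# The new column starts out nullable; give every bottle its own UUID.
keys = [row[0] for row in conn.execute("SELECT older_key FROM pos_poswinebottle")]
for older_key in keys:
    conn.execute(
        "UPDATE pos_poswinebottle SET id = ? WHERE older_key = ?",
        (str(uuid.uuid4()), older_key),
    )

# Once every row has a value, the column can be marked unique.
conn.execute("CREATE UNIQUE INDEX pos_poswinebottle_id_uniq ON pos_poswinebottle (id)")

# Repoint the sales at the new keys by joining on the old key
# (written as a correlated subquery, which most databases accept).
conn.execute("""
    UPDATE pos_winesale
    SET placeholder_fk = (
        SELECT b.id
        FROM pos_poswinebottle AS b
        WHERE b.older_key = pos_winesale.pos_item_id
    )
""")
conn.commit()

# Sanity check before dropping anything: no sale may be left without a new key.
unmatched = conn.execute(
    "SELECT COUNT(*) FROM pos_winesale WHERE placeholder_fk IS NULL"
).fetchone()[0]
```

Checking unmatched before dropping the old constraint and column is a cheap safeguard, since the later steps are not easily reversible.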

Answered By: kokociel

I just tried this approach and it seems to work for Django 2.2.2, but only with SQLite. I tried the same method on other databases such as PostgreSQL, but it did not work.

  1. Add id=models.IntegerField() to the model, then run makemigrations and migrate, providing a one-off default such as 1.

  2. Use the Python shell to generate an id for every object in the model, from 1 to N.

  3. Remove primary_key=True from the old primary key field and remove id=models.IntegerField(). Run makemigrations and check the migration: you should see that the id field will be migrated to an AutoField.

It should work.

I didn’t know what I was doing when I put the primary key on one of the fields. If you are unsure how to handle the primary key, I think you are better off letting Django take care of it for you.

Answered By: Huan Ran Ng

I would like to share my case: The column email was the primary key, but now that’s wrong. I need to change the primary key to another column. After trying some suggestions, I finally came up with the most simple solution:

  1. First, drop the old primary key. This step requires customizing the migration a bit:
  • edit the model to replace primary_key=True on the email column with blank=True, null=True
  • run makemigrations to create a new migration file and edit it like this:
class Migration(migrations.Migration):

    dependencies = [
        ('api', '0026_auto_20200619_0808'),
    ]
    operations = [
        migrations.RunSQL("ALTER TABLE api_youth DROP CONSTRAINT api_youth_pkey"),
        migrations.AlterField(
            model_name='youth', name='email',
            field=models.CharField(blank=True, max_length=200, null=True))
    ]

  • run migrate
  2. Now that your table has no primary key, you can add a new column or use an existing column as the primary key. Just change the model, then migrate. Write an extra script if you need to fill a new column, and make sure it contains unique values only.
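
Before promoting a column to primary key, it can help to confirm that its values really are unique and non-null. The helper below is a minimal, framework-free sketch: find_duplicates and the sample values are made up for illustration, and in a Django project the list would presumably come from something like a values_list() query on the model.

```python
from collections import Counter


def find_duplicates(values):
    """Return the values that occur more than once, with their counts."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical contents of the column that should become the primary key.
candidate_values = ["a@example.com", "b@example.com", "a@example.com", None]

duplicates = find_duplicates(candidate_values)
has_nulls = any(value is None for value in candidate_values)

if duplicates or has_nulls:
    print("Not safe to use as a primary key yet:", duplicates, "nulls:", has_nulls)
```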
Answered By: Vu Viet Hung

Adding further context to the answers already here. To change the primary key:

From:

email = models.EmailField(max_length=255, primary_key=True,)

To:

id = models.AutoField(auto_created=True, primary_key=True)    
email = models.EmailField(max_length=255,)

Create the first migration:

migrations.AddField(
    model_name='my_model',
    name='id',
    field=models.AutoField(auto_created=True, primary_key=True, serialize=False),
    preserve_default=False,
),
migrations.AlterField(
    model_name='my_model',
    name='email',
    field=models.EmailField(max_length=255,),
),

Modify the migration
Flip the order so that the email field is modified first. This prevents the error "Multiple primary keys for table “my_model” are not allowed".

migrations.AlterField(
    model_name='my_model',
    name='email',
    field=models.EmailField(max_length=255,),
),
migrations.AddField(
    model_name='my_model',
    name='id',
    field=models.AutoField(auto_created=True, primary_key=True, serialize=False),
    preserve_default=False,
),
Answered By: David Kobia

I managed to achieve this by creating three migrations. I started with the following model:

import uuid

from django.db import models

class MyModel(models.Model):
  id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
  created_at = models.DateTimeField(auto_now_add=True)

First, we need a migration to rename the primary key field and add a new id placeholder IntegerField:

class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0001_initial'),
    ]

    operations = [
        migrations.RenameField(
            model_name='mymodel',
            old_name='id',
            new_name='uuid',
        ),
        migrations.AddField(
            model_name='mymodel',
            name='new_id',
            field=models.IntegerField(null=True),
        ),
    ]

Now in the next migration we need to backfill the new_id IntegerField according to the order we want (I’ll use the created_at timestamp).

def backfill_pk(apps, schema_editor):
    MyModel = apps.get_model('myapp', 'MyModel')
    curr = 1
    for m in MyModel.objects.all().order_by('created_at'):
        m.new_id = curr
        m.save()
        curr += 1


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0002_rename_pk'),
    ]

    operations = [
        migrations.RunPython(backfill_pk, reverse_code=migrations.RunPython.noop),
    ]

And then finally we need to alter the uuid and id fields to their proper final configuration (note the order of operations below is important):

class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0003_backfill_pk'),
    ]

    operations = [
        migrations.AlterField(
            model_name='mymodel',
            name='uuid',
            field=models.UUIDField(db_index=True, default=uuid.uuid4, editable=False, unique=True),
        ),
        migrations.AlterField(
            model_name='mymodel',
            name='new_id',
            field=models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID'),
        ),
        migrations.RenameField(
            model_name='mymodel',
            old_name='new_id',
            new_name='id',
        ),
    ]

The final model state will look like this (the id field is implicit in Django):

class MyModel(models.Model):
  uuid = models.UUIDField(default=uuid.uuid4, db_index=True, editable=False, unique=True)
  created_at = models.DateTimeField(auto_now_add=True)
Answered By: swinters