Auto incrementing composite key
Question:
I have a relatively simple need — I have a table with a composite primary key that includes an integer index to be autoincremented. However, I cannot figure out how to do this.
I have read the documentation which suggests the following
U().aggr(Scan & key, next='max(scan_idx)+1')
# or
Session.aggr(Scan, next='max(scan_idx)+1') & key
However I find these lines to be extremely cryptic and haven’t been able to find any further documentation of the aggr method or what next= means. Can someone please provide a minimal example illustrating their use? Thanks.
Answers:
First, automatic generation of primary key attributes for real-world entities generally ought to be avoided. The purpose of the primary key is entity integrity: to establish a 1:1 correspondence between real things and their digital representations in the database. This requires assigning identifying information in the external world. Auto-increment can make it too easy breaking this 1:1 association. However, there are many cases when auto-increment is appropriate, especially in cases where the entity only exists in the application with no real-world counterpart.
Second, this type of client-side auto-increment is racy (as already noted by a commenter), so it needs to be done inside a transaction. DataJoint’s make
functions are executed within transactions, so you are good there. But in other cases, you can use the transaction context of the dj.Connection()
object. The aggr
method is the right method for incrementing but needs explaining. I will provide a complete example here.
Let’s consider the definition for Session
and Scan
as in your original question. A session can have multiple scans.
@schema
class Session(dj.Manual):
definition = """
session : int
---
session_info : varchar(1000)
"""
@schema
class Scan(dj.Manual):
definition = """
-> Session
scan : int
---
scan_info="": varchar(1000)
"""
Here Scan has a composite primary key ('session', 'scan')
.
Let’s populate Session
with a few entries:
Session.insert(
(1, "Awesome"),
(2, "Splendid")))
Now let’s insert scans for a specific session with auto incrementing scan
.
session_key = {'session': 1} # select session
with dj.conn().transaction:
# insert scan with a new id
scan_ids = (Scan & session_key).fetch('scan')
next_scan = max(scan_ids, default=0) + 1
Scan.insert1(dict(session_key, scan=next_scan, scan_info="wow!"))
Note that the with
block constitutes an isolated atomic transaction. This prevents any conflicts introduced by concurrent transactions. Transactions cannot be nested. Therefore, if this code is already inside a transaction, then omit the with
statement.
The aggr
query allows performing the max operation on the server side. This may be preferable if the number of scan_ids is high and you would like to avoid sending the entire set of existing ids over to the client.
I have a relatively simple need — I have a table with a composite primary key that includes an integer index to be autoincremented. However, I cannot figure out how to do this.
I have read the documentation which suggests the following
U().aggr(Scan & key, next='max(scan_idx)+1')
# or
Session.aggr(Scan, next='max(scan_idx)+1') & key
However I find these lines to be extremely cryptic and haven’t been able to find any further documentation of the aggr method or what next= means. Can someone please provide a minimal example illustrating their use? Thanks.
First, automatic generation of primary key attributes for real-world entities generally ought to be avoided. The purpose of the primary key is entity integrity: to establish a 1:1 correspondence between real things and their digital representations in the database. This requires assigning identifying information in the external world. Auto-increment can make it too easy breaking this 1:1 association. However, there are many cases when auto-increment is appropriate, especially in cases where the entity only exists in the application with no real-world counterpart.
Second, this type of client-side auto-increment is racy (as already noted by a commenter), so it needs to be done inside a transaction. DataJoint’s make
functions are executed within transactions, so you are good there. But in other cases, you can use the transaction context of the dj.Connection()
object. The aggr
method is the right method for incrementing but needs explaining. I will provide a complete example here.
Let’s consider the definition for Session
and Scan
as in your original question. A session can have multiple scans.
@schema
class Session(dj.Manual):
definition = """
session : int
---
session_info : varchar(1000)
"""
@schema
class Scan(dj.Manual):
definition = """
-> Session
scan : int
---
scan_info="": varchar(1000)
"""
Here Scan has a composite primary key ('session', 'scan')
.
Let’s populate Session
with a few entries:
Session.insert(
(1, "Awesome"),
(2, "Splendid")))
Now let’s insert scans for a specific session with auto incrementing scan
.
session_key = {'session': 1} # select session
with dj.conn().transaction:
# insert scan with a new id
scan_ids = (Scan & session_key).fetch('scan')
next_scan = max(scan_ids, default=0) + 1
Scan.insert1(dict(session_key, scan=next_scan, scan_info="wow!"))
Note that the with
block constitutes an isolated atomic transaction. This prevents any conflicts introduced by concurrent transactions. Transactions cannot be nested. Therefore, if this code is already inside a transaction, then omit the with
statement.
The aggr
query allows performing the max operation on the server side. This may be preferable if the number of scan_ids is high and you would like to avoid sending the entire set of existing ids over to the client.