Python3 Pass parameters in the right order

Question:

First, I’m unsure how to title this because I can’t think of how to describe my issue (let me know and I’ll update the title when I can).

What I am trying to do is make sure that the parameters passed to a function aren’t mixed up. This will be easier to understand with some code. The following is the code I am working with.

#clearing terminal.
def clear():
    os.system("clear||cls")


#Make master key for encrypting stuff.
def keygen(master):
    if len(master) < 100:
        clear()
        input('Password/characters used must be 100 characters in length or more!\n\nPress "enter" to continue...')
        clear()
        return None
    else:
        salt = os.urandom(16)

        # derive | DO NOT MESS WITH...unless you know what you are doing and or have more than 8GB of ram to spare and a really good CPU.
        print("Generating key...")
        with alive_bar(0) as bar:
            key = argon2.hash_password_raw(
                time_cost=16,
                memory_cost=2**20,
                parallelism=4,
                hash_len=32,
                password=master,
                salt=salt,
                type=argon2.Type.ID
            )
            bar()
        clear()
        return key #returns bytes. You will need to base64 encode them yourself if you want a "shareable key"



# Encrypting the passwords with master key and AES encryption.
def stringE(data, key):
    cipher = AES.new(key, AES.MODE_GCM)
    cipher.update(header)
    ciphertext, tag = cipher.encrypt_and_digest(data)
    json_k = [ 'nonce', 'header', 'ciphertext', 'tag' ]
    json_v = [ b64.b64encode(x).decode('utf-8') for x in [cipher.nonce, header, ciphertext, tag ]]
    result = json.dumps(dict(zip(json_k, json_v)))
    result_bytes = bytes(result, 'utf-8')
    b64_result = b64.b64encode(result_bytes)
    return b64_result.decode()




#Encrypting
data = b'Hello World <3'
key_data = b'abcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcdeabcde'

eKey = keygen(key_data) #Returns bytes and will return "None" if what's provided is less than 100 characters.
save_me = b64.b64encode(eKey) #for saving eKey to decrypt later.

input(f'Save this key so you can decrypt later: {save_me}\n\nPress "enter" to continue...')
clear() #clears terminal or cmd/powershell window.

data_enc = stringE(data, eKey) #It is CRITICAL that you have the data you want to encrypt FIRST then the key.
clear()
print(data_enc) # Output is b64 encoded. --> eyJub25jZSI6ICJjQjhpWmc0MWhDWXBRVXdVdW53Q0pRPT0iLCAiaGVhZGVyIjogIlJXNWpjbmx3ZEdWa0lIVnphVzVuSUVkRFRXeHBZaTRnUkU4Z1RrOVVJRlJCVFZCRlVpQlhTVlJJTGlBZ2ZDQWdUV0ZrWlNCaWVTQjBhR1Z5WldGc1QzSnBJQ0I4SUNCaUoxeDRNVE5jZUdKaFhIaGpaVng0TVdWY2VHRTRYSGhsT1VOY2VHRmxKdz09IiwgImNpcGhlcnRleHQiOiAiZ2FDSjY4N2FGVjNMMEIyb01Ecz0iLCAidGFnIjogIkJBUjlmVzkzaWFESnUwckpSU2o3VEE9PSJ9

In this example code, we have a KDF, an encryption function, and some example usage at the bottom. My issue is that I DON’T want a user to pass the data they want to encrypt as the key and the key as the data.

For stringE(data, eKey), if data is "abc" and eKey is "xyz", I don’t want data to end up as "xyz" and eKey as "abc". That would be a big problem, because data is what should be encrypted and eKey is the bytes from the KDF used to encrypt it. If the two get reversed, then decrypting later either raises an error or doesn’t give back the data I wanted to encrypt in the first place. I don’t know how to solve this.

I would like to have a way to have what’s passed to the function be what it is supposed to be and passed to the function in the correct order. I don’t need the encrypt function to be working with 2 variables that are assigned the wrong data/mixing up the data and messing everything up. Hopefully what I am saying makes sense and is understandable. I just want the data for parameter 1 to be what it should be and the data for parameter 2 to also be what it should be.

If anyone can give some suggestions or can help, it’d be VERY appreciated! All I can think of is to have some if checks in the function itself, but I want to know if there are other or better ways to achieve what I want. Even if the answer is a link to an already asked question that I couldn’t find or didn’t see, I’d still really appreciate it.

Asked By: Ori


Answers:

There’s no way to prevent this error, but there are a few things you can do to make such errors more obvious.

First, switch from regular parameters to keyword-only parameters:

def stringE(*, data, key):
    ...

Calling stringE(data="...", key="...") may make it more obvious that you are attempting to specify a key as data or vice versa.
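To see what the bare * actually enforces, here is a minimal sketch. The function body is a stand-in that only reports its arguments (an assumption for illustration, not the real encryption code), and the bytes annotations are assumed from the example values in the question.

```python
def stringE(*, data: bytes, key: bytes) -> str:
    """Stand-in for the real function: it only reports what it received."""
    return f"data={data!r}, key={key!r}"


# Keyword arguments can be given in any order; each value is bound by name.
print(stringE(key=b"k" * 32, data=b"Hello World <3"))

# A positional call is rejected before the function body runs.
try:
    stringE(b"Hello World <3", b"k" * 32)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

This does not stop someone from writing data=eKey, but the mistake is then visible at the call site instead of hidden in the argument order.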

Second, use static type hints. Even though both parameters expect str values, you can use typing.NewType to create, well, new types to represent each.

from typing import NewType

Data = NewType("Data", str)
Key = NewType("Key", str)

def stringE(*, data: Data, key: Key):
    ...

At runtime, stringE(data="...", key="...") is still legal, but tools like mypy will reject it. It requires the two arguments be constructed using Data and Key, respectively, "forcing" the caller to do something like

stringE(data=Data('...'), key=Key('...'))

which could make it even more obvious when the wrong type of string is passed as an argument. At runtime, Data and Key simply return their argument, but to a type checker a plain str is not accepted where Data or Key is expected, and the two are not interchangeable with each other.
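The gap between static checking and runtime behaviour can be sketched as follows. The function body is again a hypothetical stand-in, and the arguments are swapped on purpose to show that nothing is caught unless a checker such as mypy is run over the code.

```python
from typing import NewType

Data = NewType("Data", str)
Key = NewType("Key", str)


def stringE(*, data: Data, key: Key) -> str:
    """Stand-in body: shows which value ended up in which parameter."""
    return f"data={data}, key={key}"


wrapped = Data("Hello World <3")

# NewType adds no wrapper object at runtime: the value is still a plain str.
print(type(wrapped) is str)

# Swapped on purpose. A static checker should flag both arguments,
# but Python itself runs the call without complaint.
print(stringE(data=Key("secret"), key=Data("Hello World <3")))
```

So NewType only helps if the type checker is part of your workflow; it adds no protection on its own.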

Taking this a step further, replace the new types with actual classes that can perform runtime tests on their arguments to ensure the data is correct. This requires you to have some runtime test that can distinguish the value of a string as a valid key or as valid data.

class Data:
    def __init__(self, d):
        ...


class Key:
    def __init__(self, d):
        ...


def stringE(*, data: Data, key: Key):
    ...


stringE(data=Data('...'), key=Key('...'))

The difference here is that Data.__init__ and Key.__init__ can (unlike the NewTypes above) do more to determine if ... is actually a valid data or key.
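As a rough sketch of what such constructors might check: the rules below are assumptions chosen for illustration (a key of exactly 32 bytes, matching hash_len=32 in the question's keygen, and non-empty bytes for data), and the stringE body is a stand-in, not the real encryption.

```python
class Data:
    def __init__(self, value: bytes):
        # Assumed rule: plaintext must be non-empty bytes.
        if not isinstance(value, bytes) or not value:
            raise ValueError("data must be non-empty bytes")
        self.value = value


class Key:
    LENGTH = 32  # assumption: matches hash_len=32 in the question's keygen()

    def __init__(self, value: bytes):
        if not isinstance(value, bytes) or len(value) != self.LENGTH:
            raise ValueError(f"key must be exactly {self.LENGTH} bytes")
        self.value = value


def stringE(*, data: Data, key: Key) -> str:
    # Reject anything that did not go through the wrapper classes.
    if not isinstance(data, Data) or not isinstance(key, Key):
        raise TypeError("data must be a Data and key must be a Key")
    return f"would encrypt {len(data.value)} bytes with a {len(key.value)}-byte key"


print(stringE(data=Data(b"Hello World <3"), key=Key(bytes(32))))

# Wrapping the plaintext as a key fails the length check.
try:
    Key(b"Hello World <3")
except ValueError as exc:
    print(exc)
```

A length check is only a heuristic: a plaintext that happens to be 32 bytes long would still pass as a Key, so this narrows the mistake rather than eliminating it.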

Since the first thing stringE does with key is try to create a new AES object, consider requiring the caller to create that object, rather than passing an opaque str value.

def stringE(*, data, cipher):
    ciphertext, tag = cipher.encrypt_and_digest(data)
    json_k = [ 'nonce', 'header', 'ciphertext', 'tag' ]
    json_v = [ b64.b64encode(x).decode('utf-8') for x in [cipher.nonce, header, ciphertext, tag ]]
    result = json.dumps(dict(zip(json_k, json_v)))
    result_bytes = bytes(result, 'utf-8')
    b64_result = b64.b64encode(result_bytes)
    return b64_result.decode()


aes_key = AES.new(eKey, AES.MODE_GCM)
aes_key.update(header)

data_enc = stringE(data=data, cipher=aes_key)

Answered By: chepner