reimplementing crypto-js in python cryptography

Question:

I want to implement the following code (written in JavaScript, using the crypto-js library) in Python (using the cryptography package):

const data = 'base 64 encoded encrypted data';
const salt = 'XB7sHH26Hn&FmPLxnjGccKTfPV(yk';
const pass = '.. password ..';
const psw = `${PBKDF2(pass, salt)}(tXntTbJFzh]4EuQVmjzM9GXHCth8`;
const decrypted = AES.decrypt(data, psw).toString(enc.Utf8);

Here’s what I have so far on the Python side.

data = 'base 64 encoded encrypted data'
password = b'.. password ..'
salt = b'XB7sHH26Hn&FmPLxnjGccKTfPV(yk'
kdf = PBKDF2HMAC(
    algorithm=hashes.SHA1(),
    length=16,
    salt=salt,
    iterations=1
)
psw = kdf.derive(password)

key = lh.hex() + '(tXntTbJFzh]4EuQVmjzM9GXHCth8'
key = key.encode('utf-8')
cipher = Cipher(algorithms.AES(key), modes.CBC([0] * 16))
decryptor = cipher.decryptor()
res = decryptor.update(data) + decryptor.finalize()

There are several things amiss:

  1. Currently, the values of key and psw are the same, but I don’t know if crypto-js has hex encoded strings as the data format.
  2. I don’t know what to supply as the IV, not sure what this is in crypto-js.
  3. The Cipher creation will fail with ValueError: Invalid key size (488) for AES. The key size (len(key)) is 61 bytes.
Asked By: linkedin

||

Answers:

The CryptoJS code uses the key derivation function PBKDF2 to derive key material from a constant salt and a password. Inside the template literal, the WordArray returned by PBKDF2 is converted to its hex string, and the constant suffix is appended to it. The resulting string is passed to CryptoJS.AES.encrypt() (and likewise decrypt()), where it is interpreted as a password, so the internal key derivation function EVP_BytesToKey() is applied.
During encryption, a random 8-byte salt is generated, and EVP_BytesToKey() derives a 32-byte key and a 16-byte IV from the salt and the password. The IV is therefore derived rather than supplied by the caller.
The result of CryptoJS.AES.encrypt(...).toString() is the Base64 encoding of the concatenation of the ASCII bytes of Salted__, the 8-byte salt, and the actual ciphertext.
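
The two steps described above (building the password string and splitting the Salted__ container) can be sketched with the standard library alone. This is a minimal sketch, not the library's own code: the helper names build_passphrase and split_salted are hypothetical, and the PBKDF2 parameters (HMAC-SHA1, 1 iteration, 16-byte output) are assumed to be the CryptoJS defaults that the question's Python code mirrors. Other CryptoJS versions may use different defaults, so they should be checked against the version in use.

```python
import base64
import hashlib


def build_passphrase(password: bytes, salt: bytes, suffix: str) -> bytes:
    """Mimic `${PBKDF2(pass, salt)}suffix` from the JavaScript code."""
    # Assumption: CryptoJS PBKDF2 defaults of HMAC-SHA1, 1 iteration, 16 bytes.
    derived = hashlib.pbkdf2_hmac('sha1', password, salt, 1, dklen=16)
    # A CryptoJS WordArray stringifies as lowercase hex by default.
    return (derived.hex() + suffix).encode('utf-8')


def split_salted(data_b64: str):
    """Split Base64 CryptoJS output into (salt, ciphertext)."""
    raw = base64.b64decode(data_b64)
    if raw[:8] != b'Salted__':
        raise ValueError('missing Salted__ header')
    return raw[8:16], raw[16:]
```

With the question's 29-character suffix, the passphrase is 32 hex characters plus 29 characters, which is consistent with the 61-byte length reported in the error message.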

In the posted Python code there are the following flaws:

  • Typo: lh is not defined and must be replaced by psw

  • Separation of salt and ciphertext and derivation of key and IV are missing. This needs to be implemented:

    import base64
    ...
    encrypted = base64.b64decode(data)
    salt = encrypted[8:16]
    ciphertext = encrypted[16:]
    keyIv = bytesToKey(salt, key)
    key = keyIv[:32]
    iv = keyIv[32:]
    

    with the following possible implementation for EVP_BytesToKey():

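    from cryptography.hazmat.primitives import hashes
    ...
    def bytesToKey(salt, password):
        # EVP_BytesToKey with MD5 and one iteration: chain MD5 digests until
        # 48 bytes (32-byte key + 16-byte IV) are available.
        derived = b''  # named to avoid shadowing the built-in bytes type
        last = b''
        while len(derived) < 48:
            md5 = hashes.Hash(hashes.MD5())
            md5.update(last + password + salt)
            last = md5.finalize()
            derived += last
        return derived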
    
  • During decryption, the separated ciphertext and the derived key and IV must be used:

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    ...
    cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
    decryptor = cipher.decryptor()
    res = decryptor.update(ciphertext) + decryptor.finalize()
    
  • The padding must be removed:

    from cryptography.hazmat.primitives import padding
    ...
    unpadder = padding.PKCS7(128).unpadder()
    decrypted = unpadder.update(res) + unpadder.finalize()
    print(decrypted.decode('utf8'))
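
Putting the individual fixes together, the whole decryption might look like the following minimal sketch. The function names evp_bytes_to_key and decrypt_cryptojs are hypothetical, hashlib.md5 is swapped in for hashes.Hash(hashes.MD5()) purely for brevity, and passphrase is assumed to be the UTF-8 bytes of the hex-encoded PBKDF2 output with the constant suffix appended (the value called key in the question).

```python
import base64
import hashlib

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def evp_bytes_to_key(passphrase: bytes, salt: bytes, length: int = 48) -> bytes:
    """EVP_BytesToKey with MD5 and a single iteration."""
    derived = b''
    block = b''
    while len(derived) < length:
        # Each block hashes the previous digest, the passphrase and the salt.
        block = hashlib.md5(block + passphrase + salt).digest()
        derived += block
    return derived[:length]


def decrypt_cryptojs(data_b64: str, passphrase: bytes) -> str:
    """Decrypt Base64 'Salted__' output (AES-256-CBC, PKCS7 padding)."""
    raw = base64.b64decode(data_b64)
    if raw[:8] != b'Salted__':
        raise ValueError('missing Salted__ header')
    salt, ciphertext = raw[8:16], raw[16:]

    # The first 32 bytes are the AES key, the next 16 bytes the IV.
    key_iv = evp_bytes_to_key(passphrase, salt)
    key, iv = key_iv[:32], key_iv[32:48]

    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()

    unpadder = padding.PKCS7(128).unpadder()
    return (unpadder.update(padded) + unpadder.finalize()).decode('utf-8')
```

A wrong passphrase typically surfaces as a ValueError from the unpadder or as a UnicodeDecodeError, since CBC itself provides no integrity check.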

The key and IV derivation in the CryptoJS code is unnecessarily cumbersome and should be changed to use PBKDF2 exclusively, in conjunction with a random salt. EVP_BytesToKey() is considered insecure and should be avoided entirely. A constant salt is a vulnerability. Instead, a random salt with a length of e.g. 16 bytes should be generated for each encryption, and salt and ciphertext should be concatenated, e.g. salt|ciphertext.
In this design, key and IV are derived directly with PBKDF2. The iteration count should be set as high as possible while maintaining acceptable performance (1 is much too low). As digest, SHA256 should be used instead of SHA1 (although SHA1 is not a known security risk in the context of HMAC).
Alternatively, the IV can be decoupled from the key derivation by generating it randomly. In this case, the concatenation would be e.g. salt|iv|ciphertext. Note that salt and IV are not secret, so their disclosure is not critical.
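
As an illustration of the recommended design, here is a minimal Python sketch of the variant with a random IV and the salt|iv|ciphertext layout. It only covers the Python side, so the JavaScript counterpart would need to use the same parameters. The function names and the iteration count of 600,000 are assumptions made for this sketch, and the count should be tuned to the target hardware.

```python
import os

from cryptography.hazmat.primitives import hashes, padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

ITERATIONS = 600_000  # assumed value; choose as high as performance allows


def derive_key(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte AES key with PBKDF2-HMAC-SHA256."""
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                     salt=salt, iterations=iterations)
    return kdf.derive(password)


def encrypt(password: bytes, plaintext: bytes,
            iterations: int = ITERATIONS) -> bytes:
    """Return salt|iv|ciphertext (AES-256-CBC, PKCS7 padding)."""
    salt = os.urandom(16)  # fresh random salt for every encryption
    iv = os.urandom(16)    # random IV, independent of the key derivation
    key = derive_key(password, salt, iterations)
    padder = padding.PKCS7(128).padder()
    padded = padder.update(plaintext) + padder.finalize()
    encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
    return salt + iv + encryptor.update(padded) + encryptor.finalize()


def decrypt(password: bytes, blob: bytes,
            iterations: int = ITERATIONS) -> bytes:
    """Split salt|iv|ciphertext, re-derive the key and decrypt."""
    salt, iv, ciphertext = blob[:16], blob[16:32], blob[32:]
    key = derive_key(password, salt, iterations)
    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()
```

Because salt and IV are not secret, they travel in the clear at the front of the blob, and the receiver only needs the password and the agreed iteration count.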

Answered By: Topaco