How do I replicate this .NET generated key using Python?

Question:

I am using the following .NET code to generate a key from a password and salt:

static byte[] GenerateKey(string password, string salt, int size)
{
    var saltBytes = Encoding.Unicode.GetBytes(salt);
    // iterations is not shown here; it is 12345, matching the Python code below
    var derivedBytes = new Rfc2898DeriveBytes(password, saltBytes, iterations);
    var key = derivedBytes.GetBytes(size);

    Console.WriteLine(string.Format("Key: {0}", Convert.ToBase64String(key)));

    return key;            
}

// Console.WriteLine() shows
// Key: tb6yBBYGdZhyFWrpWQ5cm5A1bAI5UF0KnDdom7BhVz0=
// for password="password" and salt="salt"

I need to decrypt the encrypted message in Python, a language I am only slightly familiar with, using the same password and salt. Thanks to @Topaco I now know that PBKDF2 is the Python equivalent:

def decrypt_file(filename, password, salt):
    key = PBKDF2(password, salt, 32, count=12345, hmac_hash_module=SHA1)
    print(f"Key: {base64.b64encode(key).decode('utf-8')}")
    # more lines redacted
 
# print() shows
# Key: 3ohW9ctQIXoNvGnvLaKmoQTG8/jJzoFThviHXqgM9Co=
# for password="password" and salt="salt"

I'm having trouble getting the same key from both implementations. I am not well-versed in Python's encoding and decoding, so it's entirely possible that I am generating the same key and the base64.b64encode(key).decode('utf-8') line is simply showing me a different representation of it.

What am I doing wrong here?

Asked By: Scott Baker

||

Answers:

You have to encode the salt as UTF-16LE, since .NET's Encoding.Unicode corresponds to UTF-16LE. The rest is fine:

from Crypto.Protocol.KDF import PBKDF2
from Crypto.Hash import SHA1
import base64

password = b'password'
salt = 'salt'.encode('utf-16le')
key = PBKDF2(password, salt, 32, count=12345, hmac_hash_module=SHA1)
print(base64.b64encode(key).decode('utf-8')) # tb6yBBYGdZhyFWrpWQ5cm5A1bAI5UF0KnDdom7BhVz0=
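
To see why the salt encoding matters, here is a minimal sketch that uses only the standard library (hashlib.pbkdf2_hmac instead of PyCryptodome, which is a substitution on my part). The helper name derive_b64 is made up for illustration. It prints the two byte representations of the salt and checks the UTF-16LE result against the .NET key quoted in the question:

```python
import base64
import hashlib

# Encoding.Unicode in .NET is UTF-16LE: every ASCII character becomes two bytes.
salt_utf8 = 'salt'.encode('utf-8')
salt_utf16le = 'salt'.encode('utf-16le')
print(salt_utf8)     # b'salt'
print(salt_utf16le)  # b's\x00a\x00l\x00t\x00'

def derive_b64(salt_bytes):
    """Hypothetical helper: PBKDF2-HMAC-SHA1, 12345 iterations, 32-byte key, Base64 text."""
    key = hashlib.pbkdf2_hmac('sha1', b'password', salt_bytes, 12345, dklen=32)
    return base64.b64encode(key).decode('ascii')

key_utf8 = derive_b64(salt_utf8)
key_utf16le = derive_b64(salt_utf16le)

# Different salt bytes give different keys, even though the salt "text" is the same.
print(key_utf8 == key_utf16le)  # False

# Key printed by the .NET code in the question; the UTF-16LE variant should match it.
DOTNET_KEY = 'tb6yBBYGdZhyFWrpWQ5cm5A1bAI5UF0KnDdom7BhVz0='
print(key_utf16le == DOTNET_KEY)
```

The Base64 step is not the source of the mismatch: it is a lossless text representation of the key bytes, so two different Base64 strings always mean two different keys.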

For completeness: if the password contains non-ASCII characters, encode it explicitly in the Python code with '...'.encode('utf-8'), since the Rfc2898DeriveBytes overload used in the C# code encodes the password string as UTF-8.
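
Putting both encoding rules together, a Python counterpart of GenerateKey might look roughly like the sketch below. The function name generate_key and the default of 12345 iterations are assumptions taken from the snippets above, and hashlib.pbkdf2_hmac stands in for PyCryptodome's PBKDF2:

```python
import hashlib

def generate_key(password: str, salt: str, size: int, iterations: int = 12345) -> bytes:
    """Hypothetical counterpart of the C# GenerateKey (PBKDF2-HMAC-SHA1)."""
    # Rfc2898DeriveBytes(string, byte[], int) encodes the password as UTF-8.
    password_bytes = password.encode('utf-8')
    # Encoding.Unicode.GetBytes(salt) produces UTF-16LE bytes.
    salt_bytes = salt.encode('utf-16le')
    return hashlib.pbkdf2_hmac('sha1', password_bytes, salt_bytes, iterations, dklen=size)

# A non-ASCII character takes more than one byte in UTF-8.
print('pässword'.encode('utf-8'))  # b'p\xc3\xa4ssword'

key = generate_key('pässword', 'salt', 32)
print(len(key))  # 32
```

For pure ASCII passwords, b'password' and 'password'.encode('utf-8') are the same bytes, which is why the bytes literal in the code above already works.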

Answered By: Topaco