Allocation of Memory with python to pass to dll

Question:

I have a DLL that expects a pointer to a C byte array. The DLL reads and modifies the array, and also writes some extra data at the end of it.

How do I allocate 1 MB of memory as a C byte array in Python and get a pointer to it?

How can I write my byte data from Python to the memory behind this pointer?

You may be wondering why I want to do it this way. Unfortunately, this is the only interface to this DLL, and I have to use it from Python.

Here is my current setup:

import ctypes

# Allocate Memory:
BUFFER_LENGTH = 20

# parse to pointer:
bytes_buffer = bytes([0x13, 0x02, 0x03, 0x04, 0x08, 0xA5]) # dummy data

size_in = len(bytes_buffer)
print(bytes_buffer)
# write binary data to memory in CTypes Byte Array
buffer_in = ctypes.cast(bytes_buffer, ctypes.POINTER(ctypes.c_char*BUFFER_LENGTH) )
adr = ctypes.pointer(buffer_in)
address = id(adr)

# get pointer as int32
pointer_data_hi = ctypes.c_uint32(address) 
pointer_data_lo = ctypes.c_uint32(address >> 32) 
print("in: hi: " + str(pointer_data_hi.value) + ", lo: " + str(pointer_data_lo.value) + ", size: " + str(size_in))

# Load dll
array_modifier = ctypes.windll.LoadLibrary("PythonArrayToDll/modify_array_example/x64/Debug/modify_array_example.dll")

# set pointer of array to dll memory:
array_modifier.setAddrLo(pointer_data_lo)
array_modifier.setAddrHi(pointer_data_hi)

# tell the dll to compute something from the data array:
array_modifier.modifyArray() # this is where it crashes with exception: access violation reading 0xFFFFFFFFFFFFFFFF


# display the results:
for i in range(BUFFER_LENGTH):
    print(buffer_in[i].value)

dll code (example):

#include <WinDef.h>
#include "pch.h"
#include "pointer_fnc.h"

#define DLL_EXPORT __declspec(dllexport)

int addrHi;
int addrLo;

extern "C"
{

    DLL_EXPORT void setAddrLo(int lo)
    {
        addrLo = lo;
    }

    DLL_EXPORT void setAddrHi(int hi)
    {
        addrHi = hi;
    }

    DLL_EXPORT void modifyArray()
    {
        BYTE* my_array = (BYTE*)decode_integer_to_pointer(addrHi, addrLo);

        my_array[0] = my_array[0] * 2;
        my_array[1] = 2;
        my_array[10] = my_array[0];
    }
}

with pointer_fnc.cpp providing:

void* decode_integer_to_pointer(int hi, int lo)
{
#if PTRDIFF_MAX == INT64_MAX
    union addrconv {
        struct {
            int lo;
            int hi;
        } base;
        unsigned long long address;
    } myaddr;
    myaddr.base.lo = lo;
    myaddr.base.hi = hi;
    return reinterpret_cast<void*>(myaddr.address);
#elif PTRDIFF_MAX == INT32_MAX
    return reinterpret_cast<void*>(lo);
#else
#error "Cannot determine 32bit or 64bit environment!"
#endif
}

The DLL is compiled as 64-bit, and a 64-bit Python is used.

I hope you can help me 🙂

Asked By: LilumDaru


Answers:

Listing [Python.Docs]: ctypes – A foreign function library for Python.

Issues:

  • I shallowly browsed Functional Mock-up Interface, but I didn’t see any reference to this API. Anyway, the hi, lo approach is poor (it looks like it comes from the 16-bit (segment, offset) era). Pointers have existed for decades, and that’s precisely their purpose: handling memory addresses

    • There’s no size limit ("how far can one go" from the (start) address), meaning that the API consumers might end up accessing memory that they don’t own (Undefined Behavior – very likely to crash)

    The above makes me think there’s a (big) misunderstanding about the API

  • Don’t mix Python and CTypes object addresses, they are not the same! [Python.Docs]: Built-in Functions – id(object) returns the Python wrapper object (PyObject) address, not the actual pointer (that you care about)

  • A very common one (missing functions argtypes, restype): [SO]: C function called from Python via ctypes returns incorrect value (@CristiFati’s answer), but it’s not affecting the current scenario

  • Since you’re on Win, take a look at [MS.Docs]: ULARGE_INTEGER union (winnt.h), no need to reinvent the wheel. Anyway, I removed that to be as platform independent as possible

  • Combining the 2 ints in an unsigned long long might have disastrous effects for a negative value (sign bit set) of hi

  • More minor ones (that aren’t worth mentioning individually)
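To make the id() versus addressof() point concrete, and to cover the 1 MB allocation asked about in the question, here is a minimal sketch. It needs no DLL. The helper names make_buffer and split_address are made up for illustration, and the payload is the dummy data from the question:

```python
import ctypes as ct

BUF_SIZE = 1024 * 1024  # 1 MiB, the size mentioned in the question


def make_buffer(payload, size=BUF_SIZE):
    """Allocate a zero-filled C byte array and copy payload to its start."""
    if len(payload) > size:
        raise ValueError("payload does not fit into the buffer")
    buf = (ct.c_ubyte * size)()  # ctypes arrays start out zero-initialized
    ct.memmove(buf, payload, len(payload))
    return buf


def split_address(addr):
    """Split an address into (hi, lo) unsigned 32-bit halves."""
    return (addr >> 32) & 0xFFFFFFFF, addr & 0xFFFFFFFF


payload = bytes([0x13, 0x02, 0x03, 0x04, 0x08, 0xA5])
buf = make_buffer(payload)

addr = ct.addressof(buf)  # address of the C memory, this is what the DLL needs
hi, lo = split_address(addr)

print("addressof: 0x{:016X}, hi: 0x{:08X}, lo: 0x{:08X}".format(addr, hi, lo))
print("id:        0x{:016X} (address of the Python wrapper object)".format(id(buf)))
print(bytes(buf[:len(payload)]))
```

Keep a reference to buf for as long as the DLL may touch the memory; once the array object is garbage collected, the address is no longer valid.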

I prepared a small example.

dll00.c:

#include <stdint.h>
#include <stdio.h>


#if defined(_WIN32)
#  define DLL00_EXPORT_API __declspec(dllexport)
#else
#  define DLL00_EXPORT_API
#endif

#define BYTE unsigned char


static uint32_t addrLo = 0;
static uint32_t addrHi = 0;


#if defined(__cplusplus)
extern "C" {
#endif

DLL00_EXPORT_API void setAddrLo(int lo);
DLL00_EXPORT_API void setAddrHi(int hi);
DLL00_EXPORT_API void modifyArray();

#if defined(__cplusplus)
}
#endif


void setAddrLo(int lo) {
    addrLo = (uint32_t)lo;
}

void setAddrHi(int hi) {
    addrHi = (uint32_t)hi;
}

static BYTE* toPtr(uint32_t hi, uint32_t lo) {
#if SIZE_MAX == 0xffffffffffffffffull
    uint64_t quad = ((uint64_t)hi << 32) + lo;
    printf("C  - Addr: 0x%016llX, Hi: 0x%08X, Lo: 0x%08X\n", quad, hi, lo);
    return (BYTE*)quad;
#elif SIZE_MAX == 0xfffffffful
    printf("C  - Addr: 0x%016llX, Hi: 0x%08X, Lo: 0x%08X\n", (unsigned long long)lo, hi, lo);
    return (BYTE*)lo;
#else
#  error "Neither 64bit nor 32bit architecture"
#endif
}

void modifyArray() {  // A 'size' argument would make sense to specify the maximum array index.
    BYTE *addr = toPtr(addrHi, addrLo);
    if (addr == NULL) {
        printf("C  - NULL pointer!\n");
        return;
    }
    addr[0] *= 2;
    addr[1] = 2;
    addr[10] = addr[0];
}
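
A side note on why dll00.c stores both halves as uint32_t before joining them: a half with its sign bit set becomes negative when held in a signed int, and widening it as-is corrupts the result. The sketch below reproduces the arithmetic in Python and uses the low half, where the negative value spills into the upper 32 bits. The address is made up, and join_signed / join_unsigned are illustrative names, not part of any API:

```python
import ctypes as ct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def join_signed(hi, lo):
    """Naive join: use the signed halves as-is, then keep 64 bits."""
    return ((hi << 32) + lo) & MASK64


def join_unsigned(hi, lo):
    """Safe join: reinterpret each half as unsigned 32-bit first."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


addr = 0x00000203875B23B0  # made-up address whose low half has bit 31 set

# What a C 'int' parameter would hold for each half (ctypes wraps silently).
hi = ct.c_int32(addr >> 32).value
lo = ct.c_int32(addr & MASK32).value

print("lo as signed int: {:d}".format(lo))
print("naive join: 0x{:016X}".format(join_signed(hi, lo)))
print("safe join:  0x{:016X}".format(join_unsigned(hi, lo)))
```

The naive join ends up one short in the high half, so it points 4 GiB below the real buffer.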

code00.py:

#!/usr/bin/env python

import sys
import ctypes as ct


DLL_NAME = "./dll00.{:s}".format("dll" if sys.platform[:3].lower() == "win" else "so")
BUF_LEN = 20


def main(*argv):
    dll = ct.CDLL(DLL_NAME)
    set_addr_lo = dll.setAddrLo
    set_addr_lo.argtypes = (ct.c_int,)
    set_addr_lo.restype = None
    set_addr_hi = dll.setAddrHi
    set_addr_hi.argtypes = (ct.c_int,)
    set_addr_hi.restype = None
    modify_array = dll.modifyArray
    modify_array.argtypes = ()
    modify_array.restype = None

    b = b"\x20\x03\x20"
    Array = ct.c_char * BUF_LEN
    buf = Array(*b)
    print("PY - Array: {:}".format(list(ord(i) for i in buf)))
    addr = ct.addressof(buf)  # The reverse of what's done in the .dll (toPtr) - doesn't make much sense
    if sys.maxsize > 0x100000000:
        lo = addr & 0xffffffff
        hi = (addr >> 32) & 0xffffffff
    else:
        hi = 0
        lo = addr
    print("PY - Addr: 0x{:016X}, Hi: 0x{:08X}, Lo: 0x{:08X}".format(addr, hi, lo))
    set_addr_lo(lo)
    set_addr_hi(hi)
    modify_array()
    print("PY - Array: {:}".format(list(ord(i) for i in buf)))


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.")
    sys.exit(rc)

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q068304564]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[prompt]> "c:\Install\pc032\Microsoft\VisualStudioCommunity\2019\VC\Auxiliary\Build\vcvarsall.bat" x64 >nul

[prompt]> dir /b
code00.py
dll00.c

[prompt]>
[prompt]> cl /nologo /MD /DDLL dll00.c  /link /NOLOGO /DLL /OUT:dll00.dll
dll00.c
   Creating library dll00.lib and object dll00.exp

[prompt]> dir /b
code00.py
dll00.c
dll00.dll
dll00.exp
dll00.lib
dll00.obj

[prompt]>
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.08.07_test0\Scripts\python.exe" code00.py
Python 3.8.7 (tags/v3.8.7:6503f05, Dec 21 2020, 17:59:51) [MSC v.1928 64 bit (AMD64)] 064bit on win32

PY - Array: [32, 3, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
PY - Addr: 0x00000203275B23B0, Hi: 0x00000203, Lo: 0x275B23B0
C  - Addr: 0x00000203275B23B0, Hi: 0x00000203, Lo: 0x275B23B0
PY - Array: [64, 2, 32, 0, 0, 0, 0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Done.
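
If you want to check the Python side without compiling anything, the round trip can be sketched with a pure-Python stand-in for the DLL. Note that fake_modify_array is a hypothetical helper written for this illustration. It mimics what modifyArray does in dll00.c by writing through a pointer rebuilt from the hi and lo halves:

```python
import ctypes as ct

BUF_LEN = 20


def fake_modify_array(addr):
    """Pure-Python stand-in for modifyArray (not part of the DLL)."""
    if not addr:
        print("PY - NULL pointer!")
        return
    ptr = ct.cast(addr, ct.POINTER(ct.c_ubyte))
    ptr[0] = (ptr[0] * 2) & 0xFF  # same wrap-around as an unsigned char
    ptr[1] = 2
    ptr[10] = ptr[0]


buf = (ct.c_ubyte * BUF_LEN)(0x20, 0x03, 0x20)
addr = ct.addressof(buf)
hi = (addr >> 32) & 0xFFFFFFFF
lo = addr & 0xFFFFFFFF

# Rebuild the address from its halves, as toPtr does on the C side.
fake_modify_array((hi << 32) | lo)

print(list(buf))
# [64, 2, 32, 0, 0, 0, 0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The printed list matches the second "PY - Array" line above, which suggests the address handling on the Python side is sound before the real DLL gets involved.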
Answered By: CristiFati

Thanks for the amazing answer! I was looking for an example like this. I would like to mention that I got this warning while compiling dll00.c:

format ‘%llX’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’  

so on line 42 I changed 0x%016llX to 0x%016lX. For the sake of completeness, if you are a newbie like me: to generate the shared library on Linux, you need to run the following commands

gcc -c -fpic dll00.c
gcc -shared -o dll00.so dll00.o
Answered By: APaul31