Workaround for Cython bindings to a C++ library that lacks a nullary constructor?

Question:

I’m working on a set of Cython bindings to expose a C++ compression library in
Python. The library uses the Pimpl pattern and, in particular, has no default
empty constructors. I’m wrestling with using a class method that returns a
CompressionConfig object, which I can’t allocate on the stack in Cython due
to the missing empty constructor.

The class with the CompressionConfig return has the following signature:

    cdef cppclass LZ4Manager "nvcomp::LZ4Manager":
        LZ4Manager (
            size_t uncomp_chunk_size,
            nvcompType_t data_type,
            cudaStream_t user_stream,
            const int device_id
        ) except +
        CompressionConfig configure_compression (
            const size_t decomp_buffer_size
        ) except +
        void compress(
            const uint8_t* decomp_buffer, 
            uint8_t* comp_buffer,
            const CompressionConfig& comp_config
        ) except +
        DecompressionConfig configure_decompression (
            const uint8_t* comp_buffer
        ) except +
        DecompressionConfig configure_decompression (
            const CompressionConfig& comp_config
        ) except +
        void decompress(
            uint8_t* decomp_buffer, 
            const uint8_t* comp_buffer,
            const DecompressionConfig& decomp_config
        ) except +
        void set_scratch_buffer(uint8_t* new_scratch_buffer) except +
        size_t get_required_scratch_buffer_size() except +
        size_t get_compressed_output_size(uint8_t* comp_buffer) except +

configure_compression is giving me trouble now. The class signature of its return type, CompressionConfig, is:

    cdef cppclass CompressionConfig "nvcomp::CompressionConfig":
        const size_t uncompressed_buffer_size 
        const size_t max_uncompressed_buffer_size 
        const size_t num_chunks 
        CompressionConfig(
            PinnedPtrPool[nvcompStatus_t]* pool,
            size_t uncompressed_buffer_size) except +
        nvcompStatus_t* get_status() const
        CompressionConfig (CompressionConfig&& other) except +
        CompressionConfig (const CompressionConfig& other) except +
        CompressionConfig& operator= (CompressionConfig&& other) except +
        CompressionConfig& operator= (const CompressionConfig& other) except +

I’m trying to find the right Cython invocation that will let me use one of the
existing constructors for the CompressionConfig object returned from LZ4Manager
and store it in a pointer.

I can’t use the obvious approach, because it fails to compile:

    cdef class _LZ4Compressor:
        cdef CompressionConfig _config
        cdef configure_compression(self, const size_t decomp_buffer_size):
            self._config = self._impl.configure_compression(decomp_buffer_size)

Error: C++ class must have a nullary constructor to be stack allocated

So I’m trying to discover workarounds and becoming concerned there isn’t one. The next most
obvious solution seems to be:

    cdef class _LZ4Compressor:
        cdef CompressionConfig* _config
        cdef configure_compression(self, const size_t decomp_buffer_size):
            self._config = new CompressionConfig(
                self._impl.configure_compression(decomp_buffer_size)
            )

Error: ambiguous overloaded method

Normally I’d assume this is because Cython can’t decide which constructor to use,
which is easy to resolve by explicitly casting the object:

    cdef configure_compression(self, const size_t decomp_buffer_size):
        self._config = new CompressionConfig(
            <CompressionConfig&&>self._impl.configure_compression(decomp_buffer_size)
        )
        
Error: ambiguous overloaded method

But the specific constructor still can’t be identified. I need help figuring out
a workaround for the fact that the library I’m using doesn’t provide a nullary
constructor, but deals frequently with stack-allocated C++ objects. Is there any way
for me to wrap the inner self._impl.configure_compression(decomp_buffer_size) that
will prevent Cython from trying to store it in a temporary lvalue
CompressionConfig object when the .cxx is written?

I have been able to get the .cxx to compile by cramming the return
value into complicated nestings of shared_ptr[CompressionConfig*],
but the .cxx still has the stack-allocated CompressionConfig
temporary object. I’ll try to get back into that state and share the
shape of it here, but for now I need to drop the C++ wrapper and
work on the C bindings, which won’t have the same challenges.

Asked By: Thomson Comer

||

Answers:

    cdef cppclass CompressionConfig "nvcomp::CompressionConfig":
        CompressionConfig (CompressionConfig&& other) except +
        CompressionConfig (const CompressionConfig& other) except +
        CompressionConfig& operator= (CompressionConfig&& other) except +
        CompressionConfig& operator= (const CompressionConfig& other) except +

I think part of the issue is that Cython doesn’t recognise rvalue references (except in very limited contexts), so it can’t tell whether you’re trying to call
CompressionConfig (CompressionConfig&& other) except + or CompressionConfig (const CompressionConfig& other) except +. I suggest you declare it just once, as CompressionConfig (CompressionConfig& other), and it’ll generate the right code (even if it doesn’t quite match the C++ declarations).
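The reason a single declaration can be enough is that Cython only needs to know that some one-argument constructor exists; the C++ compiler then performs the real overload resolution on the generated code. As a rough C++ sketch of that second step (Config is a hypothetical stand-in for nvcomp::CompressionConfig, and the built_by field exists only for illustration), the compiler selects the copy or move constructor from the value category of the argument:

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for nvcomp::CompressionConfig: no default constructor,
// but both a copy and a move constructor. built_by records which one ran.
struct Config {
    std::string built_by;
    explicit Config(int) : built_by("value") {}
    Config(const Config&) : built_by("copy") {}
    Config(Config&&) noexcept : built_by("move") {}
};

// Constructing from an lvalue: the C++ compiler selects the copy constructor.
std::string from_lvalue() {
    Config source(1);
    Config target(source);
    return target.built_by;
}

// Constructing from an rvalue: the C++ compiler selects the move constructor.
std::string from_rvalue() {
    Config source(1);
    Config target(std::move(source));
    return target.built_by;
}
```

Here from_lvalue() reports "copy" and from_rvalue() reports "move", even though the calling code never names a specific overload.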

Note that if you use a recent Cython 3 alpha, it’ll insert std::move where it can to avoid copies.


The other Cython 3 alpha feature you might try is the cpp_locals directive. That implements the code in terms of std::optional (essentially with the aim of making C++ variables behave more like Python variables and not default-constructing them at the start of each scope). It’s fairly new though so probably not hugely well tested at this stage, but it might well do what you need without implementing everything in pointers.
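To illustrate the idea behind that directive, here is a minimal C++17 sketch of how std::optional lets a member of a type with no default constructor start out empty and be filled in later. It is not the code Cython generates; Config, Compressor, configure_compression and the chunk arithmetic are all hypothetical stand-ins invented for the example.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical stand-in for a config type with no default constructor.
struct Config {
    std::size_t num_chunks;
    explicit Config(std::size_t n) : num_chunks(n) {}
};

// Hypothetical stand-in for LZ4Manager::configure_compression; the chunk
// arithmetic is made up purely so the example has a value to return.
Config configure_compression(std::size_t buffer_size) {
    return Config(buffer_size / 4096 + 1);
}

struct Compressor {
    // Starts out empty, so Config never needs a default constructor.
    std::optional<Config> config;

    void configure(std::size_t buffer_size) {
        // Construct the stored Config from the returned temporary.
        config.emplace(configure_compression(buffer_size));
    }
};

// Returns the chunk count after configuring, or 0 if the optional was
// unexpectedly non-empty before configure() ran.
std::size_t chunks_after_configure(std::size_t buffer_size) {
    Compressor c;
    bool was_empty = !c.config.has_value();
    c.configure(buffer_size);
    return was_empty ? c.config->num_chunks : 0;
}
```

The optional plays the role of an "unbound" variable: nothing is constructed until configure() runs, which is the behaviour the stack-allocation error was preventing.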

Answered By: DavidW

My colleague Ashwin Srinath provided me with a solution. It uses the move and shared_ptr semantics in two steps:

    cdef shared_ptr[CompressionConfig] _config
    cdef shared_ptr[CompressionConfig] partial = make_shared[CompressionConfig](
        self._impl.configure_compression(decomp_buffer_size)
    )

This stores the intermediate result in a shared_ptr (partial) and avoids placing it on the stack by constructing it with make_shared.

    self._config = make_shared[CompressionConfig](move(partial.get()[0]))

This dereferences partial to get the actual CompressionConfig object, then moves it into self._config.
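For readers more comfortable with C++ than Cython, the two steps correspond roughly to the following sketch. Config, Compressor and configure_compression are hypothetical stand-ins for the nvcomp types, not the library's actual definitions.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for nvcomp::CompressionConfig: copyable and movable,
// but with no default constructor.
struct Config {
    std::size_t uncompressed_buffer_size;
    explicit Config(std::size_t n) : uncompressed_buffer_size(n) {}
    Config(const Config&) = default;
    Config(Config&&) noexcept = default;
};

// Hypothetical stand-in for LZ4Manager::configure_compression.
Config configure_compression(std::size_t n) { return Config(n); }

struct Compressor {
    std::shared_ptr<Config> config;  // null until configure() is called

    void configure(std::size_t n) {
        // Step 1: heap-allocate a Config built from the returned value.
        std::shared_ptr<Config> partial =
            std::make_shared<Config>(configure_compression(n));
        // Step 2: dereference partial and move the object into the member.
        config = std::make_shared<Config>(std::move(*partial));
    }
};

// Configures a fresh Compressor and reports the stored buffer size.
std::size_t configured_size(std::size_t n) {
    Compressor c;
    c.configure(n);
    return c.config->uncompressed_buffer_size;
}
```

In plain C++ the first make_shared alone would suffice (config could simply take ownership of partial); the sketch keeps both steps only to mirror the Cython code above.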

Answered By: Thomson Comer