The idea behind mapped types is that the library uses some C++ types that Python programmers shouldn't need to worry about. The big, obvious example is C++ strings. So, the bindings should silently convert such types to and from Python types. Of course, there is no way to automatically create the code to convert the mapped types; it has to be supplied to the generator. The existing wxPython has hand-written .sip files that contain the code. Since the cffi generator doesn't have an intermediary format like the .sip files, it will have cffi-specific tweaker scripts that include the code.
For sip, the actual conversion code can be much simpler than it can for the cffi generator. CPython's API means that one block of code can manipulate both the C++ objects and the Python objects. It's not really possible to do the same thing in the cffi bindings. Code for interacting with the mapped types directly from Python isn't generated, both because the user won't need to interact with the objects and because no information about the interface of the mapped type is provided to the generator. So instead, the solution I came up with was to split the conversion (in each direction) into two parts: a to-C conversion and a from-C conversion. The idea is that, using cffi, C data types can be an intermediary between the Python and C++ types. This gives us Python->C and C->Python code that is written in Python, and C++->C and C->C++ code that is written in C++. Additionally, the to/from Python code is called only from Python and the to/from C++ code is called only from C++ (with one exception), reducing the total number of boundary crossings.
An example of what the conversion code can look like for wxString:
    //Cpp2C
    //malloc must be used instead of new so that this data can be freed from Python.
    //The +1 leaves room for the terminating NUL that strcpy writes.
    char *cstr = (char*)malloc(cpp_obj->length() + 1);
    strcpy(cstr, cpp_obj->c_str());
    return cstr;

    //C2Cpp
    //We don't have to free the cdata here because it was allocated from ffi.new
    return new wxString(cdata);

    # Py2C
    cdata = ffi.new('char[]', obj)
    # Py2C always returns two values: the actual cdata and a keepalive object. The
    # latter is needed when creating, for example, an array of strings
    return (cdata, None)

    # C2Py
    obj = ffi.string(cdata)
    # Explicit freeing is necessary here, unfortunately
    clib.free(cdata)
    return obj
A couple of comments about the above code: C->C++ code should always return a pointer to a heap-allocated object so that the same block of code can be used even when the library expects to take ownership of the new object. While using ffi.new in the Python->C code means the C->C++ code doesn't have to free any memory, the C->Python code will always have to clean up after the C++->C code.
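To make the ownership rules concrete, here is a minimal, self-contained sketch of the Python half of the round trip. It is an illustration under several assumptions, not the generator's actual output: it uses cffi in ABI mode against the standard C library on a POSIX system, it encodes strings as UTF-8 (needed on Python 3, where ffi.new('char[]', ...) wants bytes), and fake_cpp2c is a hypothetical stand-in for the C++->C code, written in Python only so the example can run without a compiled library.

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("""
    void *malloc(size_t size);
    void free(void *ptr);
""")
# Assumption: a POSIX system, where dlopen(None) exposes the C library.
clib = ffi.dlopen(None)


def py2c(obj):
    # Python -> C. The memory is owned by the returned cdata object and is
    # released when that object is garbage collected, so the C -> C++ side
    # never has to free it.
    cdata = ffi.new("char[]", obj.encode("utf-8"))
    return (cdata, None)  # (cdata, keepalive)


def c2py(cdata):
    # C -> Python. The buffer came from malloc on the native side, so it has
    # to be freed explicitly once its contents have been copied out.
    obj = ffi.string(cdata).decode("utf-8")
    clib.free(cdata)
    return obj


def fake_cpp2c(text):
    # Hypothetical stand-in for the C++ -> C conversion: a malloc'ed,
    # NUL-terminated copy of the string.
    raw = text.encode("utf-8") + b"\0"
    buf = ffi.cast("char *", clib.malloc(len(raw)))
    ffi.memmove(buf, raw, len(raw))
    return buf
```

Calling c2py(fake_cpp2c("some text")) hands back an ordinary Python string and leaves nothing allocated, while the cdata returned by py2c stays valid only for as long as a Python reference to it exists.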
Of course, there are a few issues with this approach. First of all, there is some overhead associated with allocating objects on the heap and then almost immediately freeing them. But compared to the other alternative I came up with, this one involves fewer boundary crossings and performs much better. The second problem arises from virtual methods messing things up a little. In all other circumstances, the Python->C code can use ffi.new to allocate the objects that get passed to the native function, but in a virtual function, the object has to be returned rather than passed as a parameter. This prevents us from doing the C->C++ conversion in the C++ virtual function: by the time it runs, the Python-created cdata will be out of scope and potentially garbage collected. So in the virtual method handler, calling the C->C++ conversion from Python code is necessary (this is the exception I mentioned). The last issue, also related to virtual methods, is one that I fear is simply unsolvable. When a virtual method returns a pointer to a mapped type, there is no way of making sure that the object we allocate in the C->C++ conversion code is ever deleted. I think this is a problem sip has too, though.
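The lifetime concern here is the same one behind the keepalive value that Py2C returns: memory from ffi.new lives only as long as the Python object that owns it. As a rough sketch of the array-of-strings case mentioned in the Py2C comment, the function below (py2c_string_array is a hypothetical name, and the UTF-8 encoding is my assumption) builds a char *[] whose elements point into separately allocated buffers, and returns those buffers as the keepalive.

```python
from cffi import FFI

ffi = FFI()


def py2c_string_array(strings):
    # Each element gets its own ffi.new buffer. The outer array stores only
    # raw pointers, so it does not keep these buffers alive by itself.
    keepalive = [ffi.new("char[]", s.encode("utf-8")) for s in strings]
    cdata = ffi.new("char *[]", keepalive)
    # The caller must hold on to keepalive for as long as cdata is in use;
    # otherwise the pointers inside cdata may dangle.
    return (cdata, keepalive)
```

A caller would keep both return values in local variables until the native call has returned, which is exactly what a virtual method handler cannot do once it has returned its result.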
So, now that I have mapped types taken care of, I plan to continue working on implementing the various annotations. Expect my next post to be a bunch about them.