Wednesday, July 31, 2013

Progress update, virtual methods

I've been slowly adding features to the generator over the last week.  It's far from complete, but at the moment it works for generating classes and methods/functions.  Most recently I've been working on overriding virtual methods with Python methods.  The implementation of this is interesting, so I thought I'd talk about it a little.

The place to start, I suppose, is to explain why this is useful.  The idea is that a subclass of a C++ class created in Python should be as useful as a regular C++ subclass.  An important part of this is that when the user overrides a method in the Python subclass, they should be able to expect it to act as if they had overridden that method in a C++ subclass.  That is to say, provided the method is virtual, when the method is called on the C++ instance through a base class pointer, the new implementation should be called.  An example is wxPython's App.OnInit (aka wxApp::OnInit).  This method is not (typically) called by users themselves, but instead by the library.  It is intended to be overridden in a subclass to allow the user to perform some setup before the application starts.

The first step to implementing virtual methods is to generate a subclass of the C++ class we're wrapping.  When we create instances of the Python wrapper class, we'll create an instance of this subclass instead of the original C++ class.  The reason we want this subclass is so that we can re-implement every virtual method.  These re-implementations can then call Python functions via function pointers created using cffi.  Each Python function looks up the Python wrapper object for the `this` pointer passed in from the virtual method, converts the arguments into Python types, and then calls the corresponding method on the Python wrapper.  The value returned by the Python callback is then converted as necessary and returned to the virtual method.

That seems simple enough, but we have a major problem: there is a bunch of overhead associated with calling a Python callback from C/C++ and with converting the arguments.  In fact, most of the time the virtual methods aren't overridden and all of that overhead is for nothing.  Since we're already creating a subclass, we can add a set of flags - one for each virtual method - to track whether or not a method has been overridden in Python.  A virtual re-implementation can then simply call the base implementation of the method if the corresponding flag has not been set.  We can then use a metaclass to figure out which methods have been overridden by a Python subclass and build a set of default flags for instances of that subclass.
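A minimal sketch of how such a metaclass could compute the default flags follows.  The names `WrapperMeta`, `_virtuals_`, `_generated_` and `_default_flags_` are made up for this example, and the flags live in a plain dict here rather than in the C++ object.

```python
class WrapperMeta(type):
    """Builds a dict of default "overridden" flags for each class."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        flags = {}
        for method in getattr(cls, "_virtuals_", ()):
            # Find the first class in the MRO that defines the method.
            owner = next(k for k in cls.__mro__ if method in vars(k))
            # It counts as overridden unless that class is a generated
            # wrapper (marked by `_generated_` in its own namespace).
            flags[method] = "_generated_" not in vars(owner)
        cls._default_flags_ = flags


class App(metaclass=WrapperMeta):
    """Stand-in for a generated wrapper class."""
    _generated_ = True
    _virtuals_ = ("OnInit", "OnExit")

    def OnInit(self):
        return True

    def OnExit(self):
        return 0


class MyApp(App):
    def OnInit(self):  # overrides one of the two virtual methods
        return False
```

With this, `MyApp._default_flags_` marks only `OnInit` as overridden, so a call to `OnExit` from C++ could go straight to the base implementation without ever entering Python.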

That handles the more C++-like use case of a subclass overriding a method from its superclass, but what about something a bit more Pythonic: replacing a method on a single instance?  We already have most of what we need in place for this.  We just need to update the flag when a Python method corresponding to a virtual method gets changed.  The solution that I ended up using was to wrap the virtual methods in a descriptor.  When the descriptor's __set__ method is triggered, it sets the flag on the object, but only if the object was created by Python.  The reason for that last check is that we can't guarantee that an object that wasn't created by Python will have the flags field, and if it doesn't, trying to set a flag could cause a segfault or could silently corrupt data.
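Something along these lines would do it.  This is a sketch under assumptions, not the generator's code: `VirtualMethod`, `_py_created` and `_flags` are hypothetical names, and the flags are kept in a Python dict instead of a field on the C++ object.

```python
import types


class VirtualMethod:
    """Data descriptor that records per-instance overrides."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Prefer a replacement assigned to this particular instance.
        override = obj.__dict__.get(self.name)
        if override is not None:
            return override
        return types.MethodType(self.func, obj)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
        # Only touch the flag when the C++ object was created by Python,
        # since only then is it known to have a flags field.
        if obj._py_created:
            obj._flags[self.name] = True


class App:
    def __init__(self, py_created=True):
        self._py_created = py_created
        self._flags = {"OnInit": False}  # stand-in for the C++ flags

    @VirtualMethod
    def OnInit(self):
        return True


app = App()
app.OnInit = lambda: False       # replace the method on one instance

foreign = App(py_created=False)  # pretend C++ created this object
foreign.OnInit = lambda: False   # stored, but the flag is left alone
```

Because the descriptor defines `__set__`, it takes priority over the instance dictionary, which is why `__get__` has to look there itself for a replacement.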

The last thing, which I don't have implemented yet, is to handle the case where a method is changed on the Python class itself.  This is a little more complicated than the last case since we have to update the flag on every single instance of the class.  The solution here, I think, is to use a WeakSet to keep track of the instances.  When a method is changed, we could iterate over the WeakSet and change the flag on every live instance.  We can detect that a virtual method is being changed by using a __setattr__ method on the metaclass.  I haven't tested it yet, but I think that the overhead of adding instances to the set when they're created shouldn't be too large considering the other things that go on in the constructor anyway, such as calling the C++ constructor.
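Since this part isn't implemented yet, the following is only a sketch of how the idea might look.  The names `WrapperMeta`, `_virtuals_`, `_instances` and `_flags` are hypothetical, and it ignores subclasses: each class only tracks instances created directly from it.

```python
import weakref


class WrapperMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._instances = weakref.WeakSet()  # live instances of this class

    def __call__(cls, *args, **kwargs):
        obj = super().__call__(*args, **kwargs)
        cls._instances.add(obj)  # track without keeping the object alive
        return obj

    def __setattr__(cls, name, value):
        super().__setattr__(name, value)
        # If a virtual method was replaced on the class, flag every
        # instance that is still alive.
        if name in getattr(cls, "_virtuals_", ()):
            for obj in cls._instances:
                obj._flags[name] = True


class App(metaclass=WrapperMeta):
    _virtuals_ = ("OnInit",)

    def __init__(self):
        self._flags = {"OnInit": False}  # stand-in for the C++ flags

    def OnInit(self):
        return True


a = App()
b = App()
App.OnInit = lambda self: False  # replace the method on the class itself
```

After the assignment both `a` and `b` have their `OnInit` flag set, and instances that have already been garbage collected simply drop out of the set.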

So, for the next week or two, I'm going to continue working on the generator.  I think after I put a few finishing touches on the virtual method handling code, I'll work on a few of the easier things like enumerators and global variables.
