Unfortunately, I have been unable to complete the entire project. Here's where things stand right now: enough of the ETG (tweaker) scripts have been converted for the main module to be imported and for a number of the simple demos to run. That said, a bunch of functionality is missing from the main module and none of the submodules (wx.html, wx.grid, etc.) work. Converting the remainder of the ETG scripts is just going to take more time. Many of them are simple enough, but some have a lot of additions that use the C-API and will require a significant amount of time and work.
I've spent the last couple of days working on some documentation. I've decided to focus on writing documentation for converting ETG scripts rather than for the internals of the generator. Although there are doubtless bugs in the generator yet to be fixed (I already know of at least a couple), I think most of the work that still needs to be done, and that is likely to be done by someone else, is working with the ETG scripts.
I don't plan on abandoning this project. I want to see it through to its completion.
I've asked permission to continue working on this project as a part of a class this semester. I haven't gotten a response just yet, but I'm hopeful.
Sunday, September 22, 2013
Wednesday, September 11, 2013
Sip annotations
I've finally finished implementing all of the sip annotations that are currently used by wxPython Phoenix. Barring some unexpected requirement (and doubtless there will be at least one), the generator itself should be complete.
To talk about sip annotations a bit: sip uses what it calls annotations, specified by extra code added to .sip files, to allow the default behavior of the generated bindings to be changed. Annotations can be applied to basically any C++ definition. The ones that are relevant here are annotations for classes, methods, method parameters, and variables.
An example of what an annotated function definition would look like in a .sip file (taken from sip's documentation):
void exec(QWidget * /Transfer/) /ReleaseGIL, PyName=call_exec/;
Since the existing wxPython code is pretty heavily tailored to sip, the tweaker scripts reference these annotations directly. This also makes sense, I suppose, since there isn't really a good, predefined way to specify the various behaviors that the annotations represent. The problem, however, was that some of the annotations are poorly documented.
The worst example of an annotation being poorly documented is the Transfer annotation. When applied to a function, Transfer indicates that the ownership of the return value is given to C++. That much is pretty clear from the documentation anyway. What is not documented is that when Transfer is applied to a Ctor, sip increments the refcount for the new object an extra time. The effect of this is that the (Python) object will not be garbage collected until either its (hopefully virtual) C++ Dtor is called or the ownership of the object is somehow changed.
For a milder example, I don't know how many times I read the documentation for the KeepReference annotation before I figured out how its 'keys' work. (And only after I implemented that functionality did I realize that wxPython never specifies the key to use...)
Moving forward, I now need to modify the tweaker scripts to make them compatible with the new generator. My plan for this is to remove all of the scripts, and add them back one by one. As I add them back, I plan on splitting any sip-specific and cffi-specific code into separate files which can be imported by the shared script depending on the generator being used.
To be perfectly honest, I don't expect to finish modifying all of the scripts before the 16th. For now, I've settled on trying to have most of the core module working by then. I wanted to be at this point about a month ago, but things always seem to take me longer than I plan for them to.
Thursday, August 22, 2013
Mapped Types
I started working on implementing annotations shortly after making my previous post. I implemented a couple of simple ones first, then started working on the Array annotation. While getting it to work for wrapped types, I realized that it would interact rather strangely with mapped types. I decided it would probably save me some headaches later if I implemented mapped types now rather than later, and so I implemented them (also, I meant to do it early and forgot...)
The idea behind mapped types is that there are some C++ types that the library uses that Python programmers shouldn't need to worry about. The big, obvious example is C++ strings. So, the bindings should silently convert such types to and from Python types. Of course, there is no way to automatically create the code to convert the mapped types; it has to be supplied to the generator. The existing wxPython has hand-written .sip files that contain the code. Since the cffi generator doesn't have an intermediary format like the .sip files, it will have cffi-specific tweaker scripts that include the code.
For sip, the actual conversion code can be much simpler than it can for the cffi generator. CPython's API means that one block of code can manipulate both the C++ objects and the Python objects. It's not really possible to do the same thing in the cffi bindings. Code for interacting with the mapped types directly from Python isn't generated because the user won't need to interact with the objects and because no information about the interface of the mapped type is provided to the generator. So instead, the solution I came up with was to split the conversion (in each direction) into two parts: a to-C conversion and a from-C conversion. The idea is that, using cffi, C data types can be an intermediary between the Python and C++ types. This gives us Python->C and C->Python code that is written in Python, while the C++->C and C->C++ code is written in C++. Additionally, the to/from Python code is called only from Python and the to/from C++ code is called only from C++ (with one exception), reducing the total number of boundary crossings.
An example of what the conversion code can look like for wxString:
    //Cpp2C
    //malloc must be used instead of new so that this data can be freed from Python
    //The +1 leaves room for the terminating NUL that strcpy writes
    char *cstr = (char*)malloc(cpp_obj->length() + 1);
    strcpy(cstr, cpp_obj->c_str());
    return cstr;

    //C2Cpp
    //We don't have to free the cdata here because it was allocated from ffi.new
    return new wxString(cdata);

    # Py2C
    cdata = ffi.new('char[]', obj)
    # Py2C always returns two values: the actual cdata and a keepalive object. The
    # latter is needed when creating, for example, an array of strings
    return (cdata, None)

    # C2Py
    obj = ffi.string(cdata)
    # Explicit freeing is necessary here unfortunately
    clib.free(cdata)
    return obj
A couple of comments about the above code: C->C++ code should always return a pointer to a heap-allocated object so that the same block of code can be used even when the library expects to take ownership of the new object. While using ffi.new in the Python->C code means that the C->C++ code doesn't have to free any memory, the C->Python code will always have to clean up after the C++->C code.
Of course, there are a few issues with this approach. First of all, there is some overhead associated with allocating objects on the heap and then almost immediately freeing them. But compared to the other alternative I came up with, this one involves fewer boundary crossings and performs much better. The second problem arises from virtual methods messing things up a little. In all other circumstances, the Python->C code can use ffi.new to allocate the objects that get passed to the native function, but in a virtual function, the object has to be returned rather than passed as a parameter. This prevents us from doing the C->C++ conversion in the C++ virtual function. By then, the Python-created cdata will be out of scope and potentially garbage collected. So in the virtual method handler, calling the C->C++ conversion from Python code is necessary (this is the exception I mentioned). The last issue, also related to virtual methods, is one that I fear is simply unsolvable. When a virtual method returns a pointer to a mapped type, there is no way of making sure that the object we allocate in the C->C++ conversion code is ever deleted. I think this is a problem sip has too, though.
So, now that I have mapped types taken care of, I plan to continue working on implementing the various annotations. Expect my next post to say a bunch about them.
Thursday, August 8, 2013
Multimethods
The last week has felt a lot more productive than the previous few. I've added a number of small things to the generator, including methods with custom C++ code, (working) protected methods, and overloaded methods. Overloaded methods presented kind of an interesting problem. A bit of background: the way I set up my multimethod code, overloads have their types specified in a decorator. So the declaration of a multimethod with one overload looks like:
    @wrapper_lib.MultiMethod
    def func():
        """A multimethod."""

    @func.overload(s=str)
    def func(s):
        print(s)

(Side note: I'd like to point out how awesome inspect.getargspec is. It made writing the multimethod support code way easier.)
The problem: the type for each variable has to already exist by the time the overload is created. Now you might be thinking, like I was when I first realized this was going to be a problem, that you can probably sort things so that the declaration for every class comes before it's required by some function. But there is one situation that such a solution could never solve: copy constructors. The copy constructor must have its own class as the type of its parameter. So, what I ended up doing was to move the actual overloads of the multimethods to the end of the module, after every class has been created. While this does impair the readability of the code some, I don't think anyone is likely to care much about it in this case.
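To make the ordering problem concrete, here is a rough sketch of a multimethod whose overloads are registered only after every class exists. It is not the actual wrapper_lib code: this `MultiMethod` class, `make_point`, and `Point` are hypothetical, and it reads parameter names with inspect.getfullargspec, the Python 3 relative of getargspec.

```python
import inspect


class MultiMethod(object):
    """Hypothetical multimethod: picks an overload by positional argument types."""

    def __init__(self, func):
        self.__doc__ = func.__doc__
        self._overloads = []

    def overload(self, **types):
        def decorator(impl):
            # Use the parameter order of the implementation to turn the
            # keyword -> type mapping into a positional signature.
            names = inspect.getfullargspec(impl).args
            signature = tuple(types[name] for name in names)
            self._overloads.append((signature, impl))
            return self  # keep the name bound to the multimethod
        return decorator

    def __call__(self, *args):
        for signature, impl in self._overloads:
            if len(args) == len(signature) and all(
                    isinstance(arg, kind) for arg, kind in zip(args, signature)):
                return impl(*args)
        raise TypeError("no overload matches the given arguments")


@MultiMethod
def make_point():
    """Create a Point from coordinates or from another Point."""


class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y


# The overloads sit at the end of the module, after Point exists, so the
# copy-constructor-like overload can name Point as its parameter type.
@make_point.overload(x=int, y=int)
def make_point(x, y):
    return Point(x, y)


@make_point.overload(other=Point)
def make_point(other):
    return Point(other.x, other.y)
```

Registering the `other=Point` overload before the `class Point` statement would raise a NameError, which is the problem that deferring the overloads works around.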
So anyway, ... As I add more functionality to the generator, I keep finding more things that I'll need to add later. Something that dawned on me fairly recently, and that I'm slightly dreading, is the prospect of adding support for all of the function and parameter sip annotations that wxPython uses. I'll probably write more about this later when I start actually adding support for them, but I'm predicting that dealing with them and the interactions between several of them could be a very frustrating experience.
Wednesday, July 31, 2013
Progress update, virtual methods
I've been slowly adding features to the generator over the last week. It's far from complete, but at the moment it works for generating classes and methods/functions. Most recently I've been working on implementing overriding virtual methods with Python methods. The implementation of this is interesting, so I thought I'd talk about it a little.
The place to start, I suppose, is to explain why this is something that is useful. The idea is that a subclass of a C++ class created in Python should be as useful as a regular C++ subclass. Important to this is that when the user overrides a method in the Python subclass, they should be able to expect it to act as if they had overridden that method in a C++ subclass. Which is to say that, provided the method is virtual, when the method is called on the C++ instance through a base class pointer, the new implementation should be called. An example is wxPython's App.OnInit (aka wxApp::OnInit). This method is not (typically) called by users themselves, but instead by the library. It is intended to be overridden in a subclass to allow the user to perform some setup before the application starts.
The first step to implementing virtual methods is to generate a subclass of the C++ class we're wrapping. When we create instances of the Python wrapper class, we'll create an instance of this subclass instead of the original C++ class. The reason we want this subclass is so that we can re-implement every virtual method. These re-implementations can then call Python functions via function pointers created using cffi. The Python functions look up the Python wrapper object for the `this` pointer from the virtual method, convert the arguments into Python types, and then call the corresponding method on the Python wrapper. The value returned by the Python callback is then converted as necessary and returned to the virtual method.
That seems simple enough, but we have a major problem: there is a bunch of overhead associated with calling a Python callback from C/C++ and with converting the arguments. In fact, most of the time the virtual methods aren't overridden and all of that overhead is for nothing. Since we're already creating a subclass, we can add a set of flags - one for each virtual method - to track whether or not a method has been overridden in Python. A virtual re-implementation can then simply call the base implementation of the method if the corresponding flag has not been set. We can then use a metaclass to figure out which methods have been overridden by a Python subclass and build a set of default flags for instances of that subclass.
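A rough, pure-Python sketch of the flag idea follows. It is not the generator's actual output: `VirtualMeta`, `_generated`, `_virtual_methods`, `_default_flags`, and `dispatch` are hypothetical names, the flags live in a dict rather than in the C++ subclass, and the Python 3 metaclass syntax is assumed.

```python
class VirtualMeta(type):
    """Hypothetical metaclass that records which virtual methods a class overrides."""

    def __init__(cls, name, bases, namespace):
        super(VirtualMeta, cls).__init__(name, bases, namespace)
        flags = {}
        for method in getattr(cls, "_virtual_methods", ()):
            # Find the most derived class that defines the method.
            owner = next(k for k in cls.__mro__ if method in vars(k))
            # Only definitions outside the generated wrappers count as overrides.
            flags[method] = not vars(owner).get("_generated", False)
        cls._default_flags = flags


class App(metaclass=VirtualMeta):
    """Stand-in for a generated wrapper class."""
    _generated = True
    _virtual_methods = ("OnInit", "OnExit")

    def OnInit(self):
        return True

    def OnExit(self):
        return 0


class MyApp(App):
    """A user subclass that overrides one virtual method."""

    def OnInit(self):
        return False


def dispatch(wrapper, method, base_implementation):
    """Mimic a C++ re-implementation: only call into Python if the flag is set."""
    if type(wrapper)._default_flags[method]:
        return getattr(wrapper, method)()
    return base_implementation()
```

Here `MyApp._default_flags` marks only `OnInit` as overridden, so `dispatch` would go straight to the base implementation for `OnExit` without paying for a Python callback.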
That handles the more C++-like use case of a subclass overriding a method from its superclass, but what about something a bit more Pythonic: replacing a method on a single instance? We already have most of what we need in place for this. We just need to update the flag when a Python method corresponding to a virtual method gets changed. The solution that I ended up using was to use a descriptor to wrap the virtual methods. When the descriptor's __set__ method is triggered, it sets the flag on the object, but only if the object was created by Python. The reason for that last check is that we can't guarantee that an object that wasn't created by Python will have the flags field, and if it doesn't, trying to set a flag could cause a segfault or could silently corrupt data.
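The descriptor approach might look roughly like the sketch below. The names `VirtualMethod`, `_py_created`, and `_flags` are hypothetical, and a dict stands in for the flags field that would really live on the C++ object.

```python
import types


class VirtualMethod(object):
    """Hypothetical descriptor wrapping a virtual method on a wrapper class."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Defining __set__ makes this a data descriptor, which takes priority
        # over the instance dict, so look up a per-instance replacement by hand.
        replacement = obj.__dict__.get(self.name)
        if replacement is not None:
            return replacement
        return types.MethodType(self.func, obj)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
        # Only touch the flags of objects created from Python; others may
        # not have a flags field at all.
        if obj._py_created:
            obj._flags[self.name] = True


class Window(object):
    """Stand-in for a generated wrapper class."""

    def __init__(self, py_created=True):
        self._py_created = py_created
        self._flags = {"Destroy": False} if py_created else None

    @VirtualMethod
    def Destroy(self):
        return "base"
```

Assigning `window.Destroy = some_callable` on a Python-created instance goes through `__set__` and flips the flag; on an object that did not come from Python the replacement is stored but the flags are left alone.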
The last thing, which I don't have implemented yet, is to handle when a method is changed on the Python class itself. This is a little more complicated than the last case since we have to update the flag on every single instance of the class. The solution here, I think, is to use a WeakSet to keep track of the instances. When a method is changed, we could iterate over the WeakSet and change the flag on every active instance. We can detect that a virtual method is being changed by using a __setattr__ method on the metaclass. I haven't tested it yet, but I think that the overhead of adding instances to the set when they're created shouldn't be too large considering the other things that go on in the constructor anyway, such as calling the C++ constructor.
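Since this part isn't implemented yet, the following is only a sketch of how the WeakSet idea could work, not a description of the generator. `TrackingMeta`, `_instances`, `_virtual_methods`, and `_flags` are hypothetical names, and Python 3 metaclass syntax is assumed.

```python
import weakref


class TrackingMeta(type):
    """Hypothetical metaclass that tracks live instances and watches for patched methods."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._instances = weakref.WeakSet()

    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        # Register with every tracked class in the MRO so that patching a
        # base class also reaches instances of its subclasses.
        for klass in cls.__mro__:
            if isinstance(klass, TrackingMeta):
                klass._instances.add(instance)
        return instance

    def __setattr__(cls, name, value):
        super().__setattr__(name, value)
        if name in cls._virtual_methods:
            # A virtual method was replaced on the class itself: flag every
            # instance that is still alive.
            for instance in cls._instances:
                instance._flags[name] = True


class Window(metaclass=TrackingMeta):
    """Stand-in for a generated wrapper class."""
    _virtual_methods = ("Destroy",)

    def __init__(self):
        self._flags = {"Destroy": False}

    def Destroy(self):
        return "base"
```

After `Window.Destroy = replacement`, every live instance has its `Destroy` flag set, while instances that have already been garbage collected simply drop out of the WeakSet.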
So, for the next week or two, I'm going to continue working on the generator. I think after I put a few finishing touches on the virtual method handling code, I'll work on a few of the easier things like enumerators and global variables.
Tuesday, July 23, 2013
Progress Update
I've been working on the generator code this week, though I haven't made quite as much progress as I had hoped. I ended up restarting a few times before I came up with a plan that I liked. The problem was that, the way the build system is set up, each ETG script is run separately, without interacting with the others at all. This means that when the generator handles, for example, the wxFrame class, it doesn't have any information about the wxWindow class (from which wxFrame derives). That's fine for the sip generator, since it outputs intermediary files which the sip executable then processes. It's a problem for the cffi generator because it has to output the actual C++ and Python files for the bindings. So the solution I settled on was to pickle the output of the ETG scripts and do the actual generation in a separate step. By loading the pickles from scripts that are referenced in the one currently being processed, the generator can look up information about classes handled in other scripts or even other modules.
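The two-step idea can be sketched as follows. This is only an illustration: the file layout, the dictionary format, and the helper names (`dump_module`, `load_module`, `find_class`) are assumptions, not what the build system actually uses.

```python
import os
import pickle
import tempfile


def dump_module(directory, script_name, classes):
    """Step 1: each script pickles its own output, independently of the others."""
    path = os.path.join(directory, script_name + ".pkl")
    with open(path, "wb") as f:
        pickle.dump(classes, f)
    return path


def load_module(directory, script_name):
    """Load the pickled output of one script."""
    with open(os.path.join(directory, script_name + ".pkl"), "rb") as f:
        return pickle.load(f)


def find_class(directory, script_names, class_name):
    """Step 2: look a class up in the pickles of the referenced scripts."""
    for script_name in script_names:
        classes = load_module(directory, script_name)
        if class_name in classes:
            return classes[class_name]
    raise KeyError(class_name)


def demo():
    with tempfile.TemporaryDirectory() as directory:
        # The two scripts run separately and know nothing about each other.
        dump_module(directory, "window",
                    {"wxWindow": {"bases": ["wxEvtHandler"], "virtuals": ["Destroy"]}})
        dump_module(directory, "frame",
                    {"wxFrame": {"bases": ["wxWindow"], "virtuals": []}})
        # Generating wxFrame later can still see its base class.
        frame = load_module(directory, "frame")["wxFrame"]
        base = find_class(directory, ["window"], frame["bases"][0])
        return base["virtuals"]
```

Calling `demo()` returns the virtual methods recorded for wxWindow, even though the "frame" script never saw that class while it ran.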
The next couple of weeks are going to be spent working on the generator script, I think. The only real unsolved problem there, I think, is figuring out how to correctly order the definition of classes in the Python code for inheritance and default parameters, among other things. I'll probably write about that in more detail next week when I know what I'll do about it.
Friday, July 12, 2013
Makeup of wxPython
This week I've started working on the generator script and I decided that for a bit of fun and to get a better idea of what the library looks like, I'd collect some statistics about the contents of the library:
Total Classes: 596
Total Methods: 8508
Virtual Methods: 1233
Virtual Methods (not Dtor): 1138
Overloaded Methods: 705
Overloaded Methods (not Ctor): 328
Average Number of overloads: 2.3914893617
Average Number of overloads (not Ctor): 2.46951219512
Global Variables: 590
Defines: 2544
A few notes about the numbers:
This data is based on the output of the tweaker scripts, which get their data from wxWidgets documentation. As a result, undocumented classes and methods are not included. I've also intentionally excluded everything that the tweaker scripts mark as 'ignored'. I have not included pure Python classes or methods.
"Methods" in this context includes C++ methods that are added by the tweaker scripts and are not present in wxWidgets (for example wx.Menu.FindItemById.)
Virtual methods are somewhat underreported here. The tweaker scripts turn off the virtual flag on a large number of methods that aren't likely to be overridden from Python. This helps keep the size of the bindings down, since extra code needs to be generated for virtual methods.
This basically confirms what I had originally suspected about overloaded methods: the majority of them are constructors. I did expect the average number of overloads to be at least three, though I'm not sure why. The defines count is on the order of magnitude I expected, but also a little high. The number of defines is so high because they include the definitions for the default window ids, the default events, and window style options, among other things.
I'll probably be working on the generator script for the next couple of weeks. I'll probably write something about that next week, I suppose.
Friday, July 5, 2013
Event Handling in wxPython
For the last several days I've been working on getting event handling working, so I figured I could tell you a bit about how event handling is implemented in wxPython.
First of all, some background about wxWidgets for those unfamiliar with it. (You can skip this section if you are at all familiar with wxWidgets or wxPython.)
Event handling in wxWidgets is more or less the same as in other GUI toolkits. Events can be generated by the library, for example when the user clicks on a button or a timer fires, or can be created by the programmer. The actual handling of the events occurs in wxEvtHandler, from which most of the widgets in wx are derived. When a wxEvtHandler instance is created, a parent can be specified. This will allow the parent to handle any events generated by the child (or its children) that it doesn't handle itself. These events are manifested as C++ objects deriving from wxEvent and may hold details about the event that has occurred.
The programmer can specify an action for a wxEvtHandler to take when it encounters an event of a particular type (and optionally from a particular widget.) This is described as connecting the handler to the event. The action for the handler to take is specified in the form of a C++ callback. Originally, the way to connect a handler to events was using compile time "event tables," which are limited to calling methods on the wxEvtHandler object they are defined on. In modern versions of wxWidgets, it is possible to dynamically connect and disconnect a wxEvtHandler to an event at runtime. Additionally, arbitrary methods as well as functions and functors may be used as callbacks for these dynamic connections.
For a better/more detailed explanation see the wxWidgets documentation (or for the C++ averse the wxPython version)
For events to be useful to Python programmers, wxPython must these two things possible: using arbitrary Python callables to handle events and creating new events.
The first feature is actually relatively straight forward for the existing wxPython bindings. A C++ function is used as the callback that is passed to the library, which calls the Python callable. To get the pointer to the Python object to the C++ function, a C++ object that wraps the pointer is given as user data for the event. wxWidgets makes this user data object available to the C++ callback that handles the event and takes ownership of the object, deleting it if/when the event is disconnected.
The second feature is a little more complex. We want events created in Python to retain their Python attributes when they reach their callbacks. (You can see an example of this here) This sounds easy, but wxWidgets internally makes copies of the event objects. When the C++ event object is passed to the callback, it maybe at a different address than the original object, thus making it difficult to find the correct Python object to pass to the Python callable.
wxPython's solution to this is to have a PyEvent object that is Python aware and able to carry it's attributes through C++ unmolested. It does this by storing a pointer to a Python dict object inside the C++ PyEvent object. When the event object gets cloned, the pointer gets cloned as well (and its refcount gets incremented.) The Python PyEvent object defines __{set,get,del}attr__ methods that redirect to the aforementioned dict object. [1] This way, even if the Python PyEvent objects are different and/or wrap different C++ objects, they will have the same attributes as far as the user is concerned.
For the first feature, my solution is pretty much the same with one small twist: PyPy doesn't have a C API that allows a Python object to be called from C++ like CPython does. To cope with this, there are a couple of small changes that need to made. First of all, instead of passing a pointer to the Python callable, we pass an handle created with ffi.new_handle()[2]. Second, we call a constant Python callback from the C++ callback that is connected to the event. In this Python function we lookup the Python callable with ffi.from_handle(), create the event object, and then finally call the Python object.
This is an indication of what has been and I suspect will continue to be a theme in the project: replacing C++ code with equivalent Python code
The second feature is a bit more difficult and I don't quite have a perfect solution right now. What I have in place right now starts out similar to wxPython: we hold a pointer inside the C++ PyEvent object. Where its different is the pointer is in fact a wxSharedPointer and it points to a wrapper around a handle to a Python PyEvent object. The idea here is that the wrapper has a destructor that will call a Python callback to release the reference to the handle so it can be garbage collected. The wxSharedPointer is a refcounted autopointer and makes sure that the wrapper's dtor is only called when every PyEvent object pointing to it has been deleted. The problem with this plan is the handle keeps the Python PyEvent object it represents alive, which in turn keeps the last C++ PyEvent object alive. My thought process was that I wanted to be able to pass the original PyEvent object to the callbacks, bypassing the need to use a separate dictionary and the attribute access methods.
I'm still working out exactly how I'll handle this, but once I have it worked out, the only part of event handling left to work is releasing handles to Python callables once they've been disconnected. I'll probably write another post once I know how that will work.
1. Its useful to note accessing the dictionary like this is possible because PyEvent, like the vast majority of wxPython types, is implemented in C and so can directly access the data members of the C++ objects it wraps.
2. ffi.new_handle() is new in CFFI 0.7, which has yet to be released at the time of writing. See near the bottom of Misc methods on ffi in the CFFI documentation for more information.
wxWidgets Background
First of all, some background about wxWidgets for those unfamiliar. (You can skip this section if you are at all familiar with wxWidgets or wxPython.)
Event handling in wxWidgets is more or less the same as in other GUI toolkits. Events can be generated by the library, for example when the user clicks on a button or a timer fires, or can be created by the programmer. The actual handling of the events occurs in wxEvtHandler, from which most of the widgets in wx derive. When a wxEvtHandler instance is created, a parent can be specified. This allows the parent to handle any events generated by the child (or its children) that the child doesn't handle itself. These events are manifested as C++ objects deriving from wxEvent and may hold details about the event that has occurred.
The programmer can specify an action for a wxEvtHandler to take when it encounters an event of a particular type (and optionally from a particular widget). This is described as connecting the handler to the event. The action for the handler to take is specified in the form of a C++ callback. Originally, the way to connect a handler to events was using compile-time "event tables," which are limited to calling methods on the wxEvtHandler object they are defined on. In modern versions of wxWidgets, it is possible to dynamically connect and disconnect a wxEvtHandler to an event at runtime. Additionally, arbitrary methods as well as functions and functors may be used as callbacks for these dynamic connections.
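To make the connect-and-propagate idea concrete, here is a rough pure-Python model of it. This is only a sketch: the Event and EvtHandler classes and their method names are made up for illustration, and the real wxEvtHandler is considerably more involved (for one thing, not every kind of event propagates to the parent).

```python
class Event:
    """Toy stand-in for wxEvent: carries a type and the id of its source."""

    def __init__(self, event_type, source_id):
        self.event_type = event_type
        self.source_id = source_id


class EvtHandler:
    """Toy stand-in for wxEvtHandler with runtime connect/disconnect."""

    def __init__(self, parent=None):
        self.parent = parent
        self._table = []  # entries are (event_type, source_id or None, callback)

    def connect(self, event_type, callback, source_id=None):
        self._table.append((event_type, source_id, callback))

    def disconnect(self, event_type, callback, source_id=None):
        self._table.remove((event_type, source_id, callback))

    def process_event(self, event):
        for event_type, source_id, callback in self._table:
            # A source_id of None means "from any widget".
            if event_type == event.event_type and source_id in (None, event.source_id):
                callback(event)
                return True
        # Not handled here, so give the parent a chance.
        if self.parent is not None:
            return self.parent.process_event(event)
        return False


log = []
frame = EvtHandler()
button = EvtHandler(parent=frame)


def on_click(event):
    log.append(("frame saw click from", event.source_id))


frame.connect("click", on_click, source_id=42)
handled = button.process_event(Event("click", source_id=42))

frame.disconnect("click", on_click, source_id=42)
handled_after = button.process_event(Event("click", source_id=42))
```

The button has no handler of its own, so the first event travels up to the frame. After the disconnect nobody handles it.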
For a better and more detailed explanation, see the wxWidgets documentation (or, for the C++ averse, the wxPython version).
In wxPython
For events to be useful to Python programmers, wxPython must make these two things possible: using arbitrary Python callables to handle events and creating new events.
The first feature is actually relatively straightforward for the existing wxPython bindings. A C++ function is used as the callback that is passed to the library, and that function calls the Python callable. To get the pointer to the Python object to the C++ function, a C++ object that wraps the pointer is given as user data for the event. wxWidgets makes this user data object available to the C++ callback that handles the event and takes ownership of the object, deleting it if/when the event is disconnected.
The second feature is a little more complex. We want events created in Python to retain their Python attributes when they reach their callbacks. (You can see an example of this here.) This sounds easy, but wxWidgets internally makes copies of the event objects. When the C++ event object is passed to the callback, it may be at a different address than the original object, thus making it difficult to find the correct Python object to pass to the Python callable.
wxPython's solution to this is to have a PyEvent object that is Python aware and able to carry its attributes through C++ unmolested. It does this by storing a pointer to a Python dict object inside the C++ PyEvent object. When the event object gets cloned, the pointer gets cloned as well (and the dict's refcount gets incremented). The Python PyEvent object defines __{set,get,del}attr__ methods that redirect to the aforementioned dict object. [1] This way, even if the Python PyEvent objects are different and/or wrap different C++ objects, they will have the same attributes as far as the user is concerned.
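The shared-dict trick can be modeled in a few lines of plain Python. This is only a sketch of the idea, not wxPython's implementation (which lives in C++): the PyEventModel class and its clone() method are invented names, and clone() stands in for wxWidgets copying the C++ event.

```python
class PyEventModel:
    """Toy model of PyEvent: user attributes live in a dict shared by all clones."""

    def __init__(self, shared_attrs=None):
        # Store the dict with object.__setattr__ to bypass our own __setattr__.
        object.__setattr__(self, "_attrs", {} if shared_attrs is None else shared_attrs)

    def clone(self):
        # Mirrors the C++ copy: the reference to the dict is copied, not the dict.
        return PyEventModel(self._attrs)

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for user attributes.
        try:
            return self._attrs[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._attrs[name] = value

    def __delattr__(self, name):
        try:
            del self._attrs[name]
        except KeyError:
            raise AttributeError(name)


original = PyEventModel()
original.payload = "hello"

copy = original.clone()   # a different object, as after wxWidgets copies the event
copy.extra = 1            # visible through the original as well
```

Even though original and copy are distinct objects, an attribute set on either one can be read from the other, which is all the user's callback cares about.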
Differences for wxPython-cffi
For the first feature, my solution is pretty much the same with one small twist: PyPy doesn't have a C API that allows a Python object to be called from C++ like CPython does. To cope with this, there are a couple of small changes that need to be made. First of all, instead of passing a pointer to the Python callable, we pass a handle created with ffi.new_handle()[2]. Second, we call a constant Python callback from the C++ callback that is connected to the event. In this Python function we look up the Python callable with ffi.from_handle(), create the event object, and then finally call the Python object.
This is an indication of what has been, and I suspect will continue to be, a theme in the project: replacing C++ code with equivalent Python code.
The second feature is a bit more difficult, and I don't quite have a perfect solution right now. What I have in place starts out similar to wxPython: we hold a pointer inside the C++ PyEvent object. Where it differs is that the pointer is in fact a wxSharedPtr, and it points to a wrapper around a handle to a Python PyEvent object. The idea is that the wrapper has a destructor that calls a Python callback to release the reference to the handle so that the Python object can be garbage collected. wxSharedPtr is a reference-counted smart pointer, so the wrapper's destructor is only called once every C++ PyEvent object pointing to it has been deleted. The problem with this plan is that the handle keeps the Python PyEvent object it represents alive, which in turn keeps the last C++ PyEvent object alive, so the reference count never reaches zero and nothing is ever released. My thought process was that I wanted to be able to pass the original PyEvent object to the callbacks, bypassing the need for a separate dictionary and the attribute access methods.
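The refcounting part of that plan can be modeled in Python like this. It is a sketch only: SharedHandle stands in for the shared pointer plus wrapper, CppEventModel for the C++ PyEvent, and all the names are invented. The destroy() calls are made by hand here, and that is exactly the catch: in the real design the last C++ object is only destroyed when the Python object is collected, and the Python object is only collected after the release callback has run.

```python
class SharedHandle:
    """Toy refcounted pointer: runs `release` when the last owner lets go."""

    def __init__(self, target, release):
        self.target = target
        self._release = release
        self._count = 0

    def acquire(self):
        self._count += 1
        return self

    def drop(self):
        self._count -= 1
        if self._count == 0:
            # Stands in for the wrapper's destructor calling back into Python.
            self._release(self.target)


class CppEventModel:
    """Toy model of the C++ PyEvent: every copy owns one reference."""

    def __init__(self, shared):
        self.shared = shared.acquire()

    def clone(self):
        return CppEventModel(self.shared)

    def destroy(self):
        self.shared.drop()


released = []
py_event = object()  # stand-in for the Python PyEvent object
shared = SharedHandle(py_event, released.append)

first = CppEventModel(shared)
second = first.clone()       # wxWidgets copying the event internally

second.destroy()
after_copy_destroyed = list(released)  # still empty: `first` holds a reference

first.destroy()              # only now does the release callback run
```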
I'm still working out exactly how I'll handle this, but once I have it worked out, the only part of event handling left to work out is releasing handles to Python callables once they've been disconnected. I'll probably write another post once I know how that will work.
1. It's useful to note that accessing the dictionary like this is possible because PyEvent, like the vast majority of wxPython types, is implemented in C and so can directly access the data members of the C++ objects it wraps.
2. ffi.new_handle() is new in CFFI 0.7, which has yet to be released at the time of writing. See near the bottom of Misc methods on ffi in the CFFI documentation for more information.
Wednesday, June 19, 2013
Coding Begins...
Although coding officially begins this week, I've already been doing some coding over the last couple of weeks. I've been writing mockups of the generated bindings and working on the library code. The library code is what will provide the bindings with functionality such as mapping C++ pointers to Python objects and overloading methods.
Here's a screenshot of the first demo running on PyPy:
The agenda for the next week is to get a second demo working that makes use of Python functions as event handlers. Once I have that demo working, I'll start on writing the generator itself.
Thursday, June 6, 2013
Introduction
So, this introduction is a little belated, but better late than never, right?
About Me
My name is Tyler Wade. I'm a computer science major at Washburn University in Topeka, Kansas. I will be a senior when classes start again in August and, barring any extraordinary circumstances, will graduate next Spring. As far as I know, I'm the first student from my university to be chosen for GSoC, which is kind of cool.
About the project
Right now there isn't much support among GUI toolkits for PyPy. Using cpyext, Tkinter and (classic) wxPython work, but because of the complexity involved in emulating CPython's extension API, they are buggy and the performance isn't what it could be. So to improve this situation, this summer I will be extending wxPython Phoenix to create a set of bindings that use CFFI.
wxPython Phoenix
wxPython Phoenix is a work-in-progress rewrite of wxPython that seeks to clean up the API and, more importantly to this project, improve maintainability. It achieves this latter goal by partially automating the creation of the bindings via a set of 'etg' scripts. The first set of scripts, called 'extractors,' collect API data from the Doxygen XML files generated from wxWidgets' header files. This data is then processed through a set of 'tweaker' scripts to make the API more pythonic. Finally, the API information is fed to a 'generator' script that creates the actual bindings. Currently, the only generator script in place creates sip files.
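As a toy illustration of that extractor, tweaker, generator flow (and nothing more than that), the three stages can be thought of as functions passed over the same API data. Everything here is made up for the example: the Widget class, its methods, the dict layout, and the output format have nothing to do with Phoenix's real data structures or with sip syntax.

```python
def extract():
    """Stand-in for an extractor: API data as it might come out of Doxygen XML."""
    return [
        {"cls": "Widget", "name": "Show", "virtual": True},
        {"cls": "Widget", "name": "DoLayout", "virtual": True},
    ]


def tweak(methods):
    """Stand-in for a tweaker: adjust the extracted data before generation."""
    for method in methods:
        if method["name"] == "DoLayout":
            # Pretend this one is unlikely to be overridden from Python.
            method["virtual"] = False
    # Tweakers can also add methods that the C++ library doesn't have.
    methods.append({"cls": "Widget", "name": "FindChildById", "virtual": False})
    return methods


def generate(methods):
    """Stand-in for a generator: turn the final API data into output text."""
    lines = []
    for method in methods:
        prefix = "virtual " if method["virtual"] else ""
        lines.append(f"{prefix}{method['cls']}.{method['name']}")
    return "\n".join(lines)


output = generate(tweak(extract()))
```

The point of the split is that only the last stage knows about the output format, which is why a different generator can be swapped in without touching the extractors.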
CFFI
CFFI is a library for exposing C code to Python. It's similar to ctypes in terms of functionality, but is higher level and has a much nicer API. PyPy has specific optimizations for CFFI and recommends it as the preferred way to call native code from Python.
What I will be doing
I will be creating a generator script that will create the CFFI bindings. Unfortunately, it's not as simple as that sounds: more than just Python code needs to be generated. First of all, CFFI is not meant to wrap C++ code directly, so the generator will have to create a C-callable wrapper around the C++ API. Second, some of the functionality that is provided by sip, for example deriving Python types from C++ types, will have to be replicated. Lastly, the tweaker scripts contain a bunch of sip and CPython specific code that will have to be modified too.
You can follow my coding progress throughout the summer at https://bitbucket.org/waedt/wxpython_cffi, though there's nothing there just yet.