Time Invariant Virtual Member Function Dispatching For C++ Evolvable Classes Roger Voss Aldus Corporation 411 First Avenue South Seattle, WA 98104-2871 USA Roger.Voss@Aldus.com CompuServe: 73650,616 Abstract Motivated to achieve the technical requirements of object-oriented component- based application architecture, and hoping to migrate substantial existing C++ source code into such future applications, Aldus Corporation has investigated the obstacles that prevent the C++ language from being applied successfully to this programming domain. At stake is this: can the C++ language be enhanced to support the next generation object-oriented application architecture î or will it prove necessary to abandon C++ and completely rewrite new applications in one of the more dynamic object-oriented languages, such as Objective-C, SmallTalk, Eiffel, etc? This paper attempts to address one major C++ obstacle: the need for class method protocol evolvability. 1. Introduction An evolvable C++ base class is one that can be exported from a shared library and derived from or used in other components. Specifically, though, it must be possible to evolve the private implementation and public method protocol of such classes into new generations and then reissue recompiled versions of the shared libraries containing such classes into the field. Prior existing components that are still in use and that are dependent upon any evolved base classes should not be adversely impacted by new shared library releases. Contemporary implementations of C++ are unsuitable to this purpose. It is possible, however, to invigorate the C++ language with a new underlying model of class method dispatch compatible to the requirement of base class evolvability. The scheme presented here for a new C++ method dispatching approach has sufficient flexibility yet still retains traditional C++ efficiency and relatively strong type security. The description to be presented herein of a virtual member function dispatching technique for C++ evolvable classes, is a revisionary supplement to a previous paper titled: êC++ Evolvable Base Classes Residing In Dynamic Linking Librariesë [Voss93]. The monograph here supersedes the portion of that paper which described a dispatching technique based on a cached, linear search look up of method selectors. That approach is described in sections of that paper titled: êMethod Look up Tablesë; êMethod Selector Cachingë; êMethod Invocationë; and êMethod Name Manglingë; those sections are supplanted by the system to be described herein. The term êmethodë will henceforth be used in lieu of the phrase êvirtual member functionë. êMethodsë will be employed as being synonymous as class protocol actions; and êvirtual member functionë will be used so as to mean a class-specific method implementation. The new method dispatching approach to be presented no longer employs cached, linear search look up, but instead uses a pair of index variables of static storage î which when used together directly specify an appropriate entry in a class vtbl (êvirtual tableë). One of these, the method selector index value, is determined at program execution time (as opposed to compile time as with conventional C++ classes). The timing and manner in which this is done can be described as lazy resolution. The index value is calculated when the class method is invoked from some particular location in program code for the first time. All such method invocations for evolvable classes thus incrementally resolve themselves as the application proceeds in execution. Once a given method selector index is resolved it remains so for the rest of the application run time session. Method dispatch time after the first invocation is invariant (except with regard to virtual memory paging delays). Because method indices are resolved in a lazy manner, application startup time is not impeded by having to resolve all method invocation occurrences that exist in the application at one time (or all those that exist in the paged-in application êworking setë). For very large applications this could be an important consideration. It is also worth noting that method invocations that remain unexecuted throughout an application run time session are never resolved at all. Thus a user does not incur any delay penalty for having to do method resolution on unused portions of an application. With lazy resolution you pay for it if and when you use it. Lastly, the instruction code for the method dispatch process does not occur in-line where the method is invoked but instead is centralized in a dispatch function. This permits after-the-fact customization of the method dispatch service so as to better tune an application to imposed memory consumption/execution speed constraints. The fastest method dispatch technique will consume the greatest amount of memory. A less memory consumptive dispatch approach will likely sacrifice invariant method dispatching for some form of dynamic method caching. This choice could be made by a programmer by simply specifying different compiler run time libraries to link to. 2. Unique Class-Method Identification For the purpose of enabling class evolvability, a method will be uniquely distinguished from all other currently existing class methods, and those to yet exist in future base class generations, by a string name representation. (With conventional C++ classes a method is identified by only a compile time constant offset into a class vtbl.) This method string name representation will be very nearly the same as the mangled name generated by a C++ compiler for a virtual member function. (The mangled name is the basis of insuring type safe linkage of C++ programs at static link time.) The name mangling procedure used by the compiler to generate a method name string is similar, though not exactly like what is used to generate the mangled name of a virtual member function. There are two differences: The class name embedded into a mangled method name is always the name of the class of the methodés origin î the class that first introduced the method declaration, also known as the introducer class. Additionally, the string prefix "@meth@" is prepended to the name. A method name will not be able to possibly clash with the mangled names of virtual member functions (which comprise the various class-specific implementations of a method). This distinction will be important for source-level debuggers. At any time that a compiler must generate a reference that reliably identifies a class method, it will use the method string name î as opposed to, say, a vtbl offset constant or a function pointer. When compiling a given source translation unit and encountering a situation where a class method is referenced, the compiler will emit a pointer to a string constant. The string constant is of course the method name in mangled form. For the first occurrence of use, the compiler emits a forward declaration for the method name string constant. When compilation has reached the end of the source translation unit then the compiler will emit a declaration for the string constant that is of static storage. It is an anonymous declaration and does not have any external name visibility. A method name string constant declaration will thus occur no more than once per any source translation unit. 3. Method Invocation There are three static variables needed to support the invocation of an evolvable class method. These three variables can be encapsulated as fields in a structure: struct __MethodSelector { unsigned shortmethodIndex; unsigned shortclassIndex; char * methodName; }; The first field, methodIndex, is used for indexing into a vtbl class subsection during a method dispatch. The second field, classIndex, is used to select the appropriate class subsection indirection slot î which are contained internally in the classés vtbl. The third field, methodName, is used to specify the mangled method name for when the method dispatch is being resolved on the first method invocation occurrence. For a typical 32-bit system this data structure will consume 8 bytes of storage. On CPUs where the preferred word alignment is 4 bytes, care must be taken to ensure that __MethodSelector is a packed data structure or else it might needlessly consume an additional 4 bytes. Of course, in using unsigned short ints, a limit is being imposed here that there can be no more that 65536 classes inclusive in an evolvable classés inheritance hierarchy nor more that a total of 65535 methods introduced by a class. (The zero method index entry is always reserved for use by the method dispatching system.) For each different method invocation that is encountered in a source translation unit, the compiler synthesizes a forward declaration to a variable of type __MethodSelector. Here is an example forward declaration on behalf of the first encountered method invocation: extern __MethodSelector __methSel_1; When compilation reaches the end of a source translation unit, then an anonymous static variable of type __MethodSelector is synthesized for each different prior synthesized forward declaration, like so: static __MethodSelector __methSel_1 = { 0, 2, __methNameStr_1 }; Thus for each different method that is invoked in a source translation unit there is a corresponding anonymous static variable of type __MethodSelector. This data structure will be referred to as a vcall descriptor. Its methodIndex field is initialized to zero and its methodName field is initialized with a pointer to the appropriate mangled method name string constant. The classIndex field is initialized with a compile time determined constant. In Listing 1 is an entire example sequence of synthesized declarations generated for handling method invocation occurrences in some hypothetical source translation unit. 4. vtbl Layout and Class IDs 4-1. Multiple inheritance complexity The complicated aspect of the C++ virtual function dispatch î whether for conventional classes or evolvable classes î is the issue of accommodating multiple // a synthesized declaration introduced prior to // the beginning of each source translation unit struct __MethodSelector { unsigned shortmethodIndex; unsigned shortclassIndex; char * methodName; }; /*** beginning of actual .cpp source translation unit ***/ Menu::Menu(VString* itemName,MenuID id,BOOL largeMenu,BOOL isPopUp) : MenuItem(itemName,MenuItemID(id)) { // synthesized forward declarations for method invocations // Collection::Find() extern __MethodSelector __methSel_1; // Dictionary::AddKeyValuePair() extern __MethodSelector __methSel_2; fHMenu = HMENU(0); fLongMenu = largeMenu; fIsPopUp = isPopUp; fMenuItemID = (id == kNoResourceMenuID) ? ++gNextCreatedMenuID : id; fMenuItems = (LList*)NULL; if (gKnownMenus == NULL) gKnownMenus = new Dictionary; if (gKnownMenus != NULL) { DWORDCItem* theKey = new DWORDCItem(fMenuItemID); if (!gKnownMenus->Find(theKey)) gKnownMenus->AddKeyValuePair(theKey,this); } } /*** ending of actual source translation unit ***/ // synthesized anonymous static variables extern char __methNameStr_1[]; extern char __methNameStr_2[]; static __MethodSelector __methSel_1 = { 0, 3, __methNameStr_1 }; static __MethodSelector __methSel_2 = { 0, 5, __methNameStr_2 }; static char __methNameStr_1[] = "@meth@Collection@Find$qn5CItem"; static char __methNameStr_2[] = "@meth@Dictionary@AddKeyValuePair$qn5CItemt1"; Listing 1 inheritance. In the case of a multiple inheritance class hierarchy an object map may typically have more than one embedded vptr (êvirtual table pointerë) that points to some sub portion of the overall class vtbl. The complexity of this problem is exacerbated for evolvable classes on account the exact location of a methodés virtual function pointer in a vtbl is not established until program execution time. Due to differing derivation paths through multiple inheritance hierarchy graphs, it is necessary for a compiler to sometimes simply reintroduce a method into a derived class î instead of continue to associate the actual introducer base class with the method. Such a reintoduced method is treated as though it is introduced for the first time in the derived class and in effect creates a new method protocol in the derived class î which is semantically the same as the one in the original introducer base class. This will be done to establish different coherent vptr views into the vtbl (where each view corresponds to a different object map derivation hierarchy subsection). These different vptr views of the vtbl exist so as to enable potential upward casting of a class pointer to one of its base classes. Upward casting a multiple inheritance class pointer to a base class may cause it to then use a different relative location for a method protocol in the vtbl than some other possible cast. There is also the more straightforward and more typical case, though, where some single vptr view into the vtbl will be used mutually for several potential upward casts. In this case an inherited method protocol must be assumed to have the same relative location in the vtbl for all relevant potential upward cast. This must hold true, too, for where the base classes are used standalone or are derived from in other inheritance hierarchies. In all these possible scenarios a method protocol must make use of the same method index value (which translates to êsame relative placement in the vtblë). Why? Because the vcall descriptor associated with a method invocation can potentially be used in conjunction to all manner of different vptr/vtbl/derivation-hierarchy combinations. The possibility of a class appearing in different derivation hierarchies and the potentiality of upward casting make that quite clear. An abstraction will be introduced for the case of evolvable classes that is intended to help grapple with the complexities of multiple inheritance. For evolvable classes, one might view the vtbl organization as dynamic information î as its complexion is established at program execution time. However, it is also possible to cleave an evolvable class vtbl into two conceptual parts: one part would be compile-time static known information; the other part would be the remaining dynamic run time information. The ultimate intent in doing this is that it will then be possible to deal with multiple inheritance for evolvable classes using the same solution approaches as is done with conventional classes. 4-2. Two part vtbl Taking this line of thought, we proceed to split the vtbl into two distinct conceptual halves. To describe the first half, imagine taking the vtbl of a conventional C++ class, which has several base classes in its inheritance graph and even has multiple inheritance present (encompassing, say, the vexatious but inevitable diamond inheritance graph case), and squeezing each class down to where it only introduces one method. This is a vtbl layout where every class present in the inheritance graph just introduces a single method. We now have a simple monolithic array of function pointers where each element in this array corresponds one-to-one to an individual class (or node) in the inheritance graph. Now instead of containing pointers to virtual member functions, this monolithic array will instead contain pointers to vtbl class subsections. A class subsection is where virtual member function pointers will actually be located. A class subsection thus represents the protocol of methods introduced by its owning class. (The concatenation of all the class subsections associated with some inheritance graph into a contiguous block of memory comprises the second conceptual half of the vtbl î more on it later.) We now have a vtbl arrangement as portrayed in the schematic of Figure 1. The simple monolithic array that makes up the first conceptual half of the vtbl can be determined at compile-time and is thus statically known information. In fact it could be given static storage and perhaps simply tacked onto an evolvable classés metaclass object î which also is of static storage. The complication of evolvable classes is that the exact size of class subsections cannot be known at compile time. The number of entries in a class subsection can only be determined at run time by a query to the classés metaclass object. This fact does not adversely impact the monolithic array, or static half of the vtbl, that was just described. Regardless of how large individual class subsections end up actually being at run time, the first conceptual half of the vtbl will remain static in regard to its assumed complexion (or layout). Well, perhaps that is a quasi truth. As evolvable base classes are derived from and new classes declared, the vtbls of these new classes will have their inherited static half accrue a new entry. This will come about through the process of recompiling shareable libraries that export these base classes and then reissuing the libraries (where they now additionally export new derived classes). Or through creating new shareable libraries or application modules that host the new derived classes. The net result is that it is fair to say that the static portion of a vtbl only evolves by being appended to î a situation no more perplexing than conventional single inheritance class hierarchies. The vptr(s) that are placed in an object map during construction, and that point into a vtbl, will point only into the first half, or static portion, of the vtbl. A vptr will never point directly into the second half of the vtbl (the dynamic portion). An evolvable class vptr thus points to an array of class subsection pointers. In the Figure 1 case of where a multiple inheritance object map possesses several vptrs, each vptr will point to a windowed view of the overall monolithic array. Each vptr enjoys a view into the monolithic array that accommodates the upward cast (or subset hierarchy of base classes) that a given vptr represents. 4-3. Class IDs To dereference some virtual member function in order to complete a method invocation, it is first necessary to dereference the appropriate class subsection pointer. The class ID index value facilitates this. The class ID index is the value of the classIndex field contained in the vcall descriptor (recall that a unique vcall desciptor is associated with each and every method invocation). The method index contained in the vcall descriptor can then be used to dereference into the class subsection so that the appropriate virtual function pointer is selected. A method index value is class subsection relative. This property is why a given vcall descriptor can be used in combination to vptrs of differering origins. Class IDs are thus the statically known indices of class subsection pointers, which are in turn elements of the monolithic array comprising the vtbl static half. The process of establishing class IDs occurs incrementally during compilation as declarations of classes comprising an inheritance graph are encountered. Therefore, when a class method call is encountered during compilation, the relevant class ID will be known by the compiler and the associated classIndex field can be initialized with the appropriate compile time computed constant. (Keep in mind that the method index is initialized to zero at compile time and will be determined at program execution time.) It is possible that a class that appears in different derivation hierarchies will, as a consequence, be known by different class ID values. That is okay, though, on account vptr windowing will set up relative views into vtbls that amend for this. A given vptr value will set the base position within the overall monolithic array that a class ID index value will be applied against. Object maps containing multiple vptrs are simply some of the joys of C++ multiple inheritance. Class ID indirection and vptr windowing combine to make the situation tractable for evolvable classes. 4-4. The vtbl dynamic half Much has by now been said of the static half of the vtbl. Discussion now turns to the dynamic half of the vtbl. The dynamic half of the vtbl is established at run time by the metaclass object of the owning class. The metaclass object is of static storage and has a static constructor. The metaclass constructor will be responsible for determining the size of the dynamic part of the vtbl, allocating a block of memory for it, and performing necessary initialization. The dynamic half of the vtbl is the concatenation of all class subsections into a single block. The metaclass constructor determines the size of this block by querying each base class metaclass object for the number of methods its class introduces. The count of methods per subsection is incremented by one to allow for the zero index position of the subsection, which is reserved (recall that vcall descriptor method index values start out as zero and will hit this slot position when first executed). The sum of all class subsection counts multiplied by the size of a code pointer yields the required size in bytes of the overall aggregate block. The metaclass constructor, after allocating this block from the free store, will initialize each pointer slot with a pointer to a special method dispatch binding routine. The metaclass constructor will also need to initialize the static half of the vtbl by placing a pointer to the appropriate class subsection into each entry in the monolithic array. 4-5. Retrieving the metaclass pointer One last detail about the vtbl is that the monolithic array comprising the static half actually has a dummy entry as its first entry. This entry does not contain a pointer to any class subsection but instead holds a pointer to the classés metaclass object. Any vptrs into the vtbl (such as the êbase vptrë) will be adjusted to point past this dummy entry. Refer again to the vtbl schematic in Figure 1. It will be possible to retrieve the metaclass pointer by subtracting the size of a data pointer from the base vptr address and dereferencing the resulting pointer-to-a-metaclass-pointer address. 4-6. Applying a delta to this In section 10.8c of The Annotated C++ Reference Manual [Ellis90] titled, êMultiple Inheritance and Virtual Functionsë, there is a discussion of two techniques for applying a delta against the this pointer for when a virtual function î which was introduced in some multiple inherited base class î is being invoked. (The this pointer must sometimes be adjusted to point into the object map at a subsection that will resemble an object map layout that the base class expects.) The first technique of adjusting the this pointer suggest implanting a delta value next to the relevant virtual function pointer(s) within the vtbl: struct __vtbl_entry { void (*fct)(); int delta; }; However, the method dispatching proposal being described here will favor the second recommendation for applying delta to this. In this second approach, a pointer to a thunk is placed into a normal vtbl entry (in the context of this proposal it would be a class subsection entry). The thunk is a small piece of code that applies the necessary delta adjustment on the this pointer and then proceeds to invoke the associated virtual function. Using this technique the vtbl entries are always exactly the same under all conditions î they are just simple pointers to code: struct __vtbl_entry { void (*fct)(); }; This technique is favored on account keeping the vtbl class subsection entries as just simple pointers to code under all circumstances greatly simplifies the overall implementation. This is much more true for this evolvable class method dispatching proposal than is the case with just conventional C++ classes. 5. Method Call Code Generation and Conventions (Refer to Appendix A for an alternative approach to method dispatch code generation that is conducive to CFront C++ implementations.) For an Intel 80386 CPU, the compiler would generate code for evolvable method invocation as in the examples below for the Collection::Find() and the Dictionary::AddKeyValuePair() method call occurrences taken from the source code example in Listing 1: ; ; if (!gKnownMenus->Find(theKey)) ; 001F FF 75 FC PUSH DWORD PTR [EBP-4] ; theKey argument 0022 8B 1D 00000000r MOV EBX,DWORD PTR FLAT:@Menu@gKnownMenus 0028 53 PUSH EBX ; this object pointer 0029 8B 53 04 MOV EDX,DWORD PTR [EBX+4] ; vptr (windowed) 002C 8B 4B 04 MOV ECX,DWORD PTR [EBX+4] ; base vptr (non-windowed) 002F B8 0000000Cr MOV EAX,offset FLAT:___methSel_1 ; ptr to __methSel_1 0034 E8 00000000e CALL NEAR PTR __DispatchMethod@ ; call to dispatcher 0039 83 C4 08 ADD ESP,8 003C 0F B7 C0 MOVZX EAX,AX 003F 0B C0 OR EAX,EAX 0041 75 1C JNE short @18 ; ; gKnownMenus->AddKeyValuePair(theKey,this); ; 0043 56 PUSH ESI ; this argument 0044 57 PUSH EDI ; theKey argument 0045 8B 1D 00000000r MOV EBX,DWORD PTR FLAT:@Menu@gKnownMenus 004B 53 PUSH EBX ; this object pointer 004C 8B 53 04 MOV EDX,DWORD PTR [EBX+4] ; vptr (windowed) 004F 8B 4B 04 MOV ECX,DWORD PTR [EBX+4] ; base vptr (non-windowed) 0052 B8 0000000Cr MOV EAX,offset FLAT:___methSel_2 ; ptr to __methSel_2 0057 E8 00000000e CALL NEAR PTR __DispatchMethod@ ; call to dispatcher 005C 83 C4 0C ADD ESP,12 005F @18: Listing 2 The arguments to a method are first pushed onto the stack; the object this pointer is pushed; the class vptr is moved into the EDX register; the base vptr is moved into the ECX register; a pointer to the associated vcall descriptor static variable is moved into the EAX register; and then a call is made to the __DispatchMethod@ compiler helper function. The __DispatchMethod@ compiler helper function is a generic dispatch service for evolvable class method calls. For the 80386 CPU, the register trio, EDX:ECX:EAX, is used to pass arguments to the method dispatch service. Passing these dispatch arguments in registers is preferable to pushing them onto the stack in this case. The various pointers that will be dereferenced during the dispatch process will already be laying in registers ready for use; access to stack memory is minimized; and most importantly: there is no need to readjust the stack in any manner before completing the method call dispatch. If these arguments were pushed onto the stack, they would have to be removed from the stack before jumping into a virtual member function body (which would not expect them to be on the stack). Doing such would mean having to temporarily pop off the return address of the function call and then pushing it back on. This all amounts to a lot of hand waving that can be easily dispensed with by simply using EDX:ECX:EAX registers to pass arguments to the dispatch routine. The in-line code generation size of an evolvable class method call is 11 to 13 bytes greater than would be the case for a non-virtual member function call for a conventional C++ class. The method dispatching routine, __DispatchMethod@, (implementation in Listing 3 below) represents another 17 bytes of code î its overhead only occurs once in a program module. Removing the dispatch code from in-line results in a net savings of 12 bytes per method call. For large applications with hundreds or thousands of method calls this could result in a significant reduction of overall application code size. Perhaps even more importantly, a non-in- line dispatch service permits the compiler implementer to supply several different kinds of method dispatch; it could even permit the compiler user to create and link to a customized method dispatcher. ; Parameters passed in registers on entry: ; ; EDX : vptr (windowed vptr) ; ECX : base vptr (non-windowed vptr) ; EAX : ptr to vcall descriptor ; ; EDX:ECX:EAX registers are preserved on function exit ; 0000 __DispatchMethod@ proc near 0000 51 PUSH ECX ; preserve ECX 0001 0F B7 48 02 MOVZX ECX,WORD PTR [EAX+2] ; move class index into ECX 0005 8B 1C 8A MOV EBX,DWORD PTR [EDX][ECX*4] ; get vtbl class subsection offset 0008 0F B7 08 MOVZX ECX,WORD PTR [EAX] ; move method index into ECX 000B 8B 1C 8B MOV EBX,DWORD PTR [EBX][ECX*4] ; place virtual function ptr in EBX 000E 59 POP ECX 000F FF E3 JMP EBX ; jump to the virtual function 0011 __DispatchMethod@ endp Listing 3 The flexibility of method dispatching being encapsulated in a function is nice because it permits application developers more choice as to the tradeoffs of memory consumption versus execution speed. For maximum speed, full size v- tables could be allocated î permitting direct indexing and thus time invariant method dispatching. (The C++ vtbl approach is sometimes referred to as êstatic cachingë on account every existing virtual member function has a function pointer residing in a vtbl entry ready to be invoked.) For very large programs it may be preferable to use some form of global method selector caching (where the number of entries in the global cache is something less than the total number of virtual member functions) coupled with method look-up on cache miss [Andr‚92]. A dynamic scheme of either global or local caching would reduce the memory requirement for cache tables. There is also the possibility of method dispatchers that offer various kinds of debugging assistance. Breakpointing of method calls, Eiffel-style preconditions and postconditions on objects [Meyer92], and dispatching-related diagnostic messages could be centralized in a special debugging version of __DispatchMethod@. By making the method dispatcher a non-in-line function, it is only necessary to link an executable module to a different version of the compilerés method dispatch run time service in order to employ a different dispatch strategy. 6. Lazy Resolution The most interesting novelty of the method dispatching approach described herein î other than compatibility to the goal of class protocol evolvability î is the manner in which it permits lazy resolution of a method invocation on its first occurrence. Lazy resolution is made possible by two conventions: 1) the first method slot entry (the zero offset entry in a vtbl class subsection) is reserved for a special function pointer and 2) a pointer to the __MethodSelector static structure variable, or vcall descriptor, is made available in a register (the EAX register on the Intel 80386 CPU) to the method dispatch service. 6-1. Method Dispatch Blow-By-Blow The methodIndex field of the vcall descriptor is initialized to zero at compile time. When the method call that it is associated with is first executed, the method dispatch service will dereference to and invoke the zero-entry function pointer located in the vtbl class subsection. By convention the first entry of a class subsection is reserved. A function pointer to a method binding routine called, __LookupMethod_glue@, is placed in this entry. Actually all class subsection slots are at first set to contain the __LookupMethod_glue@ function pointer. This is done so that a virgin vtbl can be used in conjunction to an already resolved vcall descriptor, i.e., a vcall descriptor that has a methodIndex field that has been resolved to a non zero value. Resetting a vtbl to have all slots contain the __LookupMethod_glue@ function pointer would cause all the virtual function pointers to be reestablished in the vtbl. Such a method code êhot updatingë feature is an exceptionally intriguing capability. The __LookupMethod_glue@ function gets invoked with the same contents in the EDX:ECX:EAX register trio as does the __DispatchMethod@ function (the __DispatchMethod@ function preserves the EDX:ECX:EAX register trio). The __LookupMethod_glue@ function is essentially an assembly language glue routine (refer to its implementation in Listing 4 below) that itself in turn calls the C++ implemented function, __LookupMethod(). The __LookupMethod_glue@ routine, by being implemented in assembly language, is able to access the arguments passed to it in the EDX:ECX:EAX register trio. ; Parameters passed in registers on entry: ; ; EDX : vptr (windowed vptr) ; ECX : base vptr (non-windowed vptr) ; EAX : ptr to vcall descriptor ; ; EDX:EAX registers are preserved on function exit ; 0011 __LookupMethod_glue@ proc near 0011 52 PUSH EDX ; save vptr 0012 50 PUSH EAX ; save vcall descriptor ptr ; push arguments to function call: 0013 83 E9 04 SUB ECX,4 ; (deref base vptr by -4 offset) 0016 FF 31 PUSH DWORD PTR [ECX] ; metaclass ptr 0018 50 PUSH EAX ; ptr to vcall descriptor 0019 52 PUSH EDX ; vptr 001A E8 00000000e CALL NEAR PTR ___LookupMethod 001F 83 C4 0C ADD ESP,12 ; cut back function call args 0022 58 POP EAX ; get vcall descriptor ptr 0023 5A POP EDX ; get vptr 0024 EB DA JMP __DispatchMethod@ ; complete the method call dispatch 0026 __LookupMethod_glue@ endp Listing 4 The __LookupMethod_glue@ function dereferences the base vptr in ECX in order to retrieve the metaclass object pointer î which is then pushed onto the stack. Then the EDX and EAX register values are also pushed onto the stack. All three arguments have now become conventional C++ function stack arguments. The __LookupMethod_glue@ function then invokes the __LookupMethod() C++ function. Here is the C++ source code prototype of __LookupMethod(): extern "C" void __LookupMethod(__vtbl_entry **windowed_vptr, __MethodSelector *pMethSel, __MetaClass_Super *pMetaClass); Now this function î implemented conveniently in a high level language î does the necessary look up (the metaclass object is called on to perform the method look up operation) of the method being invoked and as a side effect assigns a valid vtbl method index value into the vcall descriptor static structure. The method call can now be dispatched to completion. As a final step __LookupMethod_glue@ calls the __DispatchMethod@ function to complete the method invocation (with the EDX:EAX register pair holding the same values as before î the original ECX argument is now superfluous). This time the method call will dispatch directly to the appropriate virtual member function. To reiterate, the __LookupMethod() function, via the pointer to a vcall descriptor static structure variable it is passed as an argument, sets the methodIndex field of this structure to a vtbl class subsection index value î it is the side effect of calling this function. It also puts a virtual member function pointer into the vtbl (if it is not already in the vtbl due to some other previous method call that resulted in invoking the same virtual member function). Consequently, the next time the method call is executed, the method dispatching service will directly invoke the virtual member function pointer that is definitely now in the vtbl and that is dereferenced by the methodIndex and classIndex values. 6-2. Method dispatch recap The first time that a method call is executed it must be resolved. All subsequent executions of the method call are invariant in execution time (of course CPU instruction caching and virtual memory paging can cause wide variation in any application instruction sequence execution time). A method call, once resolved, remains resolved for the remaining duration of application execution (although the possibility exist for metaclass objects to reset vtbls to initial conditions and thus provide a method code êhot updatingë capability). Method calls that exist in application code, but that are never executed in a particular run time session, are never resolved. The cost of method call resolution is paid only when, or if, the call is first exercised. Effectively, the execution time overhead of method call resolution is spread throughout an applicationés execution progress. This behavior should lessen potential negative impact upon application start-up time stemming from C++ evolvable class implementation support. 7. Undispatchable Method Calls The down side of lazy resolution is that it is possible to attempt a method call that subsequently cannot be resolved and thus obstructs completing the method dispatch. The __LookupMethod() function will have to throw an exception to the application in the event that this occurs. To prevent such an undesirable scenario arising, class programmers should never remove virtual member function declarations when evolving classes into new generations. Obsolete methods should remain present in new class generations in order to provide backward compatibility to old program code that may still be êin the fieldë and still in use along side new program code. This rule can only be enforced through programmer convention under the lazy resolution approach. Perhaps in the future C++ development systems and persistent databases of application class objects (where class hierarchy and metaclass information exist in the database as schema information) can become wedded together. The act of compiling a new generation of a class might result in it being verified against existing database class schema information. Doing so could catch situations where a programmer has inadvertently removed a class method and the compiler could issue a diagnostic message. The AT&T Grail C++ development environment [Murray92] [Koenig92] would appear to be headed in a technology direction that may ultimately permit this manner of class evolvability safety verification once that system is fully wedded to database technology. 7-1. Weighing some alternatives Conceivably some other approach might be to attempt resolution of all application method calls at application load time. If any method call was determined to be unresolvable, then the application launch could be aborted. This would tend to minimize the consequences faced by a user in such an event (and certainly give the application programmer less to have to hassle with). Though the user would be left with the mystery of trying to figure out which application component has a C++ class generation that is no longer in compliance with its prior class generations. This mystery could be clarified by a sufficiently helpful error message diagnostic. (However, the same could be said of the lazy resolution approach î the difference being a matter of chronology as to when the diagnostic message is encountered.) The obvious penalty here, though, is of having to incur the total accumulative cost of all method call resolution at one time. For very large object-oriented applications this could perhaps lead to major delays in initiating application execution. Too, it would require resolving every method call whether it is actually later executed or not î you would pay for it even if you never actually used it (for significant applications we are talking about a matter of hundreds or thousands of method calls). One other slight advantage of the all up-front approach is that after method call resolution is completed, any method symbol table data structures that were used during the resolving process can be freed. An incremental approach, such as lazy resolution, must retain such data structures for the duration of the programés execution. An approach that resolves method calls when individual pages (or segments) of application code are loaded on demand presents no real advantage over the advocated lazy resolution technique either. In fact this approach bears much closer kinship with lazy resolution than the brute force, all up-front approach mentioned in the previous two paragraphs. It has the additional requirement, though, that the operating system provide a feature that enables applications to hook the process of loading pages (or segments) of their code into memory. (Lazy resolution is not dependent on any special operating system feature so is thus highly portable.) Too, this approach could likewise possibly encounter an unresolvable method call when well into the applicationés execution progress î the application here too would have to be thrown an exception resulting from an undispatchable method call. Given the use of an incremental method-call-resolution approach, the problem of undispatchable method calls can be viewed in much the same vein as divide by zero exceptions that possibly can result from an applicationés numeric calculations. Programmers attempt to build fire walls for data input screening so as to prevent such exceptions arising. None-the-less, prudent programmers may also enable their applications to gracefully recover in the event a divide by zero exception still manages to happen. (This is particularly wise for very large complex applications that may have a wide range of numeric calculations appearing in diverse areas of the program.) Hopefully programmers will exert due diligence when evolving their classes into new generations. And hopefully they will imbue their application with graceful recovery (or even graceful shutdown) in the event an undispatchable method call exception is thrown. As always, the market place will no doubt sift out those applications that are lackadaisical in this regard. 8. Method Look Up In Finer Detail Let us return now to the discussion of how method call dispatching takes place. We left off in the dispatch explanation where the C++ function __LookupMethod() is invoked by an unresolved method call. There are several steps that the __LookupMethod() function will have to go through in resolving a method call: ú The method will be searched for via the metaclass method look up service (described in the paper preceding this one [Voss93] on subject of C++ evolvable classes). If the method cannot be found in the classés derivation hierarchy then the application is thrown an undispatchable method call exception. Otherwise, the found virtual member function pointer is copied into and retained in a local variable for use at a later stage. ú The class name component, after extraction from the mangled method name string, must be hashed into an application-global class name symbol table. (This symbol table is managed privately by the method call dispatching service.) All metaclass objects will have hashed their classés class name strings into the class symbol table when they are constructed (at application start up time î fortunately the number of classes is much fewer than the number of methods or number of method calls). Consequently hashing a class name upon a method call will result in collision. The symbol table entry for a class will be a pointer to a classés method name registration table. (Refer to the schematic of the class name symbol table in Figure 2.) Figure 2 Figure 3 ú The mangled method name must now have a hash value computed for it and this hash value is searched for in the method name registration table. If a method-name-hash-token collision occurs, then that entry is further compared to see if it is the same as the mangled method name string. If it exactly matches then this index position in the registration table is the index value that will be copied to the methodIndex field of the vcall decsriptor static structure. (The memory layout of a typical method registration table is illustrated in Figure 3.) ú If an exact match is not found in the search then the first available zero entry in the registration table will be taken. The hash value of the method name and a pointer to the method name string are copied into this entry î making it a non-zero entry. The classés metaclass object knows the number of entries that should be permissible in the method registration table. Thus it dynamically allocates this table and places a pointer to it in the class symbol table entry that the class name string hashes to. ú The class ID index value obtained from the classIndex field of the vcall descriptor is now used in conjunction to the vptr to dereference into the vtbl and obtain a pointer to the class subsection. (Refer to figure 1 to see the layout schematic for a typical evolvable class vtbl.) The virtual member function pointers that implement the method protocol of the class will be incrementally placed into its vtbl class subsection. A classés vtbl subsection has enough function pointer slots to account for every method protocol that is introduced by the class (this number is known by a classés metaclass object and a metaclass object can be queried to obtain it î refer to the evolvable class paper [Voss93] preceding this one for more detail on this). ú The virtual member function pointer that was previously obtained and saved away must now be copied into its slot in the vtbl class subsection. The class ID index and method index values can now be used to dereference the appropriate vtbl slot in order to do this. Conceivably it may already be in the vtbl but to blindly copy it in again will do no harm. (To êblindly copyë would be mandatory if metaclasses support resetting the vtbl to enable a method code êhot updateë feature.) ú Finally, __LookupMethod() can exit so that the method call dispatch can be completed. 9. Run Time Type Identification One of the most recent and talked about C++ extension proposals is that of run time type identification (RTTI) [Stroustrup92]. The authorés attention and efforts toward C++ evolvable classes subject were initiated at the beginning of 1992. The authorés development of ideas and subsequent papers on evolvable classes proceeded independent of efforts underway then toward investigating RTTI for C++. However, the author now concedes the possibility of synergy in the underlying implementation of RTTI and evolvable class support. Likely synergy of these two was even alluded to at some extent in the preceding evolvable class paper [Voss93]. That paper also expressed the opinion that RTTI would be highly desirable in conjunction to evolvable classes where these language capabilities are applied in concert in programming the Object-Oriented Component Application Architecture (OOCAA)1. None-the-less, the proposal for C++ method dispatching presented herein has been framed in the context of assuming RTTI support to not be present. Yet due to the dynamic nature of evolvable classes the implementation must rely upon compiler generated metaclass information. RTTI leads to another variation of requiring compiler generated metaclass (or reflective) information. If one is augmenting a compiler that already supports RTTI to also support evolvable classes, then it would be logical to merge and/or share the metaclass requirements of the two. The author also concedes that the details of this proposal might stand need for revision if indeed RTTI support is already present in the particular C++ compiler being augmented. The validity of much of this proposal should probably hold up î such as the vcall descriptor mechanism and the concept of lazy resolution. The specific device of the class ID index that appears in the vcall descriptor may come under scrutiny, though. 10. Conclusion A technique for implementing a time invariant virtual member function dispatch for C++ evolvable classes has been presented. The focus was to enable evolvable classes with an invariant dispatch that is highly comparable in efficiency to the dispatch of virtual functions for conventional classes. Economy in memory space, execution time, and compiler implementation complexity were of the utmost importance in attempting to arrive at this proposal. Concepts such as lazy resolution of method calls and non-in-lining of the method dispatch service have been adopted with the intent of striking a fair balance between all these concerns. At the same time it was desired to introduce a certain amount of new freedom in C++ program configuration. Base class evolvability is certainly one of the most powerful expressions of new C++ flexibility. However, making the evolvable class method dispatcher a function callable service opens up additional degrees of freedom in: program performance tuning; method call debugging, message traffic monitoring, object state validation; and other feature extensions such as a class pose as capability as found in Objective C [Pinson91]. The author is also aware of existing C++ system designs that could capitalize on the method dispatching/binding technique described here to better implement object- oriented data base concepts. Currently attention has been focused on placing object data into data base management. The next step will be to implement C++ compilers that compile C++ method code into a data base retrievable file structure. The methods of a class object could then begin to be treated as data as well. Hot update to class protocol could be performed by loading new method implementations into the data base (where the data base identity of replaced method(s) remains the same), and then cause an applicationés metaclass objects to reset vtbl entries for these methods. The application would then immediately begin executing with the new class behavior supplied by reinstalled methods. Using the method-as-data, hot updating technique, it is also possible to imagine an application that configures itself by retrieving class objects (and their methods) from a network topology of object servers [Smith90]. Such an application would self- configure depending upon what a user has expressed a desire or intent to accomplish and what object personalities are at the applicationés disposal. Again, the dispatching scheme that has been presented would endow sufficient flexibility to undertake such an architecture vision and at the same time couple it to a rather efficient C++ run time. Acknowledgements I would like to acknowledge Peter Kukol and Tanj Bennett of Borland for their valuable critique and input toward my efforts of describing an evolvable class implementation for the C++ language. They have donated much valuable time and effort in reviewing my ideas, digesting my exhaustive email on the subject, and preparing their own proposal documents relating to this important C++ language issue. I would also like to cite credit to Peter Kukol for coining the term vcall descriptor which I have borrowed for use in this document. Peter Kukol also suggested the CFront compatible dispatch approach detailed in Appendix A. Lastly, I must also express my appreciation to my employer, Aldus Corporation, for recognizing the importance of this issue to the computer software development community at large and supporting my efforts toward popularizing the concern of C++ evolvable classes. References [Voss93] Roger D. Voss êC++ Evolvable Base Classes Residing In Dynamic Link Libraries,ë To appear in C++ Journal, Vol. 2, No. 4 (1993). [Ellis90] Margaret A. Ellis, Bjarne Stroustrup, The Annotated C++ Reference Manual, Addison-Wesley, Reading, MA, 1990. [Andr‚92] Pascal Andr‚, Jean-Claude Royer, êOptimizing Method Search with Lookup Caches and Incremental Coloring,ë Conference Proceedings of OOPSLA è92, October 18- 22, 1992. [Meyer92] Bertrand Meyer, Eiffel: The Language, Prentice Hall, New York, NY, 1992. [Murray92] Robert B. Murray, êA Statically Typed Abstract Representation for C++ Programs,ë Usenix C++ Conference Proceedings, USENIX Association (1992). [Koenig92] Andrew Koenig, êSpace-Efficient Trees in C++,ë Usenix C++ Conference Proceedings, USENIX Association (1992). [Stroustrup92] Bjarne Stroustrup, êRun Time Type Identification for C++,ë Usenix C++ Conference Proceedings, USENIX Association (1992). [Pinson91] Lewis J. Pinson, Richard S. Wiener, Objective-C: Object- Oriented Programming Techniques, Addison-Wesley, Reading, MA, 1991. [Smith90] K. Stuart Smith, Arunodaya Chatterjee, êA C++ Environment for Distributed Application Executionë and êResource Allocation in Distributed Object-Oriented Systems: A Proposed Solution,ë Microelectronics and Computer Technology Corporation, Austin, TX (1990), MCC Technical Report Number: ACT-ESP-015-91. Suggested Reading Andrew J. Palay, êC++ in a Changing Environment,ë Usenix C++ Conference Proceedings, USENIX Association (1992). Marion Bucko, Larry K. Raper, êOS/2 2.0 Technical Library System Object Model Guide and Reference,ë IBM Corporation, 1991, Document Number 10G6309. Richard Johnson, Murugappan Palaniappan, êMetaFlex: A Flexible Metaclass Generator,ë To appear in Conference Proceedings of ECOOP è93, July 26-30, 1993. (In meantime contact: Aldus Corporation, 411 First Avenue South, Seattle, WA 98104-2871.) __________________ 1Object-Oriented Component Application Architecture is a modular/dynamic approach to application design. Such an application will typically be built out of a foundation of shared libraries that export base classes. Such shared libraries (DLLs under Windows and OS/2) will usually consist of: application framework, graphics engine, data base/document engine, and various other shared, general services. The goal of the OOCAA approach is that it should be possible to upgrade and replace these shared libraries on an individual basis without adversely effecting existing applications and other shared libraries that are already deployed into the user base. Another aspect of the OOCAA design approach is that applications will be extensible via êdropping inë new object-oriented components. This variety of executable component will technically be a shared library but will provide implementation classes that are derived from base classes that are in the domain of the application proper. The extending classes of these add-on components will essentially graft onto the overall class inheritance hierarchy at application load time and application run time. Again, if the base classes that these classes inherit from are evolved in new shared library releases, then extension components should not be adversely effected. In short, OOCAA is a microcosm of object-oriented operating system attributes but applied at the level of the individual application. Appendix A The method dispatching code generation description presented in the main body of this paper is oriented toward native C++ compiler implementations. It assumes that the C++ compiler back-end generates object code directly for the intended target CPU. The 80386 CPU is used as a standard reference CPU given its widespread use. To get maximum efficiency it is often desirable to precisely control object code generation for C++ language features. C++ exception handling is an example where native C++ compilers hold a distinct advantage over CFront-based compilers in this regard. However, the evolvable class method dispatching code generation can also be easily expressed in high level C language, but with a bit more verbosity and hence a little less efficiency than that presented in the main body of this paper. Not only would this approach be amendable to CFont-based C++ compilers, it would also represent a shortcut route to implementing evolvable class support in native C++ compilers. This would be because all modifications could be made in the compiler front-end î no changes to the compiler back-end would be necessary. This alternative technique is easily and briefly presented in the source code example below î which in turn is based on the source code example taken from Listing 1. In this approach, the __DispatchMethod() function receives its arguments passed on the stack instead of in registers. It returns the dereferenced virtual function pointer as its function result. The virtual function pointer result is immediately dereferenced and executed with the method call arguments passed to it on the call stack. // synthesized declarations introduced prior to // the beginning of each source translation unit typedef struct __MethodSelector_tag { unsigned shortmethodIndex; unsigned shortclassIndex; char * methodName; } __MethodSelector;// typedefing is for benefit of C typedef struct __vtbl_entry_tag { void (*fct)(); } __vtbl_entry;// typedefing is for benefit of C typedef int (*__vFunc)(void*,...); // an ANSI C language prototype of the method dispatching routine extern __vFunc __DispatchMethod(__vtbl_entry **_windowed_vptr, __vtbl_entry **_base_vptr, __MethodSelector *_vcall_descriptor_ptr ); /*** beginning of actual .cpp source translation unit ***/ Menu::Menu(VString* itemName,MenuID id,BOOL largeMenu,BOOL isPopUp) : MenuItem(itemName,MenuItemID(id)) { fHMenu = HMENU(0); fLongMenu = largeMenu; fIsPopUp = isPopUp; fMenuItemID = (id == kNoResourceMenuID) ? ++gNextCreatedMenuID : id; fMenuItems = (LList*)NULL; if (gKnownMenus == NULL) gKnownMenus = new Dictionary; if (gKnownMenus != NULL) { DWORDCItem* theKey = new DWORDCItem(fMenuItemID); // if (!gKnownMenus->Find(theKey)) extern __MethodSelector __methSel_1;// Collection::Find() if ( ! ((int)(__DispatchMethod( *(((__vtbl_entry***)(Collection*)gKnownMenus) + 1), *(((__vtbl_entry***)(Dictionary*)gKnownMenus) + 1), &__methSel_1) (gKnownMenus,theKey))) ) { // gKnownMenus->AddKeyValuePair(theKey,this); extern __MethodSelector __methSel_2;// Dictionary::AddKeyValuePair() ((void)(__DispatchMethod( *(((__vtbl_entry***)(Dictionary*)gKnownMenus) + 1), *(((__vtbl_entry***)(Dictionary*)gKnownMenus) + 1), &__methSel_2) (gKnownMenus,theKey,this))); } } /*** ending of actual source translation unit ***/ // synthesized anonymous static variables extern char __methNameStr_1[]; extern char __methNameStr_2[]; static __MethodSelector __methSel_1 = { 0, 3, __methNameStr_1 }; static __MethodSelector __methSel_2 = { 0, 5, __methNameStr_2 }; static char __methNameStr_1[] = "@meth@Collection@Find$qn5CItem"; static char __methNameStr_2[] = "@meth@Dictionary@AddKeyValuePair$qn5CItemt1";