Smallest Possible Attribute System, Part 3

Okay, barring suggestions from you, I'm done with SPAS. It does what I wanted it to do, and there's enough substance there for SPAS to be useful as a tutorial. The final result is here: AttributeBinder.h, AttributeBinder.cpp, and you'll need the StringMap.

Our Story To Date

In the first edition, I introduced a tiny attribute system based around a Binder per class, and a Descriptor per attribute within each bound class. By reader suggestion, I extended it to allow for inheritance, although I didn't wind up supporting multiple inheritance; you can read some of my research as to why multiple inheritance is not great for performant code here.

A later update changed the typeName in the Descriptor from a std::string to a const char*. This was born from the realization that the binders are all statically initialized anyway, so why duplicate the string data? Following reader comments, it turned out to be possible to eliminate more string copies.

Then, I introduced a type mechanism to replace the use of C++ RTTI.

Next, I introduced Bentley & Sedgewick's ternary search tree to make type lookup fast, and to allow typeId to be cached as an int. This code introduced some thread-unsafety, which could be mitigated by putting a critical section at the spot indicated in the StringMap code, or in the type registration code. My logic for not doing so at the moment is that the classes are all bound during static initialization, before main begins, and since the user won't have been able to launch any threads yet, there's no possibility of a stomp. If you were to add types later in a threaded runtime, some thread-safety mechanism would be needed. An alternative to the proposed critical section would be to implement a lock-free, CAS-style mechanism for the search tree; these are generally difficult to write, but given some similarities between the tree and a list, it might not be too hard here.
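
For readers who haven't met a ternary search tree, here is a rough sketch of the idea: each node holds one character and three links (lower, equal, higher), and a key's final node stores an int id that callers can cache. This is not the StringMap code from the download; the class name TernaryTree and its insert/find methods are made up for illustration, and like the real thing it is not thread safe.

```cpp
#include <memory>

// Hypothetical minimal ternary search tree mapping C strings to int ids.
// Illustrative only; this is not the StringMap used by SPAS.
class TernaryTree {
public:
    // Returns the id already assigned to key, or assigns the next free id.
    // Not thread safe: guard with a lock if keys can be added after startup.
    int insert(const char* key) {
        if (!key || !*key) return -1;          // empty keys are not stored here
        std::unique_ptr<Node>* link = &root_;
        for (;;) {
            if (!*link) link->reset(new Node(*key));
            Node* n = link->get();
            if (*key < n->ch)      link = &n->lo;
            else if (*key > n->ch) link = &n->hi;
            else if (key[1] == '\0') {          // last character: id lives here
                if (n->id < 0) n->id = nextId_++;
                return n->id;
            } else {                            // match: advance to next char
                link = &n->eq;
                ++key;
            }
        }
    }

    // Returns the id for key, or -1 if the key was never inserted.
    int find(const char* key) const {
        if (!key || !*key) return -1;
        const Node* n = root_.get();
        while (n) {
            if (*key < n->ch)      n = n->lo.get();
            else if (*key > n->ch) n = n->hi.get();
            else if (key[1] == '\0') return n->id;
            else { n = n->eq.get(); ++key; }
        }
        return -1;
    }

private:
    struct Node {
        char ch;
        int id;                                 // -1 until a key ends here
        std::unique_ptr<Node> lo, eq, hi;
        explicit Node(char c) : ch(c), id(-1) {}
    };

    std::unique_ptr<Node> root_;
    int nextId_ = 0;
};
```

Once a type name has been inserted, only the returned int needs to be compared at runtime. If types were registered after threads start, a mutex around insert would be the simplest fix.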

Meanwhile, back at the ranch

I had a look at reasonable applications for SPAS, and came up with:

Script variable binding to C++ data
Disk serialization
Network transmission
Editor binding

To support all of that, I thought it would be good to show binary reflection as part of the tutorial. Suddenly, the Smallest Possible Attribute System became not the smallest possible.

Feature Creep

I had certain design features I wanted to hit, like

having the reflected data generally immune to changes to the reflected classes. I also wanted to keep the reflected data as small as possible. That meant I needed a type and variable table in the reflected data.
reflecting contained objects as well as pointed-to objects. This introduced the need for class factories.
no overhead or wrappers on bound variables; reinterpret_cast is the means to get at a variable
does not impose vtables
minimal programmer interface
no templated attribute types - I want to keep the code as light as possible without needing to specialize the code per type
support POD (plain old data)
does not use built in C++ RTTI mechanisms - disassembly of typeinfo operations revealed a shocking world of string compares
text serialization would be an exercise for the reader. I personally don't need it, although the tutorial code includes a trivial version using ostream and stringstream.
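
To illustrate why pointed-to objects call for class factories: when a pointer member is read back, something has to allocate an object of the right type from whatever type identifier was stored. The sketch below shows one common way to do that, keyed by type name. FactoryRegistry, createInstance, and Widget are hypothetical names for this example, and the factories in AttributeBinder may well be organized differently.

```cpp
#include <map>
#include <string>

// Hypothetical factory registry: type name -> function that allocates one instance.
class FactoryRegistry {
public:
    typedef void* (*CreateFn)();

    void add(const std::string& typeName, CreateFn fn) {
        creators_[typeName] = fn;
    }

    // Returns a newly allocated instance, or nullptr for an unknown type name.
    // The caller owns the result and must delete it through the correct type.
    void* create(const std::string& typeName) const {
        std::map<std::string, CreateFn>::const_iterator it = creators_.find(typeName);
        return it == creators_.end() ? nullptr : it->second();
    }

private:
    std::map<std::string, CreateFn> creators_;
};

// Generic creator; a binding macro could register one of these per class.
template <typename T>
void* createInstance() {
    return new T();
}

// Example type used only for this illustration.
struct Widget {
    Widget() : value(42) {}
    int value;
};
```

A reader would call create with the stored type name, then fill in the new object's members through its descriptors.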

Current implementation limitations, which exist for no particularly good reason except that I don't currently need the missing features, or that lifting them would obscure the tutorial nature of the code:

No platform swizzles, so data can't be serialized on one machine and deserialized on a machine with different endianness.
No cross-compatibility between 32 and 64 bit systems.
No automatic padding to ensure aligned reads and writes.
No support for arrays of data.
No support for objects that use multiple inheritance.
No support for STL containers.
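
As a hint of what lifting the first limitation might involve, here is a small byte-swapping sketch. It assumes the file format is declared little-endian, which is my assumption and not something SPAS specifies; hostIsLittleEndian, swapBytes, and fromLittleEndian are hypothetical helpers, not part of AttributeBinder.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// True when the host stores the least significant byte first.
inline bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reverses the bytes of a trivially copyable value in place.
template <typename T>
void swapBytes(T& value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&value, bytes, sizeof(T));
}

// Converts a value whose bytes were written little-endian into host order.
// On a little-endian host this is a no-op.
template <typename T>
T fromLittleEndian(T value) {
    if (!hostIsLittleEndian()) swapBytes(value);
    return value;
}
```

A reader would apply fromLittleEndian to each scalar after copying it out of the buffer, using the size recorded in the descriptor to pick the width. The 32 versus 64 bit and alignment limitations need separate treatment.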

I refer you to the implementation and surrounding documentation to discover the serialization algorithm.

How about an example?

A minimal set of macros is used to reflect an object. Here's an example of three classes, demonstrating containment, pointers, and inheritance.

    #include <cstring>   // std::strcpy
    #include <string>

    class Bar
    {
    public:
        BIND_START;
        BIND(Bar, yay, float);
        BIND_END;

        Bar() : yay(33) { }

        float yay;
    };
    BIND_ATTRIBUTES(Bar);

    class Foo
    {
    public:
        struct FailType { };

        BIND_START;
        BIND(Foo, myInt, int);
        BIND(Foo, myFloat, float);
        BIND(Foo, stuff, char*);
        BIND(Foo, stuff2, std::string);
        BIND(Foo, yow, bool);
        BIND(Foo, fail, FailType);
        BIND(Foo, bar, Bar);            // containment
        BIND_END;

        Foo() : myInt(1), myFloat(2), yow(false)
        {
            std::strcpy(stuff, "Some text");
            stuff2 = "More text";
        }

        int myInt;
        float myFloat;
        char stuff[32];
        bool yow;
        Bar bar;
        std::string stuff2;
        FailType fail;
    };
    BIND_ATTRIBUTES(Foo);

    class Baz : public Foo
    {
    public:
        BIND_START;
        BIND_BASE(Foo);                 // inheritance
        BIND(Baz, bool1, bool);
        BIND(Baz, int2, int);
        BIND(Baz, float3, float);
        BIND(Baz, nullBarTest, Bar*);   // pointers
        BIND(Baz, barTest, Bar*);
        BIND_END;

        Baz() : bool1(true), int2(2), float3(3), nullBarTest(0)
        {
            barTest = new Bar();
            barTest->yay = 12.0f;
        }

        bool bool1;
        int int2;
        float float3;
        Bar* nullBarTest;
        Bar* barTest;
    };
    BIND_ATTRIBUTES(Baz);

These macros introduce a static member of type AttributeBinder to every class. AttributeBinder provides a few interesting members: the WriteBinary and ReadBinary methods, and a StringMap from attribute names to AttributeDescriptors. An AttributeDescriptor knows how to find the member in the class, how big the member is, and a little descriptive information such as its type. In general, one would look up a variable on the AttributeBinder, and then turn the descriptor into a pointer.

    Baz myFoo;
    AttributeBinder::AttributeDescriptor ad;
    Foo::binding.attribs.find("myInt", &ad);
    int* intPtr = (int*) ad.DataAddr(&myFoo);
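
If you're curious how a descriptor can hand back a raw pointer without wrapping the variable, one plausible approach is to record the member's byte offset at bind time and add it to the object's address later. The sketch below shows that idea only; MemberDescriptor, dataAddr, DESCRIBE, and Sample are names invented for this example, and the real AttributeDescriptor and BIND macro may work differently.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Hypothetical descriptor: records where a member lives inside its class.
struct MemberDescriptor {
    const char* name;      // attribute name, e.g. "scale"
    const char* typeName;  // type as written in the binding, e.g. "float"
    std::size_t offset;    // byte offset of the member within the object
    std::size_t size;      // size of the member in bytes

    // Address of the described member inside obj.
    void* dataAddr(void* obj) const {
        return static_cast<char*>(obj) + offset;
    }
};

// Plain struct used only for this illustration.
struct Sample {
    int   count;
    float scale;
};

// Illustrative macro (not the BIND macro from the post): builds a descriptor
// from a class name, member name, and type.
#define DESCRIBE(Class, member, type) \
    MemberDescriptor{ #member, #type, offsetof(Class, member), sizeof(Class::member) }

// Reads Sample::scale through a descriptor instead of naming the member directly.
inline float readScale(Sample& s) {
    const MemberDescriptor d = DESCRIBE(Sample, scale, float);
    return *static_cast<float*>(d.dataAddr(&s));
}
```

Note that offsetof is only guaranteed to work on standard-layout types; most compilers accept it for simple derived classes too, but that is conditionally supported behavior and worth checking on your platform.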

An Exercise for the Reader

I've left error handling as a dreaded exercise for the reader, especially since everyone's error handling mechanisms are so different, depending on the application. One could specify an error handling policy or trait on the AttributeBinder, but whatever.

The code is heavily documented (it's not self-documenting, I documented it!), and available here:

AttributeBinder.h, AttributeBinder.cpp

Enjoy! With a bit of thought some of the limitations, cruftiness, and bloat could be eliminated to make this code actually the Smallest Possible Attribute System.

Research

General searching revealed a number of homebrew type systems, and a variety of sophisticated implementations such as the one in Boost.Python. To show why SPAS has a purpose in the face of so many alternatives, I turn to Game Programming Gems volume 2, where examples and tutorial explanations of all the common mechanisms can be found. I recommend this volume for those of you wanting to do your own research into this topic.

Scott Wakeling, Dynamic Type Information, Game Programming Gems 2, pp. 38-45

Similarities:

Type info is a class statically embedded in each reflectable class.
Has a pointer to parent class to encode a class hierarchy.
No support for multiple inheritance, although it should be possible in Scott's system due to reliance on C++ RTTI.
Type equivalence can be determined for the class, or downcasts.

Differences:

Scott's system uses C++ RTTI in order to check type equivalence of upcasts.
Scott's system uses dynamic_cast, which requires the existence of a vtable.
Scott's system uses the << operator, which I avoid because I am not at all pleased with the disassembly I see, nor am I pleased with the appearance of << execution during profiling.
Scott's system doesn't handle pointer types.
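
The shared idea in the Similarities list, a statically embedded type record with a parent pointer, can be sketched in a few lines. This is an illustration of the general technique and not code from either system; TypeInfo, isA, and the Animal/Dog/Rock classes are hypothetical names.

```cpp
// Hypothetical type record, one static instance per reflectable class.
struct TypeInfo {
    const char*     name;
    const TypeInfo* parent;   // nullptr for a root class

    // True if this type is other, or derives from it (single inheritance only).
    bool isA(const TypeInfo& other) const {
        for (const TypeInfo* t = this; t; t = t->parent)
            if (t == &other) return true;
        return false;
    }
};

// Example hierarchy; no virtual functions are needed anywhere.
struct Animal       { static const TypeInfo type; };
struct Dog : Animal { static const TypeInfo type; };
struct Rock         { static const TypeInfo type; };

const TypeInfo Animal::type = { "Animal", nullptr };
const TypeInfo Dog::type    = { "Dog",    &Animal::type };
const TypeInfo Rock::type   = { "Rock",   nullptr };
```

Type checks here are pointer comparisons up the parent chain, so no string compares and no vtable are involved. The single parent pointer is also exactly why multiple inheritance doesn't fit this scheme.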

Charles Cafrelli, A Property Class for Generic C++ Member Access, Game Programming Gems 2, pp. 46-50

Charles' system is similar in intent to C# properties. He introduces a property class to wrap each variable, and a property set to iterate over them. SPAS also has a property set which can be iterated. Charles' system is a traditional templated specialization system.

Lasse Staff Jensen, A Generic Tweaker, Game Programming Gems 2, pp. 118-126

Lasse's system shares with SPAS the design goal of not imposing overhead on reflected variables, and the goal of having an absolutely minimal API burden. Lasse's system uses a templated class type to identify things, and imposes a vtable to get C++ RTTI.

Similarly to SPAS, Lasse's system uses a map to go from the name of a variable to the variable, and it uses reinterpret_cast to get at the real variable.

Spaces Between

2008-07-07 00:48:27