Saturday, March 8, 2014

Strong Typing and Initialization for Generic Vectors

With C++11 becoming more and more stable I decided to talk about some of the features C++11 brings to the table that can enforce strong typing, and how that will affect parts of MiniLib, my minimalistic general purpose library, or more specifically the part of the library relating to linear algebra.

I often see vector and matrix types using hard-coded dimensions, often taking the shape of Vector2, Vector3, Vector4, Matrix2, Matrix3, and Matrix4 classes being separate from each other. There are some benefits to this. Knowing exactly the dimensions of you data types can lead to some good hand optimizations, most notably using SIMD functionality. This method also enforces strict typing, where type mismatches are caught at compile-time. However, it does also mean that a lot of redundant coding, which is a waste of time as well well as an increase in error rate (more code = more errors).

Another way of writing vector and matrix classes is to dynamically allocate dimensions. However, this brings a severe problem; There is nothing inherently different between, say, a three dimensional vector and a two, or four dimensional vector. This means that vectors of different sizes can mix with each other unless the programmer does sanity checks when performing operations on two vectors or matrices moving errors from compile-time to run-time when the former should be favored. Aside from this, dynamically allocated vectors and matrices are difficult for the compiler to automatically optimize, since their size is not known at compile-time.

A third option for implementing linear algebra types is templates. Templates can be used to define sizes of arrays. This brings a few advantages; it enforces strong typing, it avoids unnecessary duplication of code, and it makes automated optimizations more obvious for the compiler, while at the same time leaving the option open for the programmer to specialize a template for specific hand optimizations. This is the implementation that I have chosen for MiniLib. All is not well however. The old C++ standard has had a long standing, glaring hole where construction of templated vectors and matrices have proven problematic. Simply put, there is no way to initialize a number of elements in a non-aggregate through a constructor without writing a specialized constructor taking the exact number and type of arguments that are expected by the object construction.

There have been a few ways around this though; Passing a single data structure that holds the data to be copied to the constructor, using a makeshift initializer list by overloading the comma operator and passing that to the constructor, and by using the variable argument count language construct. While the first option is the lesser of the three evils, using a separate data structure to hold data does not really solve the problem, it only shifts it from the main data type to the data structure holding the data to be copied. The second option is very much like the first one, only it tries to solve the problem through obscure means (i.e. overloading an operator that is best left alone). It should be noted that the initializer list might be implemented without using the comma operator, but in the end it is still a very inelegant solution. The makeshift initializer might also not be performant unless care is taken in its implementation.

Taking this into account I landed on using the variable argument count method, the same method used for the standard C function print(). It looks something like this:



However, this is a somewhat dangerous road to tread. Not only is there a performance penalty involved for iterating through the variable argument list, but there is also nothing in the way of type safety. There is not even any mechanism in place for guaranteeing that the actual number of input parameters match the expected number of input parameters. This could potentially lead to trouble if the programmer using the library is not careful.

This is where the C++11 standard comes in and saves the day. C++11 provides some excellent tools for overcoming these limitations. There is the option of std::initializer_list, and there is the option of variadic templates. One is an extension of the standard library, the other is an extension of the language itself, but both of these alternatives provide compile-time safety. To be fair, the former option most likely also uses language extensions behind the scenes.





As you can see C++11 has opened up the possibility for elegant handling of purely mathematical data types by providing a number of tools used here to create strong typing that can be enforced at compile-time. Unfortunately, while I am incredibly tempted to implement these solutions into my own linear algebra library I still feel somewhat hamstrung by the current rate of adoption of C++11. Newer compilers may handle all of this with ease, but I seldom have the luxury to work with the latest and greatest. Still, people looking to implement flexible linear algebra libraries should look into the new features in C++11.

No comments:

Post a Comment