Friday, 28 March 2014

Fast casting of C# Structs with no unsafe code (but still kind of "unsafe")

C++ allows us to perform any cast between memory pointers. It's basically up to you to ensure the correct types are cast, to prevent memory problems.

C#, however, doesn't allow this out of the box, unless you resort to unsafe code and perform the pointer conversion yourself, pretty much like in C++.

The problem is that unsafe code is not supported on all platforms, and generally it's a good idea to avoid it for as long as you can.

So, imagine we have two structs like this:


    public struct STA
    {
        public int CustomerID;
        public float CustomerRate;
    }
    public struct STB
    {
        public int CustomerID;
        public float CustomerRate;
    }

One of them is yours, and the other one comes from an API or legacy software you don't have access to. Now, imagine you need to convert one into the other. How would you approach that?

Obviously, if you try to simply assign one to the other, it just won't compile, because the compiler treats STA and STB as unrelated types.
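As a small illustration (the Program class is only a hypothetical host for the snippet), the offending assignments are left commented out below, together with the compiler errors they would produce:

```csharp
using System;

public struct STA { public int CustomerID; public float CustomerRate; }
public struct STB { public int CustomerID; public float CustomerRate; }

public static class Program
{
    public static void Main()
    {
        STB struct_b = new STB { CustomerID = 7, CustomerRate = 1.5f };

        // Uncommenting the next line fails with:
        // error CS0029: Cannot implicitly convert type 'STB' to 'STA'
        // STA struct_a = struct_b;

        // An explicit cast is rejected as well (error CS0030), because no
        // conversion between the two types is defined.
        // STA struct_a2 = (STA)struct_b;

        Console.WriteLine(struct_b.CustomerID); // prints 7
    }
}
```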



Of course, the most evident (and safest) solution is to create a new struct of type STA and copy the contents from STB into it:

    struct_a = new STA { CustomerID = struct_b.CustomerID, CustomerRate = struct_b.CustomerRate };

The drawback is that this approach is slower and carries a memory overhead, which may not be acceptable in some cases.

If performance is a critical issue, you are sure that both structs are 100% compatible and share the exact same memory layout, and that both come from compatible platforms... why not fool the compiler into simply assuming that they are compatible types?

As we mentioned, in unsafe C# code this can be achieved simply by casting pointers, just like in C++. But if you mark your C# code as unsafe, it can be rejected on some platforms. Is there a way to do that without using unsafe code? Yes, there is.

C# StructLayout to the rescue

Perfectly safe C# code allows you to explicitly define the offset of struct members, using attributes from System.Runtime.InteropServices, just like this:


    [StructLayout(LayoutKind.Explicit)]
    public struct STA
    {
        [FieldOffset(0)]
        public int CustomerID;
        [FieldOffset(4)]
        public float CustomerRate;
    }
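If you want to double-check the declared layout, one possible sketch is to ask the Marshal class for the offsets and total size. The LayoutCheck class is a hypothetical name, and the generic Marshal overloads used here assume .NET Framework 4.5.1 or later:

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct STA
{
    [FieldOffset(0)]
    public int CustomerID;
    [FieldOffset(4)]
    public float CustomerRate;
}

public static class LayoutCheck
{
    public static void Main()
    {
        // Marshal reports the marshaled layout, which for this struct of
        // simple numeric fields matches the offsets declared above.
        int idOffset = Marshal.OffsetOf<STA>("CustomerID").ToInt32();
        int rateOffset = Marshal.OffsetOf<STA>("CustomerRate").ToInt32();
        int size = Marshal.SizeOf<STA>();

        Console.WriteLine(idOffset + " " + rateOffset + " " + size); // prints "0 4 8"
    }
}
```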

This allows you to do tricky things like placing two different members of the struct at the same offset, creating something similar to C++ unions:


    [StructLayout(LayoutKind.Explicit)]
    public struct Union
    {
        [FieldOffset(0)]
        public STA StructA;
        [FieldOffset(0)]
        public STB StructB;
    }

Note that both StructA and StructB are at the same field offset, and therefore occupy the exact same location in memory. As both share the same memory layout, the result is that you have ONE single object in memory, and two different references (kind of pointers) to it, each one using a different type.

Now, we can do the following:


            STA struct_a;
            STB struct_b;
            ...
            Union stu = new Union();
            stu.StructB = struct_b;
            struct_a = stu.StructA;

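As you can see, no STA has been built member by member, and we have skipped the whole process of converting data from one struct to the other: the bytes written through StructB are simply read back through StructA. Bear in mind that structs are value types, so each assignment still copies the struct as a whole block.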
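Put together as a complete program, the conversion might look like the following sketch. The sample values and the Program class are only there for illustration:

```csharp
using System;
using System.Runtime.InteropServices;

public struct STA { public int CustomerID; public float CustomerRate; }
public struct STB { public int CustomerID; public float CustomerRate; }

[StructLayout(LayoutKind.Explicit)]
public struct Union
{
    [FieldOffset(0)] public STA StructA;
    [FieldOffset(0)] public STB StructB;
}

public static class Program
{
    public static void Main()
    {
        STB struct_b = new STB { CustomerID = 42, CustomerRate = 0.5f };

        Union stu = new Union();
        stu.StructB = struct_b;      // write the bytes as an STB
        STA struct_a = stu.StructA;  // read the same bytes back as an STA

        Console.WriteLine(struct_a.CustomerID); // prints 42
        Console.WriteLine(struct_a.CustomerID == 42 && struct_a.CustomerRate == 0.5f); // prints True
    }
}
```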

However, please be aware that this is kind of cheating... You are fooling the compiler into accepting it, but in practice you are performing a classical pointer conversion, even though you are using purely safe code.

PLEASE BE AWARE that this approach doesn't take endianness into account. Platforms with different byte order store the bytes of multi-byte values in opposite order. For example, if the bytes behind STA were produced on a big-endian platform and STB is read on a little-endian platform (or just the opposite), the bytes of each field will be interpreted in reverse. It doesn't take differences in data types into account either, so you must be very careful to ensure that every field has the same type and size, at the same offset, in both structs.
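To see why matching data types matter, here is a minimal sketch with a hypothetical IntFloat struct that overlaps an int and a float. The bits are reinterpreted, not converted:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct, used only to show what a type mismatch does.
[StructLayout(LayoutKind.Explicit)]
public struct IntFloat
{
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public float AsFloat;
}

public static class Program
{
    public static void Main()
    {
        IntFloat u = new IntFloat();
        u.AsFloat = 1.0f;

        // The bit pattern of 1.0f (0x3F800000) is read back as an integer.
        Console.WriteLine(u.AsInt); // prints 1065353216

        // Byte order of the machine running this code.
        Console.WriteLine(BitConverter.IsLittleEndian);
    }
}
```

Reading 1065353216 where you expected 1 is exactly the kind of silent corruption you get when the two structs do not agree on their field types.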

So, remember:
if (same endianness && same data types)
                              you are good to go!

Functional improvements

The Union struct we have created can be made much more comfortable to use if you add operators to it.

For example, comparison operators like this:

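        // STA and STB define no == of their own, so compare with Equals.
        // C# also requires every == overload to have a matching != overload.
        public static bool operator ==(Union left, STA right)
        {
            return left.StructA.Equals(right);
        }
        public static bool operator !=(Union left, STA right) { return !(left == right); }
        public static bool operator ==(Union left, STB right)
        {
            return left.StructB.Equals(right);
        }
        public static bool operator !=(Union left, STB right) { return !(left == right); }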

Will allow you to simply compare Unions with the original types:

if(union == struct_a)

And even more comfortable, adding implicit operators like this:

        public static implicit operator Union(STA value)
        {
            Union ret = new Union();
            ret.StructA = value;
            return ret;
        }

Will allow you to simply assign one type to the other like this:

            STA struct_a;
            ...
            Union union = struct_a;
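A self-contained sketch combining both kinds of operators could look like this. It assumes comparison through Equals (STA and STB define no == of their own) and includes the != counterpart that C# requires. The implicit conversion from Union to STB is an extra added here for illustration:

```csharp
using System;
using System.Runtime.InteropServices;

public struct STA { public int CustomerID; public float CustomerRate; }
public struct STB { public int CustomerID; public float CustomerRate; }

// Equals/GetHashCode overrides are omitted for brevity, so silence the warnings.
#pragma warning disable 660, 661
[StructLayout(LayoutKind.Explicit)]
public struct Union
{
    [FieldOffset(0)] public STA StructA;
    [FieldOffset(0)] public STB StructB;

    public static bool operator ==(Union left, STA right) { return left.StructA.Equals(right); }
    public static bool operator !=(Union left, STA right) { return !(left == right); }

    public static implicit operator Union(STA value)
    {
        Union ret = new Union();
        ret.StructA = value;
        return ret;
    }

    // Assumed addition: lets a Union be assigned directly to an STB.
    public static implicit operator STB(Union value) { return value.StructB; }
}
#pragma warning restore 660, 661

public static class Program
{
    public static void Main()
    {
        STA struct_a = new STA { CustomerID = 3, CustomerRate = 2.0f };

        Union u = struct_a;   // implicit STA -> Union
        STB struct_b = u;     // implicit Union -> STB

        Console.WriteLine(u == struct_a);       // prints True
        Console.WriteLine(struct_b.CustomerID); // prints 3
    }
}
```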

Memory footprint improvements

One small drawback of this approach is the need to create a struct of type Union each time you want to perform a conversion of this kind. A simple solution is to perform the operation in a static Union object. It's a bit messy, but it works. For instance, if you declare the struct like this:

    [StructLayout(LayoutKind.Explicit)]
    public struct Union
    {
        [FieldOffset(0)]
        public STA StructA;
        [FieldOffset(0)]
        public STB StructB;

        public static Union StaticRef = new Union();

        public static STA ToSTA(STB pStructB)
        {
            StaticRef.StructB = pStructB;
            return StaticRef.StructA;
        }
        public static STB ToSTB(STA pStructA)
        {
            StaticRef.StructA = pStructA;
            return StaticRef.StructB;
        }
    }

You can now re-use the same static object over and over again, doing things like:

            STA struct_a;
            STB struct_b;
            ...
            struct_a = Union.ToSTA(struct_b);
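Keep in mind that a shared static field is not thread-safe. If the conversion may be called from several threads, a variant that works on a local Union avoids the shared state. This is only a sketch of that alternative, keeping the same method names:

```csharp
using System;
using System.Runtime.InteropServices;

public struct STA { public int CustomerID; public float CustomerRate; }
public struct STB { public int CustomerID; public float CustomerRate; }

[StructLayout(LayoutKind.Explicit)]
public struct Union
{
    [FieldOffset(0)] public STA StructA;
    [FieldOffset(0)] public STB StructB;

    // Each call works on its own local Union, so nothing is shared between threads.
    public static STA ToSTA(STB pStructB)
    {
        Union local = new Union();
        local.StructB = pStructB;
        return local.StructA;
    }

    public static STB ToSTB(STA pStructA)
    {
        Union local = new Union();
        local.StructA = pStructA;
        return local.StructB;
    }
}

public static class Program
{
    public static void Main()
    {
        STB struct_b = new STB { CustomerID = 5, CustomerRate = 0.25f };
        STA struct_a = Union.ToSTA(struct_b);

        Console.WriteLine(struct_a.CustomerID); // prints 5
    }
}
```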

Hope it helps!! Cheers...

1 comment:

Anonymous said...

Nice example, but I wouldn't use a static variable at all. Allocating a new struct on the stack (local variable) is very fast, it just increments the stack pointer by the size of the struct. Copying a value to this variable should be comparable to or faster than copying into a static variable.

The biggest advantage is that it's thread-safe. Making the static variable [ThreadStatic] isn't a good idea either, it will consume the same memory as a local variable, but it won't "free" it after use... and access to it will probably be slower. Sticking to a local variable for conversions is the best idea IMHO.