About Jolt.NET Libraries

Inspired by the Boost C++ libraries, Jolt.NET aims to complement the .NET Base Class Library (BCL) with algorithms, data structures, and general productivity tools. It is the hope of the authors that the features of Jolt.NET will one day be part of, or represented in, the BCL and the .NET Framework.

Anonymous Methods and Jolt Functors: Behind the Scenes

Good day all!  I haven’t written a meaningful article for some time now, as I’ve been fairly occupied with work tasks during the past few weeks.  I’ve been dabbling with an implementation of the Jolt.Functional library, which includes some C++-like functor manipulators (e.g. std::bind and functor composition, among others), and I thought it would be worthwhile to describe the motivation behind this task by showing what anonymous methods and lambda expressions really look like in MSIL.

For the purposes of this article, I will be analyzing the output of the C# 3.0 compiler.  Your .NET compiler may output something different, but at the time of writing this I suspect that the output will be similar as there are no native MSIL constructs to represent an anonymous method (and as you may see below, in all likelihood there is no reason to support such constructs).

MSIL Representation of Anonymous Methods

The C# 3.0 anonymous method syntax provides a convenient way for inlining method declarations within the body of other methods.  To declare an anonymous method, you can use either the delegate keyword or the less cumbersome lambda expression syntax.  The latter case is preferred as the compiler can generally infer the types of the method arguments from the usage of the expression.  Here is an example using both syntaxes to declare a predicate that determines if a given integer is equal to 100.



class FunctorExploration
{
    void CreateFunctors()
    {
        Predicate<int> isOneHundred = value => value == 100;
        Predicate<int> is100 = delegate(int value) { return value == 100; };
    }
}

Both of the given syntaxes present the illusion that an inline method is created with local method scope.  However, since MSIL has no notion of an anonymous or inline method declaration, the compiler’s transformation of this source results in MSIL that is semantically different.  As a software developer, it is important to be aware of these differences as they may affect expected performance.  Here is the MSIL representation of the CreateFunctors() method, with only one of the predicate declarations.



.method private hidebysig instance void CreateFunctors() cil managed
{
.maxstack 3
.locals init (
[0] class [mscorlib]System.Predicate`1<int32> isOneHundred)
L_0000: nop
L_0001: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate1
L_0006: brtrue.s L_001b
L_0008: ldnull
L_0009: ldftn bool FunctorExploration.FunctorExploration::<CreateFunctors>b__0(int32)
L_000f: newobj instance void [mscorlib]System.Predicate`1<int32>::.ctor(object, native int)
L_0014: stsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate1
L_0019: br.s L_001b
L_001b: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate1
L_0020: stloc.0
L_0021: ret
}

This function first attempts to load the value of the static field CS$<>9__CachedAnonymousMethodDelegate1 onto the evaluation stack.  If the value is null, then the static field is initialized to a delegate that refers to our isOneHundred predicate function.  The method referred to by the delegate is given as FunctorExploration::<CreateFunctors>b__0(int32), which is a compiler-generated static method that contains our predicate implementation.  Once the cache look-up is completed, the value of the static field is assigned to the local variable isOneHundred.  Here is an abbreviated version of the MSIL that declares the FunctorExploration class.



.class private auto ansi beforefieldinit FunctorExploration
extends [mscorlib]System.Object
{
.method private hidebysig static bool <CreateFunctors>b__0(int32 'value') cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: ldarg.0
L_0001: ldc.i4.s 100
L_0003: ceq
L_0005: stloc.0
L_0006: br.s L_0008
L_0008: ldloc.0
L_0009: ret
}

.method private hidebysig instance void CreateFunctors() cil managed { }

.field private static class [mscorlib]System.Predicate`1<int32> CS$<>9__CachedAnonymousMethodDelegate1
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
}
}

The purpose of the cached delegate is to make sure that there is only ever one instance of it in your code.  Whenever you pass the delegate around, you will always refer to the instance held in the static memory of the class.  Furthermore, there is no way to access this instance without using runtime reflection; the compiler and IDE hide all of the generated code from you.
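In C# terms, the generated code behaves roughly like the hand-written class below.  This is only a sketch under a few assumptions: the class, field, and method names are stand-ins (the real generated names are not legal C# identifiers), and the method returns the delegate so that the caching can be observed, whereas the original CreateFunctors() returns void.

```csharp
using System;

// Hand-written approximation of what the compiler emits for
//   Predicate<int> isOneHundred = value => value == 100;
// All names here are illustrative stand-ins, not the generated names.
public class FunctorExplorationLowered
{
    // Stands in for CS$<>9__CachedAnonymousMethodDelegate1.
    private static Predicate<int> s_cachedIsOneHundred;

    // Stands in for the generated static method <CreateFunctors>b__0.
    private static bool IsOneHundredImpl(int value)
    {
        return value == 100;
    }

    public Predicate<int> CreateFunctor()
    {
        // Create the delegate on first use, then reuse it on every call.
        if (s_cachedIsOneHundred == null)
        {
            s_cachedIsOneHundred = new Predicate<int>(IsOneHundredImpl);
        }

        return s_cachedIsOneHundred;
    }
}
```

Calling CreateFunctor() twice should hand back the same delegate instance, which is the behaviour the brtrue.s check in the MSIL implements.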

What happens if a method contains two or more anonymous method declarations?  If you surmised that the generated code depicted above is appropriately emitted for each declaration, then you are correct.  What is interesting is that if two anonymous methods are functionally equivalent (as in the first example), then the compiler will duplicate the method implementation, static field, and code to initialize the static field.  Here is an abbreviated version of the MSIL that declares the FunctorExploration class from the initial example containing two predicates.



.class private auto ansi beforefieldinit FunctorExploration
extends [mscorlib]System.Object
{
.method private hidebysig static bool <CreateFunctors>b__0(int32 'value') cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: ldarg.0
L_0001: ldc.i4.s 100
L_0003: ceq
L_0005: stloc.0
L_0006: br.s L_0008
L_0008: ldloc.0
L_0009: ret
}

.method private hidebysig static bool <CreateFunctors>b__1(int32 'value') cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: nop
L_0001: ldarg.0
L_0002: ldc.i4.s 100
L_0004: ceq
L_0006: stloc.0
L_0007: br.s L_0009
L_0009: ldloc.0
L_000a: ret
}

.method private hidebysig instance void CreateFunctors() cil managed
{
.maxstack 3
.locals init (
[0] class [mscorlib]System.Predicate`1<int32> isOneHundred,
[1] class [mscorlib]System.Predicate`1<int32> is100)
L_0000: nop
L_0001: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate2
L_0006: brtrue.s L_001b
L_0008: ldnull
L_0009: ldftn bool FunctorExploration.FunctorExploration::<CreateFunctors>b__0(int32)
L_000f: newobj instance void [mscorlib]System.Predicate`1<int32>::.ctor(object, native int)
L_0014: stsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate2
L_0019: br.s L_001b
L_001b: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate2
L_0020: stloc.0
L_0021: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate3
L_0026: brtrue.s L_003b
L_0028: ldnull
L_0029: ldftn bool FunctorExploration.FunctorExploration::<CreateFunctors>b__1(int32)
L_002f: newobj instance void [mscorlib]System.Predicate`1<int32>::.ctor(object, native int)
L_0034: stsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate3
L_0039: br.s L_003b
L_003b: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate3
L_0040: stloc.1
L_0041: ret
}

.field private static class [mscorlib]System.Predicate`1<int32> CS$<>9__CachedAnonymousMethodDelegate2
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
}

.field private static class [mscorlib]System.Predicate`1<int32> CS$<>9__CachedAnonymousMethodDelegate3
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
}
}
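The duplication can also be observed from C# without reading the MSIL.  The sketch below (the class and method names are my own, for illustration only) compares the methods behind the two predicates; it assumes a compiler that, like the one analyzed here, does not merge equivalent anonymous methods.

```csharp
using System;

public static class DuplicateLambdaDemo
{
    // Returns true when the two functionally equivalent predicates are
    // backed by different compiler-generated methods.
    public static bool AreBackedByDistinctMethods()
    {
        Predicate<int> isOneHundred = value => value == 100;
        Predicate<int> is100 = delegate(int value) { return value == 100; };

        // Delegate.Method exposes the MethodInfo of the generated method.
        return !isOneHundred.Method.Equals(is100.Method);
    }
}
```

If the compiler ever shared one implementation between the two declarations, this method would return false instead.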

Since anonymous methods are local to a function at design time, they may also access other local variables that share the same scope.  Let us generalize our functor to test for equality as follows.



class FunctorExploration
{
    void CreateEqualityPredicate<TValue>(TValue predicateConstant)
    {
        Predicate<TValue> equals = value => value.Equals(predicateConstant);
    }
}

When the compiler encounters this type of anonymous method declaration, it emits code that is substantially different from the previously shown static method cases.  Here is the MSIL representation of the CreateEqualityPredicate() method.



.method private hidebysig instance void CreateEqualityPredicate<TValue>(!!TValue predicateConstant) cil managed
{
.maxstack 3
.locals init (
[0] class [mscorlib]System.Predicate`1<!!TValue> equals,
[1] class FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue> CS$<>8__locals2)
L_0000: newobj instance void FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue>::.ctor()
L_0005: stloc.1
L_0006: ldloc.1
L_0007: ldarg.1
L_0008: stfld !0 FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue>::predicateConstant
L_000d: nop
L_000e: ldloc.1
L_000f: ldftn instance bool FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue>::<CreateEqualityPredicate>b__0(!0)
L_0015: newobj instance void [mscorlib]System.Predicate`1<!!TValue>::.ctor(object, native int)
L_001a: stloc.0
L_001b: nop
L_001c: ret
}

This method first instantiates the compiler-generated type FunctorExploration/<>c__DisplayClass1<!!TValue> (a nested type), and then initializes a field on this instance with the value of the captured local variable (here, the predicateConstant parameter).  The equals predicate is then initialized from a compiler-generated instance method given as <>c__DisplayClass1<!!TValue>::<CreateEqualityPredicate>b__0(!TValue 'value'), with the new instance as the delegate's target.  Here is an abbreviated version of the MSIL that represents the compiler-generated type.



.class auto ansi sealed nested private beforefieldinit <>c__DisplayClass1<TValue>
extends [mscorlib]System.Object
{
.method public hidebysig instance bool <CreateEqualityPredicate>b__0(!TValue 'value') cil managed
{
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: ldarga.s 'value'
L_0002: ldarg.0
L_0003: ldfld !0 FunctorExploration.FunctorExploration/<>c__DisplayClass1<!TValue>::predicateConstant
L_0008: box !TValue
L_000d: constrained. !TValue
L_0013: callvirt instance bool [mscorlib]System.Object::Equals(object)
L_0018: stloc.0
L_0019: br.s L_001b
L_001b: ldloc.0
L_001c: ret
}

.field public !TValue predicateConstant
}

The new type is created to give the anonymous method implementation its own reference to the local variable of interest.  Whenever CreateEqualityPredicate() is invoked, a new instance of the type is created and initialized with whatever value is given as a parameter to our function, and a new delegate is bound to that instance.  The delegate can no longer be cached in a static field since we are dealing with instance data.
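Written by hand, the transformation looks roughly like the following.  This is a sketch, not the compiler's literal output: DisplayClass and IsEqual are stand-in names for <>c__DisplayClass1 and <CreateEqualityPredicate>b__0, and the method returns the delegate so that it can be exercised.

```csharp
using System;

public class ClosureLowered
{
    // Stands in for the generated nested type <>c__DisplayClass1<TValue>.
    private sealed class DisplayClass<TValue>
    {
        // Holds the captured variable.
        public TValue predicateConstant;

        // Stands in for the generated method <CreateEqualityPredicate>b__0.
        public bool IsEqual(TValue value)
        {
            return value.Equals(predicateConstant);
        }
    }

    public Predicate<TValue> CreateEqualityPredicate<TValue>(TValue predicateConstant)
    {
        // A fresh instance per call, so each delegate sees its own constant.
        var locals = new DisplayClass<TValue>();
        locals.predicateConstant = predicateConstant;

        // The delegate's target is the instance holding the captured value.
        return new Predicate<TValue>(locals.IsEqual);
    }
}
```

Two calls with different constants yield two delegates with different Target objects, which is why a single static cache field would not work here.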

What happens when we declare multiple delegates in this fashion?  As in the static case, the compiler emits supporting code for each declaration: every anonymous method that captures a local variable gets its own generated method, hosted on a generated type that holds the captured variables.  Again, code duplication arises if there are two anonymous method declarations that are equivalent in functionality.  Also, as mentioned before, there is no way to access the generated entities from your own code without using runtime reflection.
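For the curious, reflection can at least list what was generated.  The sketch below uses hypothetical names (ReflectionTarget, GeneratedCodeInspector) and assumes that the compiler marks its closure types with CompilerGeneratedAttribute; the names it prints differ between compiler versions, so no particular output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;

public class ReflectionTarget
{
    // Captures predicateConstant, so the compiler must emit a closure type.
    public Predicate<TValue> CreateEqualityPredicate<TValue>(TValue predicateConstant)
    {
        return value => value.Equals(predicateConstant);
    }
}

public static class GeneratedCodeInspector
{
    // Collects the names of nested types marked [CompilerGenerated].
    public static List<string> FindGeneratedNestedTypes(Type type)
    {
        var names = new List<string>();
        BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic;

        foreach (Type nested in type.GetNestedTypes(flags))
        {
            if (nested.IsDefined(typeof(CompilerGeneratedAttribute), false))
            {
                names.Add(nested.Name);
            }
        }

        return names;
    }
}
```

Passing typeof(ReflectionTarget) to FindGeneratedNestedTypes() should return at least one name resembling the display class shown earlier.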

Jolt.Functional Motivation

The introduction of LINQ in the .NET Framework 3.5 has made anonymous methods an important language feature.  Without them, it is very tedious to define predicates in your code using function declaration syntax and then refer to them in your LINQ filter criteria.  However, the advantage of taking such an approach is that one is more likely to notice patterns in the predicate code and thus have the opportunity to refactor for code reuse.  This is not possible with compiler-generated code, and each time you introduce an anonymous method, you increase the amount of code duplication that may occur.

The Jolt.Functional library aims to minimize the amount of compiler-generated code by promoting functor reuse through the use of generic delegate declarations for commonly-used delegates, along with generic delegate factory methods for manipulating existing functions.  Here are some examples of these types of constructs.


  • Creation of idempotent functions (returning a constant value for any input)
  • Predefined idempotent functions (true for all, false for all)
  • Parameter binding and function composition
  • Predicate<T> adaptor for Func<T, bool> (and vice versa)
  • Action<T> adaptor for Func<T, TResult>
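To make the list above concrete, here is a minimal sketch of what such factory methods could look like.  The class and method names (FunctorSketch, Constant, BindFirst, and so on) are hypothetical and chosen for illustration; they are not the Jolt.Functional API.

```csharp
using System;

// Hypothetical helpers, for illustration only.
public static class FunctorSketch
{
    // A function that returns the same constant for any input.
    public static Func<T, TResult> Constant<T, TResult>(TResult constant)
    {
        return ignored => constant;
    }

    // Binds the first parameter of a two-argument function.
    public static Func<T2, TResult> BindFirst<T1, T2, TResult>(Func<T1, T2, TResult> function, T1 bound)
    {
        return arg => function(bound, arg);
    }

    // Composes two functions: the result computes outer(inner(value)).
    public static Func<T, TResult> Compose<T, TMiddle, TResult>(Func<TMiddle, TResult> outer, Func<T, TMiddle> inner)
    {
        return value => outer(inner(value));
    }

    // Adapts Func<T, bool> to Predicate<T> and back.
    public static Predicate<T> ToPredicate<T>(Func<T, bool> function)
    {
        return function.Invoke;
    }

    public static Func<T, bool> ToFunc<T>(Predicate<T> predicate)
    {
        return predicate.Invoke;
    }

    // Adapts Func<T, TResult> to Action<T> by discarding the result.
    public static Action<T> ToAction<T, TResult>(Func<T, TResult> function)
    {
        return value => { function(value); };
    }
}
```

Each helper still contains a lambda expression, but because the helpers are generic, that lambda is compiled once and reused by every caller, rather than being re-emitted at each call site.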

To illustrate the benefit of such constructs, consider the generic CreateEqualityPredicate() method from a previous example.  This method creates a predicate for each invocation.  However, since its implementation is generic, the compiler generates only one implementation of the predicate method and its supporting type.  Compare this with inlining many lambda expressions that test for various types of equality.  The functionality of the code is the same, but the MSIL and assembly size grow with every additional lambda expression!

For those concerned about assembly code bloat from anonymous methods, the Jolt.Functional library will help by reducing the amount of duplicate compiler-generated code in your program.  Also, if your language does not support anonymous methods, the Jolt.Functional library can help you organize your method declarations so that they may be combined and reused to form more complicated expressions.

Wiki Documentation Formatting

Please bear with the poor formatting in the wiki documentation for Jolt.NET.  Sometime within the past few weeks, the CodePlex wiki parser was changed and consequently any inline formatting is being incorrectly parsed.  This is causing certain style formats to run through to the end of the document, and other inline code formatting to be applied on a new line.

These changes are out of my control and hopefully this will be resolved very quickly!