About Jolt.NET Libraries

Inspired by the Boost C++ libraries, Jolt.NET aims to complement the .NET Base Class Library (BCL) with algorithms, data structures, and general productivity tools. It is the hope of the authors that the features of Jolt.NET will one day be part of, or represented in, the BCL and the .NET Framework.

Jolt.NET 0.4 Release and Future Features

Good day readers!

Yesterday I completed the final work items for Jolt.NET 0.4 and produced the release, available for download at CodePlex.  As mentioned earlier, this was primarily a maintenance release aimed at addressing some long-standing code quality issues and external library upgrades.  However, the release does contain some new features, summarized below.

  • Jolt.NET XML Assertions are now compatible with the Visual Studio test environment, through a set of adapter types
  • XML doc comment parsing for MethodInfo objects that refer to an operator
  • Generated proxy assemblies may now be signed programmatically or via XML application configuration
  • All Jolt.NET assemblies are built with a strong name, enabling integrity verification and use with other strong-name assemblies

For more information on these features, please refer to the Jolt.NET library documentation.

The feature list for the next release of Jolt.NET remains undefined as I am planning to take on a commercial project that will use the Jolt.NET libraries.  Jolt.NET features will be fueled by the requirements of this new project, but features won't be added unless they are sensible additions for a library.  I do maintain a short-list of potential feature work, and will say that the following features are likely to make the cut for the next Jolt.NET release.

  • Verification of the equality axiom, for implementations of Object.Equals and IEquatable<T>
  • XmlEqualityAssertion and XmlEquivalencyAssertion implementations that accept XPath statements

Do you have any feature requests?  If so, please post them on the work item page, or vote-up existing tasks!

Limitations to XML Doc Comment Parsing

Hello everyone!

I've been working on improving the support for XML doc comment queries with metadata types, but came across some limitations in the reflection API that have hindered my progress.  Specifically, I've been trying to add support for retrieving doc comments for methods containing a function pointer as a parameter, and methods containing a parameter that has been decorated with an optional or required custom modifier.  Unfortunately, given the current state of the reflection API (which I elaborate upon below), I don't believe that I can implement these features and achieve a general solution that works for all inputs.


The goal of the function pointer feature is to be able to retrieve the XML comments given in the following code example.

public ref class T
{
public:
    /// <summary/>
    /// <param name="param">A function pointer of managed types</param>
    void FnPtrMethod(String^ (*param)(TimeSpan, DateTime, array<int>^)) { }
};

Since a function pointer is a native C++ entity, it is not recognized by the CLR and consequently I don't believe it to be possible to adequately represent it with a metadata object.  Given the method FnPtrMethod above, the managed type of its sole parameter param is rendered by the reflection API as System.IntPtr.  This makes sense since the function pointer is really a raw native pointer to the address of executable code.  It makes no difference that its arguments are managed types, since the type that encompasses them is not managed.  If all we have to work with is an IntPtr instance, then it is impossible to infer that the IntPtr is really a function pointer and fetch its managed arguments for processing.

I'm fairly confident that my analysis is correct, but in the hope of being wrong I've started a discussion on StackOverflow just to be sure.  If you know the trick to get this to work, please respond to this blog entry or to the discussion on StackOverflow.


The goal of the custom modifier feature is to be able to retrieve the XML comments given in the following code example.

public ref class T
{
public:
    /// <summary/>
    /// <param name="x">modopt</param>
    /// <param name="y">modreq</param>
    void modifiers(const int x, volatile int y) { }
};

This feature may be implemented, but only for a small number of variations in the types of method parameters.  The problem lies in the placement of the ParameterInfo.GetOptionalCustomModifiers() and ParameterInfo.GetRequiredCustomModifiers() methods.  These methods are specific to a parameter, and I believe that they should be generalized and applied to any instance of System.Type.  For instance, consider the following method.

public ref class C
{
public:
    /// <summary/>
    /// <param name="param">Generic action containing modreq/modopt generic argument.</param>
    generic <typename T>
    void f(Action<const volatile T>^ param) { }
};

The sole parameter param of function f does not have a custom modifier; however, the generic method argument T that participates in the definition of the parameter type does.  In this case, it doesn't make sense to call param.GetOptionalCustomModifiers() since the modifiers apply to T.  To get the metadata for T, we need to call GetGenericArguments(), which returns a System.Type array.  But System.Type does not provide a means to get its custom modifiers!  A similar problem exists with array and pointer types, as custom modifiers are applied to the element type of these entities.  We need to call GetElementType() on the array/pointer type, but are again presented with a System.Type instance and no way to get its custom modifiers.

Regarding this issue, I've opened a bug report with Microsoft in hopes that the API flaw will be addressed in a future version of the .NET Framework.

However, all is not lost.  We can still use the ParameterInfo.GetOptionalCustomModifiers() and ParameterInfo.GetRequiredCustomModifiers() methods to get custom modifiers for a parameter that is not an array, pointer, or decorated with the by-ref/out attributes (all of these types require us to call GetElementType() to get the type to which the modifier is applied).  Also, the parameter cannot be dependent on a generic method or type parameter having a custom modifier applied to it.  This will at least expose the feature to a small number of methods, but it clearly won't work in the general case.  Consequently, I've decided to postpone implementing this feature until I get clearer direction from the community.  If you would like to see this feature implemented in Jolt.NET, please vote it up on the Jolt.NET CodePlex web site.
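To make the supported case concrete, here is a minimal C# sketch of reading the modifiers attached directly to a parameter. The names CustomModifierSketch, Sample, and Describe are hypothetical and not part of Jolt.NET.  Since C# does not emit modifiers for ordinary parameters, the sketch also reads a volatile field, which the C# compiler marks with modreq(IsVolatile), simply to show a non-empty result from the analogous FieldInfo method.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class CustomModifierSketch
{
    // Hypothetical sample type, used only for illustration.
    public class Sample
    {
        // The C# compiler emits modreq(IsVolatile) for volatile fields.
        public volatile int Counter;

        public void Plain(int x) { }
    }

    // Lists the modifiers attached directly to a single parameter.
    // Modifiers on element types or generic arguments are NOT visible here.
    public static string Describe(ParameterInfo parameter)
    {
        string required = string.Join(", ", parameter.GetRequiredCustomModifiers().Select(t => t.Name));
        string optional = string.Join(", ", parameter.GetOptionalCustomModifiers().Select(t => t.Name));
        return parameter.Name + ": modreq[" + required + "] modopt[" + optional + "]";
    }

    // Reads the required modifiers of the volatile field above.
    public static Type[] RequiredModifiersOfCounter()
    {
        return typeof(Sample).GetField("Counter").GetRequiredCustomModifiers();
    }
}
```

For a method compiled from C++/CLI, such as the modifiers method shown earlier, Describe would be the place where the modopt and modreq types appear.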

Recent Jolt.NET Revisions and Xml Doc Comment Parsing

I try to make a habit of posting to the development blog each time a significant feature or piece of code gets committed to the source repository.  Consequently, I would like to use this post to summarize what has been committed in the past month as several updates have been made.  Also, I'll describe some of the work I've been doing on matching XML doc comment elements with their corresponding metadata type from the System.Reflection namespace.

Commit Summary

Maintenance and refactoring are the main aspects of the Jolt.NET 0.4 release.  Prior to this release, I intentionally delayed many code clean-up tasks, as well as upgrades of 3rd party dependencies, so that I could work on more important features.  Now that all those features are complete, I have spent some time restoring the code to its "pristine" state.  Here is a summary of the recent updates related to maintenance.

  • Updated QuickGraph dependency to QuickGraph 3.3.40824
    • Removed FSM->MSAGL conversion code as MSAGL is no longer supported by QuickGraph (superseded by GLEE)
    • Removed explicit implementation of equality semantics for Transition class as it is now supported natively by QuickGraph's EquatableEdge type
  • Updated RhinoMocks dependency to RhinoMocks 3.6
    • Modified relevant unit tests to utilize Arrange-Act-Assert syntax
  • Updated NUnit dependency to NUnit 2.5.2
    • Adopted the use of new constraints to simplify and/or strengthen existing unit tests
    • Added additional unit tests to verify presence of attributes on types and their members, a task facilitated by the new constraints in NUnit 2.5
  • Unit test maintenance
    • Fixed many issues that prevented unit tests from being run in an NUnit project (aggregating many test fixtures)
    • Moved much of the reflection code for accessing types and members by strings into separate classes, improving the readability of some unit tests

The following commits introduced new features that were previously planned to be included with the Jolt.NET 0.4 release.

  • The Jolt.Convert class will now correctly generate the XML doc comments representation of an explicit or implicit operator
    • Predefined .NET operators were already supported
    • Consequently, you can now process XML doc comments with a System.Reflection.MethodInfo type referring to an operator
  • Created the Jolt.Testing.Assertions.VisualStudio.XmlAssert class to integrate the Jolt XML assertions to the Visual Studio test framework

XML Doc Comment Processing

"Processing the XML File (C# Programming Guide)" describes the supported XML doc comment markup for various types, methods, parameters, and fields.  For a given metadata type instance, Jolt.Convert will produce the correct markup, with the exception of the following constructs.

  • Function pointer parameter (ELEMENT_TYPE_FNPTR)
  • Optional custom modifier (ELEMENT_TYPE_CMOD_OPT)
  • Required custom modifier (ELEMENT_TYPE_CMOD_REQ)
  • Pinned field (ELEMENT_TYPE_PINNED)
  • Dimensionless and rank-less array (ELEMENT_TYPE_GENERICARRAY)

In order to verify that my implementation is correct, I compare the output of a .NET compiler with the output of the Jolt.Convert class.  Since the C# language does not currently support these constructs directly, other means are required for testing the implementation of the Jolt.Convert class, which are demonstrated below.

To produce XML doc comments with ELEMENT_TYPE_FNPTR, ELEMENT_TYPE_CMOD_OPT, and ELEMENT_TYPE_CMOD_REQ markup, we may use the C++/CLI compiler to compile the following class.


public ref class XmlDocCommentTest
{
public:
    typedef int (*function_ptr)(char, int, double);

    void fnptr(function_ptr f);   // ELEMENT_TYPE_FNPTR
    void mod_opt(const int n);    // ELEMENT_TYPE_CMOD_OPT
    void mod_req(volatile int n); // ELEMENT_TYPE_CMOD_REQ
};


Function pointers are a common construct for C/C++ developers, but understanding why const and volatile translate to optional and required modifiers requires some explanation.  Paul DiLascia covers this topic in his article "C++ At Work: Rationales, Highlights, and a Farewell".

ELEMENT_TYPE_PINNED is a bit trickier since the general C++ literature on pinned pointers states that their usage is restricted to non-static local stack variables, which cannot be decorated with XML doc comments.  However, the System.Runtime.CompilerServices namespace gives some hints on how to discover that a field is pinned.  Unfortunately, I do not know of a way to verify this behavior using a .NET compiler or other tool.

Finally, ELEMENT_TYPE_GENERICARRAY appears to be deprecated as I can not locate any reference to it in modern .NET documentation (apart from the aforementioned document).

In the meantime, I plan to implement support for ELEMENT_TYPE_FNPTR, ELEMENT_TYPE_CMOD_OPT, and ELEMENT_TYPE_CMOD_REQ in Jolt.NET 0.4.  For ELEMENT_TYPE_PINNED, I will wait until the feature is highly requested or until I stumble upon a tool that will produce the desired output.

Jolt.NET 0.3 and Future Features

This past afternoon, I committed source revision #26346, containing the final feature work for Jolt.NET 0.3.  This release took longer than expected because I added features to the release after thinking it may be too small, and those features ended up taking longer to implement than expected.  In the future, I will try to make the releases more timely, as long releases make the project appear unmaintained.

Jolt.NET 0.3 contains the following new features:

Please refer to the issue tracker for all features covered by the release, and to the release page for download options.

Jolt.NET 0.4 will be a short maintenance release, which will involve the upgrade of external dependencies.  The dependency upgrade will allow me to simplify some existing code by utilizing new external library features, as well as learn how to use such features so that they may be applied in the future.  Furthermore, there have been many maintenance-related tasks I’ve been neglecting, and I feel that now is a good time to address them.  Some new features will be added, but only those that were previously identified and not included in a preceding release.

Error Reporting for Xml Equivalency Assertion

In my last post, I sketched out the algorithms dealing with XML equality and equivalency assertions. During the testing of my implementations, I discovered a fundamental problem with the XML equivalency algorithm that made it a nuisance to use: when element sequence is ignored, the report of the inequivalent element can be too generic to be usable. In this post, I will describe some of the problems I encountered and the currently-implemented remediation. For conciseness, use of the term inequivalent will imply that only element sequence ordering is ignored.

XML Equivalency Assertion Algorithm

When we attempt to determine if a list of nodes is equivalent to another list of nodes, we generally perform a set-equality operation on two list-like data structures, comparing each element in one list to every element in the other list. Furthermore, given that the elements are not ordered, we can’t stop searching for an element once we detect the first mismatch since the element may exist further ahead in the list of elements to search. Reporting the precise element causing the inequivalency thus becomes difficult since we need to store extra information: should we stop the search because the mismatch is indeed the mismatch we are looking for, or should we ignore the mismatch so we can continue searching for the inequivalency in the rest of the tree?

Here is an algorithm that attempts to address the requirements of the XML equivalency assertion, as noted above.


void AreEquivalent(XElement expected, XElement actual)
{
    HashSet<XElement> expectedElements = new HashSet<XElement>(expected.Elements());
    HashSet<XElement> actualElements = new HashSet<XElement>(actual.Elements());

    if (expectedElements.Count != actualElements.Count)
    {
        // Report 'actual' as offending element.
        throw new Exception();
    }

    foreach (XElement expectedChild in expectedElements)
    {
        XElement equivalentChild = actualElements.FirstOrDefault(e =>
        {
            try { AreEquivalent(expectedChild, e); }
            catch(Exception) { return false; }
            return true;
        });

        if (equivalentChild == null)
        {
            // Report 'expectedChild' as offending element.
            throw new Exception();
        }

        actualElements.Remove(equivalentChild);
    }
}

This algorithm, hereafter referred to as AreEquivalent, runs in quadratic time (expected for set comparison of two lists), and leverages the optimization that only the first inequivalent element needs to be reported. If all inequivalent elements were to be reported, we would need to implement one of the many tree-diff algorithms, all of which have greater runtime complexity.

Reporting Offending Elements

Consider the following XML snippets which are clearly inequivalent due to the presence of an unexpected child element from the root node.


<!-- Expected XML -->
<parent>
    <descendant xmlns="ns"/>
    <descendant xmlns="ns"/>
    <childNode/>
</parent>

<!-- Actual XML -->
<parent>
    <bogusNode/>
    <descendant xmlns="ns"/>
    <descendant xmlns="ns"/>
</parent>

We want to report that bogusNode is the offending element, but at the same time we want to avoid categorizing the following related XML snippets as invalid.


<!-- Expected XML -->
<root>
    <parent>
        <descendant xmlns="ns"/>
        <descendant xmlns="ns"/>
        <childNode/>
    </parent>
    <parent>
        <descendant xmlns="ns"/>
        <descendant xmlns="ns"/>
        <bogusNode/>
    </parent>
</root>

<!-- Actual XML -->
<root>
    <parent>
        <bogusNode/>
        <descendant xmlns="ns"/>
        <descendant xmlns="ns"/>
    </parent>
    <parent>
        <descendant xmlns="ns"/>
        <childNode/>
        <descendant xmlns="ns"/>
    </parent>
</root>

At first glance, AreEquivalent solves the problem of reporting bogusNode from the first pair of XML documents. However, although AreEquivalent detects the offending element, it fails to report this element when the recursive calls unwind back to the original caller. If you trace AreEquivalent for the examples above, you will notice that it always reports the root element of the XML as the offending element, for any inequivalent pair of XML documents.

Algorithm Improvement

The problem with AreEquivalent is that if the condition that detects the offending element evaluates to true in one stack frame, it will also evaluate to true in the preceding stack frame, and so on until the recursive call completely unwinds. In other words, reporting that bogusNode is inequivalent also causes its parent to be reported as inequivalent through the same condition (see lines 9, 17, and 24 of the listing).

I spent a considerable amount of time modifying AreEquivalent in an attempt to address this problem. Unfortunately, many of my attempts failed as they either reintroduced the condition issue in another manner, or failed to address the issue of premature search termination. I ultimately realized that while a single pass through the XML is desirable, it was simply insufficient for what the problem demanded.

As you may recall, the algorithm for strict equality is a linear-time algorithm that positionally compares elements from two documents. If the elements at a given position are not equal, the algorithm terminates immediately and classifies the documents as not equal. Ideally, AreEquivalent could behave this way if the element ordering of the two documents were similar. So, a new solution to the equivalency problem is as follows.

  • Order the elements of a document such that they match the order of elements in the reference document as best as possible
  • Run the equality algorithm with the reference and transformed documents
    • The algorithm terminates upon detection of the first inequivalent node, or if the two documents are equal

What does “as best as possible” mean? It means that we need AreEquivalent to do some additional work in attempting to detect candidate inequivalent elements during the canonicalization step. For instance, if the best match for a given element is an inequivalent one that differs only in number of child elements, we would want to reorder this element so that the equality algorithm reports a specific and relevant error: inequivalency due to mismatch of number of child elements, for the specific elements involved.

Here is the algorithm that performs the canonicalization of element order. It runs in quadratic time and is invoked recursively prior to calling AreEquivalent.


void NormalizeElementOrder(IEnumerable<XElement> expectedChildren, List<XElement> actualChildren)
{
    int nextUnorderedElementIndex = 0;
    foreach (XElement expectedChild in expectedChildren)
    {
        int numChildElements = expectedChild.Elements().Count();

        Predicate<XElement> isEquivalentToExpectedChild = new Predicate<XElement>(e =>
        {
            // Similar to AreEquivalent, but without the recursive call to determine child element equivalency.
            try { AreElementsEquivalent(expectedChild, e); }
            catch (Exception) { return false; }
            return true;
        });
        Predicate<XElement> isEquivalentToExpectedChild_Strict =
            e => isEquivalentToExpectedChild(e) &&
                 numChildElements == e.Elements().Count();

        // Look for the element that best resembles expectedChild considering the configuration
        // of the assertion, along with the number of child elements for expectedChild.
        int equivalentChildIndex = actualChildren.FindIndex(nextUnorderedElementIndex, isEquivalentToExpectedChild_Strict);

        if (equivalentChildIndex < 0)
        {
            // We couldn't find a matching element. Relax the search criteria
            // by removing the matching child element count constraint and try again.
            equivalentChildIndex = actualChildren.FindIndex(nextUnorderedElementIndex, isEquivalentToExpectedChild);
            if (equivalentChildIndex < 0) { return; } // No match found, stop normalizing element order.
        }

        SwapElement(actualChildren, equivalentChildIndex, nextUnorderedElementIndex);
        ++nextUnorderedElementIndex;
    }
}
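The listing above depends on helpers that are not shown (AreElementsEquivalent and SwapElement).  As a rough, self-contained illustration of the reordering idea only, the following sketch substitutes a hypothetical name-only comparison (NamesMatch) for AreElementsEquivalent and works on a copy of the actual children.  The class name ElementOrderSketch and its members are assumptions made for this example, not Jolt.NET code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ElementOrderSketch
{
    // Hypothetical stand-in for AreElementsEquivalent: compares qualified names only.
    private static bool NamesMatch(XElement expected, XElement actual)
    {
        return expected.Name == actual.Name;
    }

    private static void SwapElement(List<XElement> elements, int i, int j)
    {
        XElement temp = elements[i];
        elements[i] = elements[j];
        elements[j] = temp;
    }

    // Returns the children of 'actual', reordered to follow 'expected' as closely as possible.
    public static List<XElement> Normalize(XElement expected, XElement actual)
    {
        List<XElement> actualChildren = actual.Elements().ToList();
        int next = 0;

        foreach (XElement expectedChild in expected.Elements())
        {
            int childCount = expectedChild.Elements().Count();

            // Strict search first: same name and same number of child elements.
            int index = actualChildren.FindIndex(next,
                e => NamesMatch(expectedChild, e) && e.Elements().Count() == childCount);

            // Relaxed search: same name only.
            if (index < 0) { index = actualChildren.FindIndex(next, e => NamesMatch(expectedChild, e)); }

            // No candidate at all: leave the remaining elements where they are.
            if (index < 0) { break; }

            SwapElement(actualChildren, index, next);
            ++next;
        }

        return actualChildren;
    }
}
```

Applied to the first pair of documents from the "Reporting Offending Elements" section, the reordered children are descendant, descendant, bogusNode, so a positional comparison would then meet childNode against bogusNode at the last position.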

Is this the best solution?  I hope so!  Readers, please share your thoughts on this algorithm along with ideas for improving it.

Now that the equivalency algorithm is implemented and tested, I can resume working on the NUnit and VSTS adapters to the core assertion library.  I am hoping to have the relevant sources committed within the next week or so.

XML Equality and Equivalency

It’s been a while since I wrote about Jolt.NET, so I thought I’d give a preview of the functionality from the XML assertions I’ve been working on.  I’m currently working on wrapping-up the core assertion functionality, which can ultimately be exposed through test-framework-specific interfaces.

When I attempt to verify XML in unit tests, I usually am performing one of the following tasks.

  • Schema validation
  • Verifying that a tree contains a particular value in a specific location
  • Verifying that a tree contains the same structure and values as another tree

For the purposes of this discussion, I will focus on the third point as it is trivial to accomplish the first two using .NET XML validation and XPath, respectively.  On a side note, Jolt.NET XML assertions will support all three of these operations.
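As a small illustration of the second task, here is a minimal sketch that checks a value at a specific location using the XPath extension methods for LINQ to XML.  The class name XPathCheckSketch and the sample document used in the note below are made up for this example; this is not how the Jolt.NET assertions are exposed.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathCheckSketch
{
    // Returns the text value of the first element matched by the XPath
    // expression, or null when nothing matches.
    public static string ValueAt(string xml, string xpath)
    {
        XElement match = XDocument.Parse(xml).XPathSelectElement(xpath);
        return match == null ? null : match.Value;
    }
}
```

A unit test could then compare the returned string with the expected value, for instance by calling ValueAt with the document <order><item id='2'>ink</item></order> and the expression /order/item[@id='2'].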

Strict Equality

What does it mean for two XML documents to be equal?  In the strictest sense, we can apply a byte-per-byte comparison of two XML files and determine if all byte pairs match.  Clearly, this doesn’t work in the general case since things like whitespace, character encoding, and attribute ordering, will all impede the success of the equality algorithm.  All of these items can vary greatly in an XML document and still not change the semantics or structure of the XML.

In order for the algorithm to work in the general case, it must treat XML entities (elements, attributes, etc…) as single units and define equality semantics for those units.  Furthermore, XML parser support can eliminate the need for dealing with whitespace, processing instruction elements, comments, and other entities that do not affect the semantics of the XML document.

That said, here is a recursive definition that determines if two XML elements are considered to be equal by value (XML document equality is achieved by comparing root elements for equality).  The algorithm currently used by Jolt.NET XML assertions is based on this definition, and relies on the parsing provided by XmlReader to simplify the implementation.

Elements E and F are equal if and only if:

  • The namespace of E equals the namespace of F
  • The name of E equals the name of F
  • The text node values contained in E are equal to the text node values contained in F, and must be in the same order
  • The set of attributes contained in E equals the set of attributes contained in F
  • The number of child elements of E equals the number of child elements of F
  • For each child element pair (E.child[i], F.child[i]), E.child[i] is equal to F.child[i]

Attributes A and B are equal if and only if:

  • The namespace of A equals the namespace of B
  • The name of A equals the name of B
  • The value of A equals the value of B

You will notice that this algorithm does not discriminate on the position of an element’s attribute.  All that is required is that the attribute in question exists in the set of attributes of the compared element, as defined by the rules for attribute equality.  On the contrary, child element ordering is considered as part of the evaluation since an XML schema [element-sequence] generally implies order.
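The definition above can be sketched in a few lines of C#.  Note that this sketch uses LINQ to XML (XElement) rather than XmlReader, which is what the Jolt.NET implementation relies on, so it should be read as an illustration of the rules and not as the library's code.  The class name XmlEqualitySketch is hypothetical, and skipping namespace declaration attributes (xmlns) is an assumption of this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlEqualitySketch
{
    public static bool AreEqual(XElement e, XElement f)
    {
        // XName equality covers both the namespace and the local name.
        if (e.Name != f.Name) { return false; }

        // Text node values must match, in document order.
        if (!TextValues(e).SequenceEqual(TextValues(f))) { return false; }

        // Attributes are compared as a set; their position is irrelevant.
        Dictionary<XName, string> eAttributes = AttributeMap(e);
        Dictionary<XName, string> fAttributes = AttributeMap(f);
        if (eAttributes.Count != fAttributes.Count) { return false; }
        foreach (KeyValuePair<XName, string> pair in eAttributes)
        {
            string value;
            if (!fAttributes.TryGetValue(pair.Key, out value) || value != pair.Value) { return false; }
        }

        // Child elements are compared pairwise, by position.
        List<XElement> eChildren = e.Elements().ToList();
        List<XElement> fChildren = f.Elements().ToList();
        if (eChildren.Count != fChildren.Count) { return false; }
        for (int i = 0; i < eChildren.Count; ++i)
        {
            if (!AreEqual(eChildren[i], fChildren[i])) { return false; }
        }

        return true;
    }

    private static IEnumerable<string> TextValues(XElement element)
    {
        return element.Nodes().OfType<XText>().Select(t => t.Value);
    }

    // Assumption: xmlns declarations are not treated as ordinary attributes.
    private static Dictionary<XName, string> AttributeMap(XElement element)
    {
        return element.Attributes()
            .Where(a => !a.IsNamespaceDeclaration)
            .ToDictionary(a => a.Name, a => a.Value);
    }
}
```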

Sometimes, constraints such as namespaces and element ordering pose too much of a restriction and make XML comparison tedious.  Jolt.NET XML assertions aim to address this issue by allowing the option to relax some of the constraints of the equality algorithm.

Equivalency – Relaxed Equality

In Jolt.NET, equivalency between two XML elements is defined by choosing to relax any combination of a set of constraints.  The following directives are currently supported.

  • For element comparisons, ignore element namespaces
  • For element comparisons, ignore element values (child text nodes)
  • For element comparisons, ignore ordering of child elements
  • For element comparisons, ignore their attributes
  • For attribute comparisons, ignore attribute namespaces

The interesting algorithm from the above list of directives is the one that processes child elements, yet ignores their ordering.  Here is another recursive definition that determines if two sets of child elements are equivalent.  The algorithm currently used by Jolt.NET XML assertions is based on this definition.

Assume that the method bool AreEquivalent(element, element) exists and denotes if two given elements are equivalent according to the user-specified equivalency directives and constraints, listed above.  Then, two sets of child elements, C and D, are equivalent if and only if:

  • The number of elements in C equals the number of elements in D
  • For each child element C[i], there exists a unique child element D[j] such that AreEquivalent(C[i], D[j]) == true

Note that the term “unique” is important – it denotes that any given element can only be matched once for the entire operation. The use of indexes i and j shows that the ordering of elements is not important and that a search is required to locate the desired element.
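To illustrate the role of "unique" matching, here is a minimal sketch in which each element of D can be consumed only once.  For simplicity it assumes that only child ordering is ignored and that two elements are equivalent when their names match and their children are equivalent; the class name XmlEquivalencySketch is hypothetical.  Taking the first available match is sufficient here because this simplified relation is transitive; a relation without that property would need a more careful matching strategy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlEquivalencySketch
{
    // Simplified element equivalency: same qualified name, equivalent children.
    public static bool AreEquivalent(XElement e, XElement f)
    {
        return e.Name == f.Name && AreChildrenEquivalent(e.Elements(), f.Elements());
    }

    public static bool AreChildrenEquivalent(IEnumerable<XElement> c, IEnumerable<XElement> d)
    {
        // Elements of D that have not been matched yet.
        List<XElement> unmatched = d.ToList();

        foreach (XElement child in c)
        {
            int index = unmatched.FindIndex(candidate => AreEquivalent(child, candidate));
            if (index < 0) { return false; }

            // Consume the match so that it cannot be paired a second time.
            unmatched.RemoveAt(index);
        }

        // Any leftover element means the two sets differ in size.
        return unmatched.Count == 0;
    }
}
```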

So there you have it – the definitions of equality and equivalency for XML.  Do let me know if you think I’ve missed something in my definitions!

Jolt.NET Feature Triage

You may have noticed through the feature RSS feed that I've been busy opening/closing items and shuffling things around.  I've postponed work on some documentation tasks that didn't make much sense for a project of one developer.  Also, I decided that I won't implement the Tuple library as the upcoming release of the .NET Framework 4.0 will contain an implementation similar to what I had planned.  .NET 4.0 is currently available in beta/CTP for download.

The triage slashed the number of items that I wanted to complete for Jolt.NET 0.3, and that prompted me to think about including another feature in the release.  When I first started work on Jolt.NET, I wanted to include a test facility to make assertions on XML data structures.  After releasing Jolt.NET 0.1, I decided not to implement this feature as it would be available in NUnit 3.0.  However, I think this decision was premature as not all testing frameworks will support this feature.

So over the next few days, I will be drafting an implementation and feature set for extending a test framework to support assertions on XML data.  I suspect that I will need some adapter classes to expose the feature in various test frameworks, and at this time, I'm not sure how many of these frameworks to support.

Static Idempotent Functors

The task of replacing inline anonymous methods with a generic functor from the Jolt.Functional library brought forth some interesting issues, all of which have been addressed as of revision #19483.  The main topic that I will discuss in this post deals with the implementation and usage of the Functor.Idempotency and Functor.TrueForAll generic factory methods.

The TrueForAll factory method creates a delegate that returns true for any input, and thus it is natural to implement this method in terms of the Functor.Idempotency method since TrueForAll is a Boolean idempotent predicate.  My main use of TrueForAll in Jolt.NET was in testing the serialization of finite automata to GraphML.  If you’ve used this library, you may recall that there are limitations when serializing a TransitionPredicate.  Specifically, the predicate must refer to a static method.  So, is it possible to implement a generic idempotent method that is static?  Let’s take a look at the function signature that we need to satisfy, along with the current implementation.

using System;

// One-arg idempotent method; overloads exist for zero through four args.
static Func<T, TResult> Idempotency<T, TResult>(TResult constant)
{
    return arg => constant;
}

In this example, the method that ultimately implements the inline expression cannot be static since it uses a variable that is not local to the expression (see “Anonymous Methods and Jolt Functors: Behind the Scenes” for more information).  Consequently, the resulting delegate refers to a non-static method.  If the hidden class that implements the expression were static, then we could only ever have one idempotent delegate per unique specialization per app-domain, as demonstrated in the following example.


using System;

static class IdempotencyWrapper<TConstant>
{
    // One shared slot per closed generic type, per app-domain.
    internal static TConstant constant;
    internal static TConstant Idempotency<T>(T arg)
    {
        return constant;
    }
}

static Func<T, TResult> Idempotency<T, TResult>(TResult constant)
{
    IdempotencyWrapper<TResult>.constant = constant;
    return IdempotencyWrapper<TResult>.Idempotency<T>;
}

void UseFunctors()
{
    Func<int, bool> falseForAll = Idempotency<int, bool>(false);
    Func<int, bool> trueForAll = Idempotency<int, bool>(true); // Oops! Changed the behavior of falseForAll.
}

Another novel attempt at solving this problem is to use an expression tree to force the user-provided constant to be local to the expression.

using System;
using System.Linq.Expressions;

static Func<T, TResult> Idempotency<T, TResult>(TResult constant)
{
    // Embed the constant in the expression body so that nothing is captured.
    Expression<Func<T, TResult>> expression = Expression.Lambda<Func<T, TResult>>(
        Expression.Constant(constant, typeof(TResult)),
        new[] { Expression.Parameter(typeof(T), "arg") });
    return expression.Compile();
}

This solution looks good, has some performance deficiencies we can live with, and may even provide the static method that we need.  However, the method that is produced is compiled and created at runtime.  In other words, it doesn’t exist in the assembly and is thus not feasible for textual serialization.

I’m also uncertain on whether this method is truly static.  The MethodInfo object representing the delegate’s method states that IsStatic is true, but the delegate’s Target property gives a non-null value.  Very strange.

If we want to persist the method that is represented by the expression, we can utilize the Reflection.Emit API and create an assembly that contains the desired method.  The performance deficiency is even greater with this solution, and it is overkill for the general usage of the Idempotency method.  While the serialization problem is addressed, the deserialization problem is not; this generated assembly is not distributed with the library and thus the referenced methods may not exist in the deserialization environment.

In conclusion, I could not conceive of a way to modify the Idempotency method implementation to meet my criteria, so I decided to keep it as is.  The implementations of TrueForAll and FalseForAll were modified to return constants local to the expressions, thus making the resulting delegate refer to a static method.
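For readers who want a delegate that is guaranteed to refer to a static method regardless of how a given compiler lowers lambda expressions, one alternative is to bind the delegate to a named static generic method.  The sketch below illustrates that alternative; it is not the Jolt.NET implementation, and the class name StaticFunctorSketch and the helpers AlwaysTrue and AlwaysFalse are hypothetical.

```csharp
using System;

public static class StaticFunctorSketch
{
    // Named static methods: the constant lives inside the method body.
    private static bool AlwaysTrue<T>(T arg) { return true; }
    private static bool AlwaysFalse<T>(T arg) { return false; }

    // Method group conversion binds the delegate directly to the static method.
    public static Func<T, bool> TrueForAll<T>() { return AlwaysTrue<T>; }
    public static Func<T, bool> FalseForAll<T>() { return AlwaysFalse<T>; }
}
```

A delegate created this way reports Method.IsStatic as true and has a null Target, which is the shape that a serializer restricted to static methods would need.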

Adaptors For Delegate Types

Recently, I’ve been working on creating adaptors for delegates of equivalent types.  Given the generic types Predicate<T> and EventHandler<TEventArgs>, I would like to create a function that transforms instances of these types into Func<T, bool> and Action<object, TEventArgs> respectively, and vice-versa.  This is a fairly straightforward task to accomplish, but implementing a correct solution depends on what you expect the resulting delegate to reference.

My goal is to create a new adaptor delegate that refers to the same method that is referred to by the adaptee.  When implementing the adaptor function for the Predicate<T>/Func<T, bool> case, I implemented the incorrect solution twice.  On the second attempt, I also implemented the incorrect test code, which was very negligent on my part.  It was only when other unrelated tests started failing that I noticed my error: I had changed the semantics of the implementation and consequently the expectations of the unit tests.

Let’s look at three ways to implement a function that creates an adaptor delegate for Predicate<T>.

Option 1 : Lambda Expressions


using System;

Func<T, bool> CreateFunc<T>(Predicate<T> predicate)
{
    return arg => predicate(arg);
}

This solution uses a lambda expression to create a new delegate that forwards its call to the adaptee.  The forwarding operation is implemented by the compiler as a new method on a compiler-generated type.  Clearly, this does not achieve my goal since the method referred to by the adaptor is different from that referred to by the adaptee.

Option 2 : Delegate Constructor 

using System;

Func<T, bool> CreateFunc<T>(Predicate<T> predicate)
{
    return new Func<T, bool>(predicate);
}

This solution initializes an instance of the adaptor delegate with the adaptee instance.  It looks like a copy constructor has been invoked, as the constructor accepts anything that implements the delegate’s signature.  However, don’t be deceived by the copy-constructor-like syntax of this operation.  The constructor has in fact added the adaptee to the adaptor’s invocation list, so the adaptor’s Target is the adaptee delegate itself.  Consequently, the method referred to by the adaptor is the Invoke() method of the adaptee.  If the constructor argument were a method instead, then the resulting adaptor delegate would refer directly to the given method.

Option 3 : Delegate Factory Method

using System;

Func<T, bool> CreateFunc<T>(Predicate<T> predicate)
{
    return Delegate.CreateDelegate(typeof(Func<T, bool>), predicate.Target, predicate.Method) as Func<T, bool>;
}

This solution utilizes a bit of reflection and a static factory method on the Delegate class to construct the adaptor.  Notice that the factory method accepts a MethodInfo object, denoting the method that the resulting delegate will directly invoke.  At last, we have arrived at the solution that implements my expectations.  Even though the adaptor and adaptee delegates are of different types, they both internally refer to the same method.
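To make the difference between the three options concrete, here is a small sketch that compares the method each adaptor refers to with the adaptee's method.  The class and method names (AdaptorComparison, ViaLambda, ViaConstructor, ViaFactory, IsPositive) are made up for this illustration and are not part of Jolt.NET.

```csharp
using System;

static class AdaptorComparison
{
    public static bool IsPositive(int value) { return value > 0; }

    // Option 1: lambda expression that forwards to the adaptee.
    public static Func<T, bool> ViaLambda<T>(Predicate<T> predicate)
    {
        return arg => predicate(arg);
    }

    // Option 2: delegate constructor given a delegate instance.
    public static Func<T, bool> ViaConstructor<T>(Predicate<T> predicate)
    {
        return new Func<T, bool>(predicate);
    }

    // Option 3: factory method given the adaptee's target and method.
    public static Func<T, bool> ViaFactory<T>(Predicate<T> predicate)
    {
        return (Func<T, bool>)Delegate.CreateDelegate(
            typeof(Func<T, bool>), predicate.Target, predicate.Method);
    }

    static void Main()
    {
        Predicate<int> adaptee = IsPositive;

        // Does the adaptor refer to the same method as the adaptee?
        Console.WriteLine(ViaLambda(adaptee).Method.Equals(adaptee.Method));      // False
        Console.WriteLine(ViaConstructor(adaptee).Method.Equals(adaptee.Method)); // False
        Console.WriteLine(ViaFactory(adaptee).Method.Equals(adaptee.Method));     // True
    }
}
```

All three adaptors return the same results when invoked; they differ only in which method the delegate is bound to, which is exactly what my unit tests should have been checking.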

There is one gotcha that you should be aware of, which at first looked to me like a bug in the .NET Framework or misstated information in the documentation.  If you refer to the documentation on the Delegate.CreateDelegate(Type, MethodInfo) method, you will notice that the method supports both static and instance methods for binding (.NET 2.0 and onwards).  However, disassembling the method in question shows that this overload always passes a null reference as the target object of the new delegate.  That is exactly what a static method needs, but for an instance method it can only produce an open instance delegate, that is, a delegate whose type declares the target object as an explicit first parameter.  It cannot bind an instance method to a particular target.  In my example above, I use another overload that accepts the target parameter to ensure that both static and instance methods are adaptable.
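The behaviour of the two overloads with an instance method can be checked with a few lines of code.  This is a sketch under my own assumptions: the Threshold class and its IsBelow method are hypothetical and exist only to give CreateDelegate an instance method to bind.

```csharp
using System;
using System.Reflection;

class Threshold
{
    private readonly int limit;
    public Threshold(int limit) { this.limit = limit; }
    public bool IsBelow(int value) { return value < limit; }
}

static class CreateDelegateOverloads
{
    public static readonly MethodInfo IsBelow = typeof(Threshold).GetMethod("IsBelow");

    // Two-argument overload: the target becomes an explicit first parameter.
    public static Func<Threshold, int, bool> CreateOpen()
    {
        return (Func<Threshold, int, bool>)Delegate.CreateDelegate(
            typeof(Func<Threshold, int, bool>), IsBelow);
    }

    // Three-argument overload: the delegate is bound to the given target.
    public static Func<int, bool> CreateClosed(Threshold target)
    {
        return (Func<int, bool>)Delegate.CreateDelegate(
            typeof(Func<int, bool>), target, IsBelow);
    }

    static void Main()
    {
        Console.WriteLine(CreateOpen()(new Threshold(10), 5));   // True
        Console.WriteLine(CreateClosed(new Threshold(10))(50));  // False

        try
        {
            // No target is supplied and Func<int, bool> has no slot for one.
            Delegate.CreateDelegate(typeof(Func<int, bool>), IsBelow);
        }
        catch (ArgumentException)
        {
            Console.WriteLine("binding failed");
        }
    }
}
```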

Jolt.NET Functor Composition

I recently completed the implementation of the generic Compose class, allowing a caller to build composite functors with up to seven parameters.  In this post, I will discuss the implementation and a minor coding nuisance that I encountered while writing the code.  But first, here is a quick recap of the features that were committed since my last post.

  • Jolt.Functional functor creation and manipulation library (see this post, and the docs)
  • Ability to override return type in Jolt.Testing.ProxyTypeBuilder (see this post, and the docs)

Compose Implementation

The Compose class is implemented as a collection of generic static functions: First, Second, Third and Fourth.  Each of these functions binds the execution of a delegate to an argument of another delegate, the position of the argument being denoted by the function’s name. A composite delegate is represented as a new delegate, and its generic implementation takes care of binding arguments and calling the composed delegates at the right time.  Functor composition is very similar to argument binding, except that the bound arguments are now delegates instead of constant values.

The rules and supported scenarios for creating a composite are straightforward:

  • You can bind any System.Func functor F to an argument A of any System.Action or System.Func delegate so long as the return type of F matches the type of A.
  • Only one functor can be bound per Compose method call.
  • Bound functors are executed when the resulting composite is executed, in LIFO order.
    • E.g. Given a binary functor F, nullary functors G and H, and the composite c = Compose.First(Compose.First(F, G), H), executing c() results in the execution of H first, followed by G and F.
  • Creating a composite yielding more than four arguments results in a Jolt.Functional.Action or Jolt.Functional.Func delegate, each of which supports five through seven arguments.
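The LIFO rule is easier to see in code.  The sketch below is not the Jolt.Functional implementation; ComposeSketch declares just the two hypothetical First overloads needed to rebuild the example composite c = First(First(F, G), H) and records the order in which the functors run.

```csharp
using System;
using System.Collections.Generic;

static class ComposeSketch
{
    // Binds a nullary functor to the first argument of a binary functor.
    public static Func<T2, TResult> First<T1, T2, TResult>(
        Func<T1, T2, TResult> function, Func<T1> innerFunction)
    {
        return arg2 => function(innerFunction(), arg2);
    }

    // Binds a nullary functor to the only argument of a unary functor.
    public static Func<TResult> First<T, TResult>(
        Func<T, TResult> function, Func<T> innerFunction)
    {
        return () => function(innerFunction());
    }

    public static List<string> TraceExecutionOrder()
    {
        List<string> trace = new List<string>();
        Func<int, int, int> f = (x, y) => { trace.Add("F"); return x + y; };
        Func<int> g = () => { trace.Add("G"); return 1; };
        Func<int> h = () => { trace.Add("H"); return 2; };

        Func<int> c = First(First(f, g), h);
        c();
        return trace;
    }

    static void Main()
    {
        // prints "H, G, F"
        Console.WriteLine(string.Join(", ", TraceExecutionOrder().ToArray()));
    }
}
```

The functor bound last (H) sits outermost in the composite, so it is the first to execute when the composite is called.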

When writing the methods of Compose, I was primarily concerned with enforcing the rules from the first bulleted item above at compile-time.  Doing so would ensure that the creation of the composite is as fast as possible since the code would not need to perform checks for possible user errors.  Unfortunately, taking this approach meant that I would need to declare every possible permutation of binding a 1-4 argument delegate with a 0-4 argument functor.  There are a total of 100 such combinations (two delegate families, ten argument positions across the 1-4 argument delegates, and five functor arities), and the only difference between any two given functions is generally the number of delegate arguments and the position at which the binding occurs.

To the best of my knowledge, there is no way to use generics to reduce the number of functions that are needed to implement all of the desired combinations.  C# does not allow a generic parameter to be constrained to a System.Delegate type, and if it were possible, there is still no facility to constrain a delegate to one with a given number of arguments.  For this situation, a C++-like variadic template argument or typelist feature is very desirable as it will eliminate the method overloads enabling composition of functors with varying arguments.  To illustrate, one could reduce the number of First method overloads from 40 to 8 using the following variadic generic argument pseudocode.

public static class Compose
{
    // Pseudocode: "params Args" denotes a hypothetical variadic generic argument, which C# does not support.
    public static Func<params Args, TResult>
        First<T, TResult, params Args>(Func<T, TResult> function, Func<params Args, T> innerFunction);
    public static Func<params Args, TResult>
        First<T1, T2, TResult, params Args>(Func<T1, T2, TResult> function, Func<params Args, T1> innerFunction);
    public static Func<params Args, TResult>
        First<T1, T2, T3, TResult, params Args>(Func<T1, T2, T3, TResult> function, Func<params Args, T1> innerFunction);
    public static Func<params Args, TResult>
        First<T1, T2, T3, T4, TResult, params Args>(Func<T1, T2, T3, T4, TResult> function, Func<params Args, T1> innerFunction);

    public static Action<params Args>
        First<T, params Args>(Action<T> function, Func<params Args, T> innerFunction);
    public static Action<params Args>
        First<T1, T2, params Args>(Action<T1, T2> function, Func<params Args, T1> innerFunction);
    public static Action<params Args>
        First<T1, T2, T3, params Args>(Action<T1, T2, T3> function, Func<params Args, T1> innerFunction);
    public static Action<params Args>
        First<T1, T2, T3, T4, params Args>(Action<T1, T2, T3, T4> function, Func<params Args, T1> innerFunction);
}

An alternate implementation technique is to utilize reflection to compose abstract System.Delegate instances.  As mentioned earlier, this solution requires additional code to guard against user errors.  It would also rely heavily on reflection to determine the type of, and create an instance of, the resulting delegate.  The implementation will not perform as well as the many-overloads solution, but has the benefit of producing far less code to maintain.  It also removes the restriction on the delegate types that are usable with the library (System.Action and System.Func), even though this isn’t a bad restriction.

In the end, I chose to avoid the reflection-based approach, favoring runtime speed and a strongly-typed interface over smaller code quantity.  The implementations of the Compose methods and their tests are trivial, and if I had to write such a class again, I would likely implement a code generator to emit all of the desired permutations.  Implementing the code generator may not be such a trivial task, but it will likely take far less time than copying and tweaking code, and making sure that the XML doc comments for the new code are correctly specified.
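To give an idea of what such a generator might look like, here is a minimal sketch that emits only the declarations of the 40 First overloads.  Everything in it is an assumption made for illustration: the ComposeOverloadGenerator name, the A1..An naming of the inner functor's arguments, and the choice to place those arguments ahead of the unbound outer arguments in the composite.  It also ignores the switch to the Jolt.Functional delegates for composites with more than four arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ComposeOverloadGenerator
{
    // Formats "Name<A, B>" or just "Name" when there are no type arguments.
    static string Generic(string name, IEnumerable<string> typeArgs)
    {
        string[] args = typeArgs.ToArray();
        return args.Length == 0 ? name : name + "<" + string.Join(", ", args) + ">";
    }

    static string Signature(bool returnsValue, int outerArity, int innerArity)
    {
        List<string> outerArgs = Enumerable.Range(1, outerArity).Select(i => "T" + i).ToList();
        List<string> innerArgs = Enumerable.Range(1, innerArity).Select(i => "A" + i).ToList();

        // Assumed order: inner functor arguments, then the unbound outer arguments.
        List<string> compositeArgs = innerArgs.Concat(outerArgs.Skip(1)).ToList();
        List<string> outerTypeArgs = new List<string>(outerArgs);
        if (returnsValue)
        {
            outerTypeArgs.Add("TResult");
            compositeArgs.Add("TResult");
        }

        // The inner functor always returns the type of the bound (first) argument.
        List<string> innerTypeArgs = new List<string>(innerArgs);
        innerTypeArgs.Add("T1");

        string family = returnsValue ? "Func" : "Action";
        return string.Format(
            "public static {0} First<{1}>({2} function, {3} innerFunction);",
            Generic(family, compositeArgs),
            string.Join(", ", outerTypeArgs.Concat(innerArgs).ToArray()),
            Generic(family, outerTypeArgs),
            Generic("Func", innerTypeArgs));
    }

    public static IEnumerable<string> FirstSignatures()
    {
        foreach (bool returnsValue in new[] { true, false })
            for (int outer = 1; outer <= 4; ++outer)
                for (int inner = 0; inner <= 4; ++inner)
                    yield return Signature(returnsValue, outer, inner);
    }

    static void Main()
    {
        foreach (string signature in FirstSignatures())
            Console.WriteLine(signature);
    }
}
```

A real generator would also need to emit the method bodies, the XML doc comments, and the Second, Third and Fourth variants, but the looping structure stays the same.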

Finally, I should note that the reflection-based approach has the merit of enabling composition/binding support for abstract System.Delegate instances.  We can utilize a method similar to the following to convert a System.Delegate to its equivalent System.Action or System.Func counterpart.  The resulting delegate may then be used with the Jolt.Functional classes.


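using System;

public static class Convert
{
    public static Action ToAction(Delegate function)
    {
        // TODO: validate input and cast.
        // Pass the target along so that instance methods bind as well as static ones.
        return Delegate.CreateDelegate(typeof(Action), function.Target, function.Method) as Action;
    }
}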

Anonymous Methods and Jolt Functors: Behind the Scenes

Good day all!  I haven’t written a meaningful article for some time now as I’ve been fairly occupied with work tasks during the past few weeks.  I’ve been dabbling with an implementation of the Jolt.Functional library, which includes some C++-like functor manipulators (e.g. std::bind, functor composition, among others), and I thought it would be diligent to describe the motivation behind this task by showing what anonymous methods and lambda expressions really look like in MSIL.

For the purposes of this article, I will be analyzing the output of the C# 3.0 compiler.  Your .NET compiler may output something different, but at the time of writing this I suspect that the output will be similar as there are no native MSIL constructs to represent an anonymous method (and as you may see below, in all likelihood there is no reason to support such constructs).

MSIL Representation of Anonymous Methods

The C# 3.0 anonymous method syntax provides a convenient way for inlining method declarations within the body of other methods.  To declare an anonymous method, you can use either the delegate keyword or the less cumbersome lambda expression syntax.  The latter case is preferred as the compiler can generally infer the types of the method arguments from the usage of the expression.  Here is an example using both syntaxes to declare a predicate that determines if a given integer is equal to 100.



using System;

class FunctorExploration
{
    void CreateFunctors()
    {
        Predicate<int> isOneHundred = value => value == 100;
        Predicate<int> is100 = delegate(int value) { return value == 100; };
    }
}

Both of the given syntaxes present the illusion that an inline method is created with local method scope.  However, since MSIL has no notion of an anonymous or inline method declaration, the compiler’s transformation of this source results in MSIL that is semantically different.  As a software developer, it is important to be aware of these differences as they may affect expected performance.  Here is the MSIL representation of the CreateFunctors() method, with only one of the predicate declarations.



.method private hidebysig instance void CreateFunctors() cil managed
{
.maxstack 3
.locals init (
[0] class [mscorlib]System.Predicate`1<int32> isOneHundred)
L_0000: nop
L_0001: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate1
L_0006: brtrue.s L_001b
L_0008: ldnull
L_0009: ldftn bool FunctorExploration.FunctorExploration::<CreateFunctors>b__0(int32)
L_000f: newobj instance void [mscorlib]System.Predicate`1<int32>::.ctor(object, native int)
L_0014: stsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate1
L_0019: br.s L_001b
L_001b: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate1
L_0020: stloc.0
L_0021: ret
}

This function first attempts to load the value of the static field CS$<>9__CachedAnonymousMethodDelegate1 on to the evaluation stack.  If the value is null, then the static field is initialized to a delegate that refers to our isOneHundred predicate function.  The method referred to by the delegate is given as FunctorExploration::<CreateFunctors>b__0(int32), which is a compiler-generated method that contains our predicate implementation.  Once the cache look-up is completed, the value of the static field is assigned to the local variable isOneHundred.  Here is an abbreviated version of the MSIL that declares the FunctorExploration class.



.class private auto ansi beforefieldinit FunctorExploration
extends [mscorlib]System.Object
{
.method private hidebysig static bool <CreateFunctors>b__0(int32 'value') cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: ldarg.0
L_0001: ldc.i4.s 100
L_0003: ceq
L_0005: stloc.0
L_0006: br.s L_0008
L_0008: ldloc.0
L_0009: ret
}

.method private hidebysig instance void CreateFunctors() cil managed { }

.field private static class [mscorlib]System.Predicate`1<int32> CS$<>9__CachedAnonymousMethodDelegate1
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
}
}

The purpose of the cached delegate is to make sure that there is only ever one instance of it in your code.  Whenever you pass the delegate around, you will always refer to the instance held in the static memory of the class.  Furthermore, there is no way to access this instance without using runtime reflection; the compiler and IDE hide all of the generated code from you.
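For the curious, the generated entities can be listed with a few lines of reflection.  This is a sketch rather than a robust tool: GeneratedMemberInspection is a name I made up, and the members it reports (and their names) depend entirely on which compiler and version built the assembly, so no output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;

class FunctorExploration
{
    public Predicate<int> CreateFunctor()
    {
        return value => value == 100;
    }
}

static class GeneratedMemberInspection
{
    const BindingFlags AllDeclared =
        BindingFlags.Public | BindingFlags.NonPublic |
        BindingFlags.Static | BindingFlags.Instance | BindingFlags.DeclaredOnly;

    // Lists members marked [CompilerGenerated], descending into nested types.
    public static List<string> Describe(Type type)
    {
        List<string> result = new List<string>();
        foreach (MemberInfo member in type.GetMembers(AllDeclared))
        {
            if (member.IsDefined(typeof(CompilerGeneratedAttribute), false))
            {
                result.Add(member.MemberType + " " + member.Name);
            }

            Type nested = member as Type;
            if (nested != null)
            {
                result.AddRange(Describe(nested));
            }
        }
        return result;
    }

    static void Main()
    {
        foreach (string line in Describe(typeof(FunctorExploration)))
        {
            Console.WriteLine(line);
        }
    }
}
```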

What happens if a method contains two or more anonymous method declarations?  If you surmised that the generated code depicted above is appropriately emitted for each declaration, then you are correct.  What is interesting is that if two anonymous methods are functionally equivalent (as in the first example), then the compiler will duplicate the method implementation, static field, and code to initialize the static field.  Here is an abbreviated version of the MSIL that declares the FunctorExploration class from the initial example containing two predicates.



.class private auto ansi beforefieldinit FunctorExploration
extends [mscorlib]System.Object
{
.method private hidebysig static bool <CreateFunctors>b__0(int32 'value') cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: ldarg.0
L_0001: ldc.i4.s 100
L_0003: ceq
L_0005: stloc.0
L_0006: br.s L_0008
L_0008: ldloc.0
L_0009: ret
}

.method private hidebysig static bool <CreateFunctors>b__1(int32 'value') cil managed
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: nop
L_0001: ldarg.0
L_0002: ldc.i4.s 100
L_0004: ceq
L_0006: stloc.0
L_0007: br.s L_0009
L_0009: ldloc.0
L_000a: ret
}

.method private hidebysig instance void CreateFunctors() cil managed
{
.maxstack 3
.locals init (
[0] class [mscorlib]System.Predicate`1<int32> isOneHundred,
[1] class [mscorlib]System.Predicate`1<int32> is100)
L_0000: nop
L_0001: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate2
L_0006: brtrue.s L_001b
L_0008: ldnull
L_0009: ldftn bool FunctorExploration.FunctorExploration::<CreateFunctors>b__0(int32)
L_000f: newobj instance void [mscorlib]System.Predicate`1<int32>::.ctor(object, native int)
L_0014: stsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate2
L_0019: br.s L_001b
L_001b: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate2
L_0020: stloc.0
L_0021: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate3
L_0026: brtrue.s L_003b
L_0028: ldnull
L_0029: ldftn bool FunctorExploration.FunctorExploration::<CreateFunctors>b__1(int32)
L_002f: newobj instance void [mscorlib]System.Predicate`1<int32>::.ctor(object, native int)
L_0034: stsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate3
L_0039: br.s L_003b
L_003b: ldsfld class [mscorlib]System.Predicate`1<int32> FunctorExploration.FunctorExploration::CS$<>9__CachedAnonymousMethodDelegate3
L_0040: stloc.1
L_0041: ret
}

.field private static class [mscorlib]System.Predicate`1<int32> CS$<>9__CachedAnonymousMethodDelegate2
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
}

.field private static class [mscorlib]System.Predicate`1<int32> CS$<>9__CachedAnonymousMethodDelegate3
{
.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()
}
}

Since anonymous methods are local to a function at design time, they may also access other local variables that share the same scope.  Let us generalize our functor to test for equality as follows.



using System;

class FunctorExploration
{
    void CreateEqualityPredicate<TValue>(TValue predicateConstant)
    {
        Predicate<TValue> equals = value => value.Equals(predicateConstant);
    }
}

When the compiler encounters this type of anonymous method declaration, it emits code that is substantially different from the previously shown static method cases.  Here is the MSIL representation of the CreateEqualityPredicate() method.



.method private hidebysig instance void CreateEqualityPredicate<TValue>(!!TValue predicateConstant) cil managed
{
.maxstack 3
.locals init (
[0] class [mscorlib]System.Predicate`1<!!TValue> equals,
[1] class FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue> CS$<>8__locals2)
L_0000: newobj instance void FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue>::.ctor()
L_0005: stloc.1
L_0006: ldloc.1
L_0007: ldarg.1
L_0008: stfld !0 FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue>::predicateConstant
L_000d: nop
L_000e: ldloc.1
L_000f: ldftn instance bool FunctorExploration.FunctorExploration/<>c__DisplayClass1<!!TValue>::<CreateEqualityPredicate>b__0(!0)
L_0015: newobj instance void [mscorlib]System.Predicate`1<!!TValue>::.ctor(object, native int)
L_001a: stloc.0
L_001b: nop
L_001c: ret
}

This method first instantiates the compiler generated type FunctorExploration/<>c__DisplayClass1<!!TValue> (a nested type), and then proceeds to initialize a field on this type instance with the value of the local variable of interest.  The equals predicate is then initialized from a compiler generated method given as <>c__DisplayClass1<!!TValue>::<CreateEqualityPredicate>b__0(!TValue 'value').  Here is an abbreviated version of the MSIL that represents the compiler-generated type.



.class auto ansi sealed nested private beforefieldinit <>c__DisplayClass1<TValue>
extends [mscorlib]System.Object
{
.method public hidebysig instance bool <CreateEqualityPredicate>b__0(!TValue 'value') cil managed
{
.maxstack 2
.locals init (
[0] bool CS$1$0000)
L_0000: ldarga.s 'value'
L_0002: ldarg.0
L_0003: ldfld !0 FunctorExploration.FunctorExploration/<>c__DisplayClass1<!TValue>::predicateConstant
L_0008: box !TValue
L_000d: constrained !TValue
L_0013: callvirt instance bool [mscorlib]System.Object::Equals(object)
L_0018: stloc.0
L_0019: br.s L_001b
L_001b: ldloc.0
L_001c: ret
}

.field public !TValue predicateConstant
}

The new type is created to give the anonymous method implementation its own reference to the local variable of interest.  Whenever CreateEqualityPredicate() is invoked, a new instance of the type is created and initialized with whatever value is given as a parameter to our function.  The delegate can no longer be static since we are dealing with instance data.
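The difference between the cached and the per-call delegates can also be observed from C# without reading any MSIL.  The sketch below uses hypothetical factory methods that return the delegates from the earlier examples.  The first comparison relies on the compiler's caching, which is an implementation detail, so I have not listed an output for it.

```csharp
using System;

static class DelegateInstanceExploration
{
    // Captures nothing, so the compiler is free to cache the delegate.
    public static Predicate<int> CreateIsOneHundred()
    {
        return value => value == 100;
    }

    // Captures predicateConstant, so each call needs its own closure object.
    public static Predicate<TValue> CreateEqualityPredicate<TValue>(TValue predicateConstant)
    {
        return value => value.Equals(predicateConstant);
    }

    static void Main()
    {
        // Same instance on both calls when the compiler caches the delegate.
        Console.WriteLine(object.ReferenceEquals(CreateIsOneHundred(), CreateIsOneHundred()));

        Predicate<int> a = CreateEqualityPredicate(100);
        Predicate<int> b = CreateEqualityPredicate(100);

        Console.WriteLine(object.ReferenceEquals(a, b));               // False
        Console.WriteLine(object.ReferenceEquals(a.Target, b.Target)); // False
        Console.WriteLine(a.Method.Equals(b.Method));                  // True
    }
}
```

The two capturing delegates share one generated method but are bound to different instances of the generated type, which matches what the MSIL above shows.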

What happens when we declare multiple delegates in this fashion?  As in the corresponding example, a new type will be created for each anonymous method that uses a local variable from an accessible sibling scope.  Again, code duplication arises if there are two anonymous method declarations that are equivalent in functionality.  Also, as mentioned before, there is no way to access the generated entities at design time without using runtime reflection.

Jolt.Functional Motivation

The introduction of Linq in the .NET Framework 3.5 has made anonymous methods an important language feature.  Without them, it is very tedious to define predicates in your code using function declaration syntax, then refer to them in your Linq filter criterion.  However, the advantage of taking such an approach is that one is more likely to notice patterns in the predicate code and thus have the opportunity to refactor for code reuse.  This is not possible with compiler-generated code, and each time you introduce an anonymous method, you increase the amount of code duplication that may occur.

The Jolt.Functional library aims to minimize the amount of compiler-generated code by promoting functor reuse through the use of generic delegate declarations for commonly-used delegates, along with generic delegate factory methods for manipulating existing functions.  Here are some examples of these types of constructs.


  • Creation of idempotent functions (returning a constant value for any input)
  • Predefined idempotent functions (true for all, false for all)
  • Parameter binding and function composition
  • Predicate<T> adaptor for Func<T, bool> (and vice versa)
  • Action<T> adaptor for Func<T, TResult>

To illustrate the benefit of such constructs, consider the generic CreateEqualityPredicate() method from a previous example.  This method creates a predicate for each invocation.  However, since its implementation is generic, the compiler has only generated one implementation of the predicate method and supporting types.  Compare this with inlining many lambda expressions that test for various types of equality.  The functionality of the code is the same, but the MSIL and assembly size have grown dramatically!
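Here is a sketch of that reuse, assuming a factory method shaped like the CreateEqualityPredicate() example.  GenericPredicateReuse and ShareGeneratedDefinition are hypothetical names; the last line prints whether both predicates come from the same generic compiler-generated type, which I expect but which ultimately depends on the compiler, so its output is not listed.

```csharp
using System;

static class GenericPredicateReuse
{
    public static Predicate<TValue> CreateEqualityPredicate<TValue>(TValue predicateConstant)
    {
        return value => value.Equals(predicateConstant);
    }

    // True when both delegates' methods are declared on the same generic type definition.
    public static bool ShareGeneratedDefinition(Delegate first, Delegate second)
    {
        Type a = first.Method.DeclaringType;
        Type b = second.Method.DeclaringType;
        return a != null && b != null
            && a.IsGenericType && b.IsGenericType
            && a.GetGenericTypeDefinition() == b.GetGenericTypeDefinition();
    }

    static void Main()
    {
        // One method in source code serves predicates over different types.
        Predicate<int> is100 = CreateEqualityPredicate(100);
        Predicate<string> isJolt = CreateEqualityPredicate("Jolt");

        Console.WriteLine(is100(100));      // True
        Console.WriteLine(isJolt("Bolt"));  // False
        Console.WriteLine(ShareGeneratedDefinition(is100, isJolt));
    }
}
```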

For those concerned about assembly code bloat from anonymous methods, the Jolt.Functional library will help by reducing the amount of duplicate compiler-generated code in your program.  Also, if your language does not support anonymous methods, the Jolt.Functional library can help you organize your method declarations so that they may be combined and reused to form more complicated expressions.

Wiki Documentation Formatting

Please bear with the poor formatting in the wiki documentation for Jolt.NET.  Sometime within the past few weeks, the CodePlex wiki parser was changed and consequently any inline formatting is being incorrectly parsed.  This is causing certain style formats to run through to the end of the document, and other inline code formatting to be applied on a new line.

These changes are out of my control, and hopefully they will be resolved very quickly!

Source Code Reorganization – Part One

I just completed the first batch of source code reorganization on the Jolt 0.3 work item list and committed it to revision #18919.  All code pertaining to the implementation and support of the FiniteStateMachine (FSM) class is now in the Jolt.Automata assembly, and similarly, all FSM-related test code has moved to the Jolt.Automata.Test assembly.  The following notes summarize all the breaking changes that were introduced as part of moving the source code.

  • All FSM-related types that were previously in the Jolt namespace are now in the Jolt.Automata namespace
  • No types have been renamed

Please refer to “Jolt.NET Restructuring and Future Features” for an overview of the source-code reorganization tasks.

Jolt.NET 0.2 Release

Good day!

This morning, I committed source revision #18896 containing the final feature work for the Jolt.NET 0.2 release.  This work was completed last week, but I didn't get a chance to upload the changes as I was in a rush to catch a flight for a trip.  During my travels, I had little time to access the internet and prepare the release documentation, so I had to wait until my return home.  Sorry for the delay!

Jolt.NET 0.2 contains the following new features:

There are also some other maintenance-related changes included in the release, all of which can be viewed on the work item page.  Please visit the release page for download options.

Tomorrow, I will be creating the initial set of work items for the Jolt.NET 0.3 release, most of which were discussed in my previous post.

Jolt.NET Restructuring and Future Features

During my regular day-job and while I work on the Jolt.NET library, I regularly take notes as to what to include in future library releases.  Sometimes, I post the notes immediately to the project site in the form of a work item.  In this case, the work item will usually contain information about a required change in design of an existing feature, or some planned analysis that assures intended feature functionality.  When I don’t post my notes, it is usually because they refer to an incomplete feature idea (i.e. how the feature should work), and I don’t post them so that I can keep focused on working to complete the currently planned release.

This post will cover some of the upcoming new features for Jolt.NET as well as some needed library reorganization.  You can expect future posts with this theme as the active implementation of a pending release nears completion.

New Assemblies

The Jolt assembly is designed to be the analog of the mscorlib.dll assembly from the .NET Framework.  It contains the commonly used types that generally don’t fit anywhere else in the library, and also serves as a core assembly to be used by others within the library.  For simplicity and ease of use, I’d like Jolt.dll to be free of both internal and external dependencies, for violating this guideline means that any assembly linking to Jolt.dll incurs extra baggage which it may or may not use.  Consequently, the FiniteStateMachine (FSM) class (and its helpers) must be moved to a new assembly as they incur a dependency on QuickGraph.

A similar approach must also be taken when implementing the features that enable translation of an FSM to MSAGL and GLEE graphs since the MSAGL and GLEE assemblies are generally not redistributable.  QuickGraph provides separate assemblies that enable use of these frameworks, avoiding runtime errors in the core library when the frameworks are not present.  I plan to adopt a similar approach for the new Jolt FSM assembly.  Also, you will note that the work items for the FSM translation features have been moved out of the Jolt 0.2 release as I want to perform all of the reorganization at the same time.

Functors

This is a task I’ve been eager to take on for some time now, and think it will add great value to the Jolt.NET library.  The goal is to provide predefined functors that eliminate the need for redefining the same kinds of anonymous delegate (or lambda expressions) over and over.  Redefinition, albeit very compact in syntax, often leads to code bloat in the assembly in the form of non-compact IL.  To illustrate, consider the following code that uses two identical lambda expressions to represent a predicate.



using System;

void CreateFunctors()
{
    Predicate<int> isEven = x => x % 2 == 0;
    Func<int,bool> isNotOdd = x => x % 2 == 0;
}

The resulting IL for this function will yield two additional methods, both identical in functionality and representing each lambda expression.  If you use an optimizing IL compiler, you may be in luck as the compiler might generate the desired compact version of the code.  I’ll discuss this problem further in a future blog post.

Which functors will help reduce code bloat?  I’ve noted the following, which are used extensively in the Jolt libraries, and can be represented as either constructed or open constructed forms of the Action<> and Func<> delegate variants.

  • Idempotency function: f(x) = c, for all x and a predefined constant c
    • Example: f(x) = true, for all x
  • Identity function: f(x) = x, for all x
  • No-op function: f(x) = void, for all x
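As a rough sketch of what these predefined functors could look like (this is not the eventual Jolt.NET API, and FunctorSketch and its member names are placeholders), each one is a generic static method, so a single implementation serves every constructed delegate type.

```csharp
using System;

static class FunctorSketch
{
    // Idempotency function: f(x) = c for all x.
    public static Func<T, TResult> Idempotency<T, TResult>(TResult constant)
    {
        return x => constant;
    }

    // Predefined idempotency function: f(x) = true for all x.
    public static bool TrueForAll<T>(T x) { return true; }

    // Identity function: f(x) = x for all x.
    public static T Identity<T>(T x) { return x; }

    // No-op function: accepts x and does nothing.
    public static void NoOp<T>(T x) { }

    static void Main()
    {
        Func<string, int> alwaysSeven = Idempotency<string, int>(7);
        Predicate<int> acceptAll = TrueForAll<int>;
        Func<int, int> same = Identity<int>;
        Action<string> ignore = NoOp<string>;

        Console.WriteLine(alwaysSeven("anything")); // prints 7
        Console.WriteLine(acceptAll(-1));           // prints True
        Console.WriteLine(same(42));                // prints 42
        ignore("nothing happens");
    }
}
```

Because TrueForAll, Identity and NoOp are ordinary static methods, delegates created from them refer to methods that exist in the assembly, unlike the closure behind Idempotency.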

In addition to these predefined functors, the following utilities will assist reusing existing delegates to create new ones.

  • Adaptor for Func<>: ignores return value making it compatible with Action<>
  • Functor composition: given functions f(x), and g(y), compose a new function h(y) = f(g(y))
  • Parameter binding: given function f(x, y), bind the first parameter of the function to a constant c, creating a new function g(y) = f(c, y)
    • Similarly for the second parameter
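The three utilities above translate almost directly into generic methods.  The following is a minimal sketch with hypothetical names (FunctorUtilitySketch, IgnoreResult, Compose, BindFirst, BindSecond), intended only to pin down the semantics described in the list.

```csharp
using System;

static class FunctorUtilitySketch
{
    // Adaptor: discards the return value so a Func can be used as an Action.
    public static Action<T> IgnoreResult<T, TResult>(Func<T, TResult> function)
    {
        return arg => { function(arg); };
    }

    // Composition: h(y) = f(g(y)).
    public static Func<TY, TResult> Compose<TX, TY, TResult>(
        Func<TX, TResult> f, Func<TY, TX> g)
    {
        return y => f(g(y));
    }

    // Parameter binding: g(y) = f(c, y).
    public static Func<T2, TResult> BindFirst<T1, T2, TResult>(
        Func<T1, T2, TResult> f, T1 c)
    {
        return y => f(c, y);
    }

    // Parameter binding: g(x) = f(x, c).
    public static Func<T1, TResult> BindSecond<T1, T2, TResult>(
        Func<T1, T2, TResult> f, T2 c)
    {
        return x => f(x, c);
    }

    static void Main()
    {
        Func<int, int> square = x => x * x;
        Func<string, int> length = s => s.Length;
        Func<int, int, int> subtract = (x, y) => x - y;

        Console.WriteLine(Compose(square, length)("Jolt")); // prints 16
        Console.WriteLine(BindFirst(subtract, 10)(3));      // prints 7
        Console.WriteLine(BindSecond(subtract, 10)(3));     // prints -7

        Action<int> fireAndForget = IgnoreResult(square);
        fireAndForget(5);
    }
}
```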

Tuples

The goal of a tuple is to provide for a light-weight read/write generic container that defines only fields (i.e. no methods).  Instead of explicitly defining a new type each time you need to store a collection of fields, a tuple allows you to specialize it with the desired field types, creating the desired container.  For example, Tuple<int, string> is a type containing two fields of types int and string, respectively.

For those familiar with anonymous types, tuples may seem redundant.  However, there are some important differences between an anonymous type and a tuple.

  • The type of the tuple is known at design time
  • The fields of a tuple are writable
  • A tuple implements value-based equality semantics
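To make these properties concrete, here is a sketch of a two-field tuple.  It is not the planned Jolt.NET type: Pair, First and Second are names chosen for illustration, and the only methods it defines are the ones required for value-based equality.

```csharp
using System;
using System.Collections.Generic;

sealed class Pair<T1, T2> : IEquatable<Pair<T1, T2>>
{
    // Writable fields; the type is named and known at design time.
    public T1 First;
    public T2 Second;

    public Pair(T1 first, T2 second)
    {
        First = first;
        Second = second;
    }

    // Two pairs are equal when their corresponding fields are equal.
    public bool Equals(Pair<T1, T2> other)
    {
        return !object.ReferenceEquals(other, null)
            && EqualityComparer<T1>.Default.Equals(First, other.First)
            && EqualityComparer<T2>.Default.Equals(Second, other.Second);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Pair<T1, T2>);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return EqualityComparer<T1>.Default.GetHashCode(First) * 31
                 + EqualityComparer<T2>.Default.GetHashCode(Second);
        }
    }
}

static class PairDemo
{
    static void Main()
    {
        Pair<int, string> a = new Pair<int, string>(1, "one");
        Pair<int, string> b = new Pair<int, string>(1, "one");

        Console.WriteLine(a.Equals(b)); // True
        a.Second = "uno";
        Console.WriteLine(a.Equals(b)); // False
    }
}
```

One caveat of combining writable fields with value-based hashing is that a pair should not be modified while it is used as a key in a hash-based collection.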

The tuple implementation will be adapted from the Boost Tuple Library.

Xml Doc Comment Transformations

The Jolt.Testing library now supports creating XML doc comments for a generated proxy and interface type, as well as aggregating such XML for a generated assembly.  The code is made available by revision 18785, and usage examples are posted in the library documentation.

The implementation of this feature utilizes the XmlDocCommentReader class detailed in my previous post, and also provides configuration to turn the feature on and off.  When creating the XML doc comments for the new types, a simple transformation is performed on the existing XML, replacing the type name of the real subject type with that of the proxy or interface.  For simplicity, all other XML data  participating in the transformation is left untouched.  The following snippet depicts this process.

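
<doc>
  <assembly>
    <name>mscorlib</name>
  </assembly>
  <members>
    <member name="T:System.IO.File">
      <summary>Provides static methods for the creation, copying, deletion, moving, and opening of files, and aids in the creation of <see cref="T:System.IO.FileStream" /> objects.</summary>
      <filterpriority>1</filterpriority>
    </member>
  </members>
</doc>

<doc>
  <assembly>
    <name>Jolt.GeneratedCode</name>
  </assembly>
  <members>
    <member name="T:[generated proxy type]">
      <summary>Provides static methods for the creation, copying, deletion, moving, and opening of files, and aids in the creation of <see cref="T:System.IO.FileStream" /> objects.</summary>
      <filterpriority>1</filterpriority>
    </member>
    <member name="T:[generated interface type]">
      <summary>Provides static methods for the creation, copying, deletion, moving, and opening of files, and aids in the creation of <see cref="T:System.IO.FileStream" /> objects.</summary>
      <filterpriority>1</filterpriority>
    </member>
  </members>
</doc>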

Note that the resulting documentation may contain text that is specific to the real subject type, and not very applicable to the proxy or interface types.  This is a minor inconvenience and I believe that many developers will simply ignore it, if they even notice it at all.  The big win in implementing this feature is getting parameter information for proxy and interface functions, and Intellisense in general within an IDE.
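The transformation itself can be sketched with LINQ to XML.  This is not the Jolt.Testing implementation: DocCommentTransformSketch, Retarget and ReplaceTypeName are hypothetical names, and Sample.IFile is a made-up target type.  The sketch rewrites only the name attribute of each member element and leaves all other XML untouched.

```csharp
using System;
using System.Xml.Linq;

static class DocCommentTransformSketch
{
    // Rewrites "K:Real.Type[.Member]" to "K:New.Type[.Member]", where K is the
    // one-letter member kind.  Names of other types, including those that merely
    // start with the same text (e.g. System.IO.FileInfo), are returned unchanged.
    public static string ReplaceTypeName(string memberName, string realTypeName, string newTypeName)
    {
        if (memberName.Length < 2 || memberName[1] != ':') return memberName;

        string rest = memberName.Substring(2);
        if (!rest.StartsWith(realTypeName, StringComparison.Ordinal)) return memberName;

        string tail = rest.Substring(realTypeName.Length);
        bool atTypeBoundary = tail.Length == 0 || tail[0] == '.';
        return atTypeBoundary
            ? memberName.Substring(0, 2) + newTypeName + tail
            : memberName;
    }

    // Returns a copy of the doc comments with member names retargeted.
    public static XElement Retarget(XElement doc, string realTypeName, string newTypeName)
    {
        XElement result = new XElement(doc);
        foreach (XElement member in result.Descendants("member"))
        {
            XAttribute name = member.Attribute("name");
            if (name != null)
            {
                name.Value = ReplaceTypeName(name.Value, realTypeName, newTypeName);
            }
        }
        return result;
    }

    static void Main()
    {
        XElement doc = XElement.Parse(
            "<doc><members>" +
            "<member name=\"T:System.IO.File\"><summary>Works with files.</summary></member>" +
            "<member name=\"T:System.IO.FileInfo\"><summary>Describes a file.</summary></member>" +
            "</members></doc>");

        Console.WriteLine(Retarget(doc, "System.IO.File", "Sample.IFile"));
    }
}
```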

ProxyTypeBuilder Usability Issue

While working on implementing utilities to parse and read an XML doc comment file, I came across the need to abstract the file system for the purpose of testing.  This was a perfect time to use the Jolt.Testing library and generate an assembly with a proxy and interface to the static System.IO.File class!  My intent was to write the following code.

using Rhino.Mocks;
using Jolt.Testing.Generated.System.IO;
using System.IO;

[Test]
void FileInteractions()
{
    With.Mocks(delegate
    {
        IFile fileProxy = Mocker.Current.CreateMock<IFile>();

        Stream fileContents = new MemoryStream(/* stream data */);
        Expect.Call(fileProxy.Open("filename")).Return(fileContents);
        Mocker.Current.ReplayAll();

        FileReader reader = new FileReader(fileProxy);
        reader.DoWork("filename");
    });
}

This is the example I use in my documentation (creating the IFile interface), and was the prime motivation for creating the Jolt.Testing library.  In this test, the file system is never accessed and the IO operation (Open) happens in memory.  However, there is a fundamental problem with this code – it won’t compile!  Can you see why?

When I saw the compilation error, my heart sank.  The Open method returns a FileStream object, not a Stream object (the downcast is not guaranteed to work)!  All of a sudden I realized that generating an abstraction to the System.IO.File class accomplished very little since using the Open method still requires that you use the file system.  Sure, I could use the OpenText method which returns a StreamReader that is wired to my memory stream (and I ultimately did for this test), but there is still a fundamental problem with the usability of the ProxyTypeBuilder outputs:

A generated interface contains method signatures with concrete types that are intended to be abstracted by the interface.

If you think about it, you would never design a base class of a hierarchy and make references to the concrete types within the base class.  Doing so defeats the purpose of creating the base class altogether!  The same rationale applies to an interface that generalizes some kind of functionality.  Recall the wisdom from the GoF: when possible, code against an abstraction, not a concrete implementation.

At this point, I thought my implementation was all for naught.  When the shock subsided, I realized that there is a flexible solution to the problem, made easy to implement by the modularization and design of the ProxyTypeBuilder class.  The idea behind the fix is as follows.  For the ProxyTypeBuilder methods AddMethod and AddProperty (entities that return a value), I will need to add an overload that accepts an override type for the return value.  The functions will validate that the given type is indeed a base type of the overridden type prior to generating any code.  When the generated code is executed and the real subject type returns its object, the proxy will return a reference to the object’s base type, which hides the concrete type, and everything works as expected.
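A minimal sketch of the validation step and of the kind of code a return type override should produce.  None of this is the actual ProxyTypeBuilder API: ResolveReturnType is a hypothetical helper, and the IFile and FileProxy types below are hand-written stand-ins for the generated types.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class ReturnTypeOverrideSketch
{
    // Hypothetical helper: accepts the override only when every value the
    // real method can return is also an instance of the override type.
    public static Type ResolveReturnType(MethodInfo method, Type overrideType)
    {
        if (method == null) throw new ArgumentNullException("method");
        if (overrideType == null) throw new ArgumentNullException("overrideType");

        if (!overrideType.IsAssignableFrom(method.ReturnType))
        {
            throw new ArgumentException(
                overrideType + " is not a base type of " + method.ReturnType, "overrideType");
        }

        return overrideType;
    }
}

// Hand-written stand-in for the generated interface: Open is declared with
// the abstract Stream type instead of the concrete FileStream type.
public interface IFile
{
    Stream Open(string path, FileMode mode);
}

// Hand-written stand-in for the generated proxy: the FileStream returned by
// the real subject type is handed back through a Stream reference.
public sealed class FileProxy : IFile
{
    public Stream Open(string path, FileMode mode)
    {
        return File.Open(path, mode);
    }
}
```

With an interface shaped like this, a mock of IFile.Open may return a MemoryStream, which is what the test above needs.  Note that the check also accepts an override from DateTime to ValueType, which is one reason the override should stay an explicit choice of the caller.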

The pros to this solution:


  • Option to override is flexible, avoiding the incorrect auto-conversion approach.
  • It makes sense to change the return value of FileStream to Stream, but it doesn’t make sense to change DateTime to ValueType.
  • Solution extends easily to XML configuration of ProxyAssemblyBuilder.

The cons to this solution:


  • XML documentation for the return value may be incorrect; it may still refer to a concrete type after overriding.
  • Generated assemblies for the same type are generally incompatible; you can’t swap one with another as the signature of some methods may be different.
  • Developers may have needed a real reference to the concrete type, and so a downcast is required.
  • Can’t apply the same idea to method parameters since functions called from the parameters may not exist in the base type.

Apart from the parameter issue that requires investigation, my feeling is that the cons to the solution are negligible.  In any case, do let me know what you think.  I plan to start work on this feature after the Jolt 0.2 release.

More on XML Documentation Comments

I recently posted about some of the challenges involved with implementing an algorithm that will locate an XML doc comment file, given a reference to a System.Reflection.Assembly object.  Today, I would like to discuss how this feature will be used and exposed in the Jolt library.

Initially, I had planned to keep the XML doc comment parsing internal and expose it as a feature of the ProxyTypeBuilder class.  As described in work item #95, copying existing XML doc comments to those of the generated interface and proxy type will provide richer Intellisense.  However, while designing the classes that implement this feature, I noticed that they would be better suited as public types as they may be used to solve many XML doc comment parsing tasks.  Consequently, the following use cases will be supported.

  • Inferring the location of an XML doc comment file from a reference to a System.Reflection.Assembly object.
  • Obtaining the XML doc comments for a given metadata type (e.g. System.Type, System.Reflection.MethodInfo, System.Reflection.ConstructorInfo, etc.).
  • Converting a given metadata type into its XML doc comment string representation.

The code example in the previous post on this topic gives the syntax example for the first two bullets.  The conversion task allows one to take a MethodInfo object for a method such as void Namespace.MyType<T>.MyMethod<U>(out T t, U[] u) and turn it into the string “M:Namespace.MyType`1.MyMethod``1(`0@,``0[])”, which is the key into the XML doc comment data for the corresponding method.
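To give a feel for the conversion, here is a rough sketch that builds the key for a method using the member ID rules applied by the C# compiler.  It is not the Jolt implementation: DocKeySketch, EncodeType, and MethodKey are hypothetical names, and the sketch deliberately ignores constructed generic types, multi-dimensional arrays, constructors, and conversion operators.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Text;

// Sample type used only to exercise the sketch.
public class Sample<T>
{
    public void MyMethod<U>(out T t, U[] u) { t = default(T); }
}

public static class DocKeySketch
{
    // Encodes a parameter type as it appears inside the parentheses of a key.
    private static string EncodeType(Type type)
    {
        if (type.IsByRef) return EncodeType(type.GetElementType()) + "@";   // ref/out
        if (type.IsPointer) return EncodeType(type.GetElementType()) + "*";
        if (type.IsArray) return EncodeType(type.GetElementType()) + "[]";  // rank 1 only
        if (type.IsGenericParameter)
        {
            // `N refers to a type parameter of the class, ``N to one of the method.
            return (type.DeclaringMethod != null ? "``" : "`") + type.GenericParameterPosition;
        }

        return type.FullName.Replace('+', '.');   // nested types use '.' in a key
    }

    public static string MethodKey(MethodInfo method)
    {
        StringBuilder key = new StringBuilder("M:");
        key.Append(method.DeclaringType.FullName.Replace('+', '.'));
        key.Append('.').Append(method.Name);

        if (method.IsGenericMethod)
        {
            key.Append("``").Append(method.GetGenericArguments().Length);
        }

        ParameterInfo[] parameters = method.GetParameters();
        if (parameters.Length > 0)
        {
            string[] encoded = parameters.Select(p => EncodeType(p.ParameterType)).ToArray();
            key.Append('(').Append(string.Join(",", encoded)).Append(')');
        }

        return key.ToString();
    }
}
```

For example, DocKeySketch.MethodKey(typeof(Sample<>).GetMethod("MyMethod")) produces M:Sample`1.MyMethod``1(`0@,``0[]).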

If you search the web for articles relating to reading or parsing an XML doc comment file, you will see many requests for knowledge on how to transform a .NET metadata type into the corresponding XML documentation.  The proposed XmlDocCommentReader class will make this very easy to do.  The conversion functions are geared more towards lower-level development, and they may be used to support the implementation of an XSLT or various XPath queries.
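Once a key is available, pulling the matching documentation out of the file is a simple query.  The sketch below uses plain LINQ to XML rather than the XmlDocCommentReader class; DocCommentLookupSketch and FindMember are hypothetical names used only for illustration.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class DocCommentLookupSketch
{
    // Returns the <member> element whose name attribute equals the given key,
    // or null when the file holds no documentation for that member.
    public static XElement FindMember(XDocument docComments, string key)
    {
        return docComments
            .Descendants("member")
            .FirstOrDefault(member => (string)member.Attribute("name") == key);
    }
}
```

The same lookup could be written as an XPath query over /doc/members/member, keyed on the name attribute.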

These features are currently under development and will be ready within a few days (of development time).  Currently, I’m focusing on the testing of all of the possible permutations of generic/array/pointer/ref types that change the resulting key into the XML doc comment member list.  There are plenty of permutations, and a lot more than I had initially expected!

GraphML Serialization for FSMs

As of revision 18123, it is now possible to serialize an FSM to GraphML and perform the inverse deserialization operation.  However, there are some caveats with the current implementation that I will discuss below.  In the meantime, if you would like to review how to use this feature, please refer to the FSM documentation.

Copying Data and Intermediate Types

When serializing to GraphML, any serializable data needs to be represented as a primitive type (a numerical or string type).  Furthermore, in order for a property value to be serialized, the QuickGraph GraphML serialization facility requires that the property be decorated with XmlAttribute.  Unfortunately, implementing support for GraphML serialization wasn't as easy as just decorating some properties with XmlAttribute, since some properties of each state (the start/final state markers) are held in the FiniteStateMachine class.

To address this issue, I created two intermediate types: one to hold all serializable transition data (GraphMLTransition), and the other to hold all serializable state data (GraphMLState).  In essence, these types represent the <edge> and <node> GraphML elements, respectively.  Each time the FSM is serialized, it is copied into a new graph of the intermediate serializable types, and then the new graph is serialized.  Similarly, when the GraphML is deserialized, the resulting graph is copied into a new FSM.
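The copy step for states can be sketched as follows.  This is an illustration under assumptions, not the Jolt code: GraphMLStateSketch stands in for the real GraphMLState class, the attribute names are made up, and the FSM is reduced to a list of state names plus its start/final markers.

```csharp
using System.Collections.Generic;
using System.Xml.Serialization;

// Stand-in for the intermediate state type: every value that should reach
// the GraphML output is a public property decorated with XmlAttribute.
public sealed class GraphMLStateSketch
{
    [XmlAttribute("stateName")]      // attribute names are assumptions
    public string Name { get; set; }

    [XmlAttribute("isStartState")]
    public bool IsStartState { get; set; }

    [XmlAttribute("isFinalState")]
    public bool IsFinalState { get; set; }
}

public static class FsmCopySketch
{
    // Copies the state names of an FSM, together with the start/final markers
    // held by the FSM itself, into serializable node objects.
    public static List<GraphMLStateSketch> CopyStates(
        IEnumerable<string> states, string startState, ICollection<string> finalStates)
    {
        List<GraphMLStateSketch> nodes = new List<GraphMLStateSketch>();
        foreach (string state in states)
        {
            nodes.Add(new GraphMLStateSketch
            {
                Name = state,
                IsStartState = state == startState,
                IsFinalState = finalStates.Contains(state)
            });
        }

        return nodes;
    }
}
```

Deserialization runs the same copy in the opposite direction: the flags on each node are folded back into the start/final state markers of a new FSM.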

I could have avoided creating these extra types and implementing the copy process, but to do so would have cluttered and complicated the overall FSM implementation; XML-serializable properties must be public!  For states, an actual state class is needed containing the start/final state flags (in essence, the GraphMLState class).  Storing start/final state flags on a state class creates a very sparse data set which I wanted to avoid (there is only ever one start state, and generally very few final states), and consequently I chose to maintain the string representation for the state.  For transitions, each delegate needs to be represented as a string, implying a new read/write property.  This new property doesn't make sense as it is unnatural for a user to set a delegate/event using a string representation.

Serializing Delegates (and events)

The TransitionPredicate property and OnStateTransition event are both delegates, which are difficult to serialize/deserialize to/from XML.  You can't expect a delegate to be serialized using XmlSerializer because an XML-serializable type requires a parameterless constructor.  So to accomplish this, delegate-to-string-to-delegate conversion code is required.

A delegate may reference either a static or instance method, and in order to deserialize an instance method, the serializer needs to reconstruct the object state that owns the method.  In other words, the method's parent object needs to be serialized as well.  This does not create a user-friendly scenario for creating an FSM via XML, so I prohibited this type of method from being serialized altogether.  Binary serialization is better suited for this task, which is a feature that will be implemented in the future.

So, for delegate serialization to work, your delegate must reference a static method.  When serialized, it will have the following form: methodName;assemblyQualifiedDeclaringTypeName.  An example of a serialized delegate for the System.Char.IsDigit method is: IsDigit;System.Char, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089.

During deserialization, if the method is discovered to be invalid (e.g. it does not exist, the signature isn't that of a predicate, etc.) then it will be replaced with a default predicate.  The default predicate returns false for any input value.
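The round trip described above can be sketched as follows.  This is not the Jolt implementation: PredicateSerializationSketch and its members are hypothetical names, and the sketch assumes the predicate has the shape Predicate<char>, as the Char.IsDigit example suggests.

```csharp
using System;
using System.Reflection;

public static class PredicateSerializationSketch
{
    // Default predicate substituted for any method that fails validation.
    private static bool AlwaysFalse(char input) { return false; }

    // Produces "methodName;assemblyQualifiedDeclaringTypeName".
    public static string Serialize(Predicate<char> predicate)
    {
        MethodInfo method = predicate.Method;
        if (!method.IsStatic)
        {
            throw new NotSupportedException("Only static methods can be serialized.");
        }

        return method.Name + ";" + method.DeclaringType.AssemblyQualifiedName;
    }

    public static Predicate<char> Deserialize(string text)
    {
        Predicate<char> fallback = AlwaysFalse;
        if (string.IsNullOrEmpty(text)) return fallback;

        int separator = text.IndexOf(';');
        if (separator < 0) return fallback;

        // Passing false asks GetType to return null when the type is not found.
        Type declaringType = Type.GetType(text.Substring(separator + 1), false);
        if (declaringType == null) return fallback;

        // Look for a static method with the expected name that takes one char.
        MethodInfo method = declaringType.GetMethod(
            text.Substring(0, separator),
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic,
            null,
            new Type[] { typeof(char) },
            null);

        if (method == null || method.ReturnType != typeof(bool)) return fallback;

        return (Predicate<char>)Delegate.CreateDelegate(typeof(Predicate<char>), method);
    }
}
```

Passing char.IsDigit through Serialize and then Deserialize gives back a delegate that behaves like the original, while a string such as NoSuchMethod;System.Char yields the always-false predicate.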

Events pose a challenging conversion task because an event is really a multicast delegate in disguise.  Each method subscribed to the event will need to be validated and serialized as described above.  Furthermore, this will be a reflection-heavy task since the delegate is stored in a compiler-generated field.  Currently, I have avoided implementing this feature and will consider its implementation in the future.