Invited signatures
Expression trees in C# 3.0
Published Jan. 06, 2007
1. Introduction
The end of the year was busier than expected, which is the main reason this promised second installment has been delayed. But, at last, here it is! In our previous issue [4], we introduced C# 3.0's lambda expressions and discussed their two closely related facets of representation and usage: as code (in the form of anonymous methods, blocks of executable ILAsm code) and as data (in the form of expression trees, data structures that efficiently represent the structure of an expression and, consequently, the algorithm for evaluating it). That installment emphasized the first facet; now it is time to concentrate on expression trees.
2. Expression trees as a representation of expressions
As we have seen before, lambda expressions are compiled as code or as data depending on the context in which they are used. For instance, suppose we assign a lambda expression (for this example, one that implements the well-known formula for the area of a circle) to a variable of a delegate type, like this:

    Func<double, double> circleArea = (radius) => Math.PI * radius * radius;

The compiler will generate the corresponding ILAsm code inline, so that the previous definition is functionally equivalent to the following anonymous method assignment:

    Func<double, double> circleArea = delegate (double radius) { return Math.PI * radius * radius; };
On the other hand, if we assign the lambda expression to a variable of the generic type Expression<T>, the compiler won't translate it into executable code, but will instead generate an in-memory tree of objects that represents the structure of the expression itself. These data structures are known in C# 3.0 as expression trees. Continuing with the same example, suppose we use the lambda expression that calculates the area of a circle like this:

    static Expression<Func<double, double>> circleAreaExpr = (radius) => Math.PI * radius * radius;

What we are expressing translates roughly into the following sequence of assignments:

    static ParameterExpression parm = Expression.Parameter(typeof(double), "radius");
    static Expression<Func<double, double>> circleAreaExpr =
        Expression.Lambda<Func<double, double>>(
            Expression.Multiply(
                Expression.Constant(Math.PI),
                Expression.Multiply(parm, parm)),
            parm);
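To make the idea of "an in-memory tree of objects" more tangible, the following sketch inspects the tree that the compiler generates for the circle-area lambda. It assumes the released System.Linq.Expressions namespace rather than the preview's System.Expressions, and the class name TreeInspectionDemo is purely illustrative. With the released compiler the multiplications nest left to right, as (PI * radius) * radius, which is slightly different from the rough equivalent written out above.

```csharp
using System;
using System.Linq.Expressions;

static class TreeInspectionDemo
{
    // Same lambda as in the text; assigning it to Expression<T> makes the
    // compiler emit a tree instead of executable code.
    public static readonly Expression<Func<double, double>> CircleAreaExpr =
        radius => Math.PI * radius * radius;

    static void Main()
    {
        // The body of the lambda is the outermost multiplication.
        var root = (BinaryExpression)CircleAreaExpr.Body;
        Console.WriteLine(root.NodeType);                      // Multiply

        // Its left operand is the inner multiplication (PI * radius).
        Console.WriteLine(root.Left.NodeType);                 // Multiply

        // Math.PI is a compile-time constant, so it appears as a constant node.
        var inner = (BinaryExpression)root.Left;
        Console.WriteLine(inner.Left is ConstantExpression);   // True

        // The parameter node in the body is the very object listed in Parameters.
        Console.WriteLine(ReferenceEquals(root.Right, CircleAreaExpr.Parameters[0])); // True
        Console.WriteLine(CircleAreaExpr.Parameters[0].Name);  // radius
    }
}
```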
The first of the previous assignments creates an object of the ParameterExpression class, which represents the only parameter (variable, in mathematical terms) of the lambda expression (function, again in the jargon of mathematics). The second statement is where the expression tree is actually built, using the parameter object obtained in the previous step. Note the functional programming style used when defining an expression tree in code, a style that will become much more common with the release of C# 3.0 and LINQ (particularly LINQ To XML, the LINQ extension for working with XML documents). Once you have built an expression tree, it can be manipulated in the same way as any other .NET object: modified, serialized for persistent storage or transmission over the network, and so on. Specifically, the Expression<T> class offers the means for compiling an expression tree "on the fly" into the ILAsm code necessary for its evaluation:

    Func<double, double> area = circleAreaExpr.Compile();
    Console.WriteLine("The area of a circle with radius 5 is " + area(5));
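As one possible reading of "modified": in the released framework expression trees are immutable, so a change is expressed by building a new tree from an existing one. The sketch below assumes the released System.Linq.Expressions namespace and its ExpressionVisitor class, neither of which is part of the preview described in this article; ParameterReplacer, RewriteDemo and the diameter parameter are hypothetical names chosen for illustration. It turns the area-from-radius tree into an area-from-diameter tree by substituting diameter / 2 for every use of radius, and then compiles the result.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper: swaps one parameter for an arbitrary replacement expression.
sealed class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression _oldParam;
    private readonly Expression _replacement;

    public ParameterReplacer(ParameterExpression oldParam, Expression replacement)
    {
        _oldParam = oldParam;
        _replacement = replacement;
    }

    // Called for every parameter node met while walking the tree.
    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node == _oldParam ? _replacement : base.VisitParameter(node);
    }
}

static class RewriteDemo
{
    public static Expression<Func<double, double>> AreaFromDiameter()
    {
        Expression<Func<double, double>> areaFromRadius =
            radius => Math.PI * radius * radius;

        ParameterExpression diameter = Expression.Parameter(typeof(double), "diameter");
        Expression half = Expression.Divide(diameter, Expression.Constant(2.0));

        // Visit only the body, so each use of 'radius' becomes 'diameter / 2'.
        Expression body = new ParameterReplacer(areaFromRadius.Parameters[0], half)
            .Visit(areaFromRadius.Body);

        // Wrap the rewritten body in a new lambda over the new parameter.
        return Expression.Lambda<Func<double, double>>(body, diameter);
    }

    static void Main()
    {
        Func<double, double> area = AreaFromDiameter().Compile();
        Console.WriteLine(area(10)); // area of a circle of diameter 10, about 78.54
    }
}
```

The original tree is left untouched; the visitor returns a new tree that shares the unchanged nodes with the old one.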
3. The expressions class hierarchy
The library classes needed to work with expression trees are implemented in the assembly System.Query.dll, and their namespace is System.Expressions (this may change as the product matures). The first thing to note is that there are two classes named Expression: the generic one we have used so far, and a non-generic one on which the generic class relies and which is the real workhorse, supplying all the machinery for representing expressions. The generic class works at a higher level and imposes the strong type checking needed, for instance, to compile an expression tree into an anonymous method. The non-generic class is more relaxed about types and relies more on features such as reflection, which allows greater freedom of implementation. Normally, an object of the first class will be built from an instance of the second. The non-generic version of Expression resembles the classes typically obtained when implementing recursive data structures in object-oriented programming languages. It is an abstract class, and from it derive a number of other classes, both abstract and concrete, used to represent the different kinds of elements that can appear in an expression. Some of the most common descendants of Expression are shown in the following table.
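The recursive nature of the hierarchy can be illustrated with a small tree walker. This is only a sketch, assuming the released System.Linq.Expressions namespace (not the preview's System.Expressions) and a compiler with pattern matching (C# 7 or later); HierarchyDemo and Describe are hypothetical names. Each case matches one public descendant of the non-generic Expression class and recurses into its children.

```csharp
using System;
using System.Linq.Expressions;

static class HierarchyDemo
{
    // Hypothetical helper: renders a tree as nested text, one case per node class.
    public static string Describe(Expression e)
    {
        switch (e)
        {
            case LambdaExpression l:    // Expression<T> derives from LambdaExpression
                return "Lambda(" + Describe(l.Body) + ")";
            case BinaryExpression b:    // two operands, recurse into both
                return b.NodeType + "(" + Describe(b.Left) + ", " + Describe(b.Right) + ")";
            case ConstantExpression c:  // leaf: a literal or constant value
                return "Constant";
            case ParameterExpression p: // leaf: a lambda parameter
                return "Parameter " + p.Name;
            default:                    // any other node kind: just name it
                return e.NodeType.ToString();
        }
    }

    static void Main()
    {
        Expression<Func<double, double>> circleAreaExpr =
            radius => Math.PI * radius * radius;

        Console.WriteLine(Describe(circleAreaExpr));
        // prints 'Lambda(Multiply(Multiply(Constant, Parameter radius), Parameter radius))'
    }
}
```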
The most important thing to note about these subclasses of Expression is that their constructors aren't public; to instantiate them you must use the static factory methods of the Expression class, as the statement in the sample code that builds the tree for the area of the circle shows. These factory methods, of course, take one or more parameters of type Expression, which is what allows expressions to be nested recursively. As another, more complex example, here is how we could create in code the expression tree corresponding to the hypotenuse of the right triangle introduced in our previous installment:

    static ParameterExpression px = Expression.Parameter(typeof(double), "x");
    static ParameterExpression py = Expression.Parameter(typeof(double), "y");
    static ParameterExpression[] parms = { px, py };
    static Expression<Func<double, double, double>> hypotenuseExpr2 =
        Expression.Lambda<Func<double, double, double>>(
            Expression.Call(
                typeof(Math).GetMethod("Sqrt"),
                null,
                new Expression[] {
                    Expression.Add(
                        Expression.Multiply(px, px),
                        Expression.Multiply(py, py))
                }),
            parms);

    static void Main(string[] args)
    {
        Console.WriteLine(hypotenuseExpr2);
        // prints '(x, y) => Sqrt(Add(Multiply(x, x), Multiply(y, y)))'
        Func<double, double, double> hypo = hypotenuseExpr2.Compile();
        Console.WriteLine("Hypotenuse(3, 4) = " + hypo(3, 4));
    }
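For readers working with the released framework rather than the preview, the same tree can be sketched as follows. This variant assumes the released System.Linq.Expressions namespace, where Expression.Call for a static method takes the method and its arguments directly, without the null instance argument used above; HypotenuseDemo and BuildHypotenuse are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

static class HypotenuseDemo
{
    public static Expression<Func<double, double, double>> BuildHypotenuse()
    {
        ParameterExpression px = Expression.Parameter(typeof(double), "x");
        ParameterExpression py = Expression.Parameter(typeof(double), "y");

        // Look up Math.Sqrt(double) by name and parameter types.
        MethodInfo sqrt = typeof(Math).GetMethod("Sqrt", new[] { typeof(double) });

        // Sqrt is static, so the call node needs only the method and its argument.
        Expression body = Expression.Call(
            sqrt,
            Expression.Add(
                Expression.Multiply(px, px),
                Expression.Multiply(py, py)));

        return Expression.Lambda<Func<double, double, double>>(body, px, py);
    }

    static void Main()
    {
        Func<double, double, double> hypo = BuildHypotenuse().Compile();
        Console.WriteLine("Hypotenuse(3, 4) = " + hypo(3, 4)); // prints 'Hypotenuse(3, 4) = 5'
    }
}
```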
4. The perfect example
In my humble opinion, the perfect example for showing the possibilities that C# 3.0 expression trees offer is the development of an interpreter of mathematical expressions. An expression interpreter is an application that accepts from the user a string representing a mathematical expression (such as 2 * sin(x) * cos(x), for instance) and translates it into an internal representation that makes it easy to evaluate the expression later for different values of its variables (x in this case). An expression interpreter based on C# 3.0 would use expression trees as that internal representation. This article does not include an implementation of such an interpreter, simply because somebody has done it before me. While I was busy with other tasks (much less interesting, rest assured :-), fellow MVP Bart De Smet [5] has provided an excellent implementation in a series of blog entries, which serves as complementary reading to this article and which I sincerely recommend.
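To give a flavor of the idea without attempting a full interpreter, here is a deliberately tiny sketch. It is not Bart De Smet's implementation; TinyParser and everything in it are hypothetical, and it assumes the released System.Linq.Expressions namespace. It accepts only numbers, the variable x, the operators + - * /, unary minus, parentheses, and the functions sin and cos, and it translates the input string into an expression tree that is then compiled into a delegate.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

// Hypothetical minimal recursive-descent parser that builds an expression tree.
sealed class TinyParser
{
    private readonly string _s;
    private int _pos;
    private readonly ParameterExpression _x = Expression.Parameter(typeof(double), "x");

    private TinyParser(string text) { _s = text.Replace(" ", ""); }

    // Parses the text and compiles the resulting tree into a function of x.
    public static Func<double, double> Compile(string text)
    {
        var p = new TinyParser(text);
        Expression body = p.ParseSum();
        if (p._pos != p._s.Length) throw new FormatException("Unexpected input at " + p._pos);
        return Expression.Lambda<Func<double, double>>(body, p._x).Compile();
    }

    private bool Accept(char c)
    {
        if (_pos < _s.Length && _s[_pos] == c) { _pos++; return true; }
        return false;
    }

    private void Expect(char c)
    {
        if (!Accept(c)) throw new FormatException("'" + c + "' expected at " + _pos);
    }

    // sum := product (('+' | '-') product)*
    private Expression ParseSum()
    {
        Expression left = ParseProduct();
        while (true)
        {
            if (Accept('+')) left = Expression.Add(left, ParseProduct());
            else if (Accept('-')) left = Expression.Subtract(left, ParseProduct());
            else return left;
        }
    }

    // product := factor (('*' | '/') factor)*
    private Expression ParseProduct()
    {
        Expression left = ParseFactor();
        while (true)
        {
            if (Accept('*')) left = Expression.Multiply(left, ParseFactor());
            else if (Accept('/')) left = Expression.Divide(left, ParseFactor());
            else return left;
        }
    }

    // factor := '-' factor | '(' sum ')' | 'x' | ('sin' | 'cos') '(' sum ')' | number
    private Expression ParseFactor()
    {
        if (Accept('-')) return Expression.Negate(ParseFactor());
        if (Accept('(')) { Expression inner = ParseSum(); Expect(')'); return inner; }

        int start = _pos;
        if (_pos < _s.Length && char.IsLetter(_s[_pos]))
        {
            while (_pos < _s.Length && char.IsLetter(_s[_pos])) _pos++;
            string name = _s.Substring(start, _pos - start);
            if (name == "x") return _x;
            if (name != "sin" && name != "cos") throw new FormatException("Unknown name " + name);
            Expect('(');
            Expression arg = ParseSum();
            Expect(')');
            // Map the function name to the matching static method of Math.
            string method = name == "sin" ? "Sin" : "Cos";
            return Expression.Call(typeof(Math).GetMethod(method, new[] { typeof(double) }), arg);
        }

        while (_pos < _s.Length && (char.IsDigit(_s[_pos]) || _s[_pos] == '.')) _pos++;
        if (_pos == start) throw new FormatException("Number expected at " + start);
        string digits = _s.Substring(start, _pos - start);
        return Expression.Constant(double.Parse(digits, CultureInfo.InvariantCulture));
    }

    static void Main()
    {
        Func<double, double> f = Compile("2 * sin(x) * cos(x)");
        Console.WriteLine(f(Math.PI / 4)); // approximately 1, since 2 sin(x) cos(x) = sin(2x)
    }
}
```

Parsing happens once; after that, evaluating the expression for different values of x costs no more than calling any other compiled delegate.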
5. Conclusions
Expression trees are another important new feature to be included in the upcoming version 3.0 of C#. In this article we have shown how expression trees provide an efficient data representation of lambda expressions (functions, ultimately), and how these trees can be transformed into code and evaluated when necessary. In a future installment we will describe how this possibility is exploited in LINQ (and specifically in LINQ To SQL) in order to postpone the evaluation of a LINQ query expression until the precise moment when all the information needed for its optimal execution is available. The source code of the example can be downloaded from this site. In order to run it, the May 2006 LINQ Preview, available at [1], must be downloaded and installed.
6. References
Sample code (ZIP): octavio_Arboles.zip - 10.70 KB