Dynamic language bashing and other topics by non admin authors.

2009-06-17

C# giveth, BCL taketh away

The best feature of C# is its tracking of assignments to variables. The compiler reports an error when code reads a variable that has not been assigned. Unfortunately, the design of the structs in the .NET library, AKA the Base Class Library, hinders the language in this matter.

Even in trivial code, assignment tracking can spot subtle bugs:

string Last(IEnumerable<string> elems)
{
    string result;
    foreach (var e in elems)
    {
        result = e;
    }
    return result; // Use of unassigned local variable 'result'
}

The result variable is assigned only inside the loop. If the sequence is empty the loop doesn't execute and result is never assigned, as the compiler correctly points out.
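One possible way to satisfy the compiler is to decide explicitly what an empty sequence means. The sketch below assumes that throwing is the desired behaviour. The LastDemo class and the found flag are illustrative names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

static class LastDemo
{
    // Returns the last element, or throws when the sequence is empty.
    public static string Last(IEnumerable<string> elems)
    {
        string result = null;   // explicit initial value, so result is definitely assigned
        bool found = false;     // assumption: an empty sequence is treated as an error
        foreach (var e in elems)
        {
            result = e;
            found = true;
        }
        if (!found)
        {
            throw new InvalidOperationException("Sequence contains no elements");
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(Last(new[] { "a", "b", "c" }));
    }
}
```

The point is that the compiler forced the decision. Initializing result at its declaration without handling the empty case would have compiled too, but it would have hidden the question.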

Determining exactly, at compile time, whether an expression uses unassigned variables is undecidable. Instead, the C# specification defines whether each variable is “definitely assigned” at every program point, propagating information from assignment statements through all the control flow statements of a function. This conservative stance errs on the side of safety, forbidding all unsafe code but also some safe code.
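A small sketch of safe code that the conservative rules reject: the analysis does not track the value of a non-constant condition, so testing the same flag twice does not help. The ConservativeDemo class and the Twice method are made-up names for illustration. The rejected variant is kept in comments so the block still compiles.

```csharp
using System;

static class ConservativeDemo
{
    // Rejected variant, although x is always assigned whenever it is read:
    //
    //     int x;
    //     if (flag) x = 21;
    //     if (flag) return 2 * x;   // Use of unassigned local variable 'x'
    //     return 0;
    //
    // Accepted variant: the read happens on the same path as the assignment.
    public static int Twice(bool flag)
    {
        int x;
        if (flag)
        {
            x = 21;
            return 2 * x;   // x is definitely assigned here
        }
        return 0;
    }

    static void Main()
    {
        Console.WriteLine(Twice(true));
    }
}
```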

Compilers for languages that don't mandate similar control flow checking sometimes also bark at the use of unassigned variables. Such unconstrained implementations check only to some level of detail, for some subset of the complete language, and the result can even depend on unrelated compiler settings. For example, gcc does flow analysis only when optimization is enabled.

On the other hand, the predictable, dependable, and very thorough analysis of C# considers try/catch, goto, and (the sweetest thing) even struct. All fields of struct variables are tracked independently. The assigned part of the struct can be read without problems while the other part and the whole struct itself are still considered unassigned.

The compiler catches that a dimension was missed while mirroring a point:

struct MyPoint
{
    public int x;
    public int y;
}

MyPoint Opposite(MyPoint p)
{
    MyPoint result;
    result.x = -p.x;
    return result; // Use of unassigned local variable 'result'
}

The error message should mention that the y field is the one unassigned, but it’s terrific as it stands. I already asked for a better error message at connect.microsoft.com.
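For contrast, here is a minimal sketch of code the per-field tracking accepts. A field can be read as soon as it has been assigned, and the whole struct becomes usable once every field is assigned, with no constructor call needed. The FieldTrackingDemo class and the DoubleX method are hypothetical names; MyPoint is repeated so the block stands alone.

```csharp
using System;

struct MyPoint
{
    public int x;
    public int y;
}

static class FieldTrackingDemo
{
    // Reads a field of a struct that is only partially assigned.
    public static int DoubleX(int value)
    {
        MyPoint p;        // no field assigned yet
        p.x = value;      // p.x is definitely assigned, p.y is not
        return 2 * p.x;   // reading the assigned field is allowed
    }

    // Assigning every field makes the whole struct definitely assigned.
    public static MyPoint Opposite(MyPoint p)
    {
        MyPoint result;
        result.x = -p.x;
        result.y = -p.y;
        return result;    // compiles: both fields were assigned
    }

    static void Main()
    {
        Console.WriteLine(DoubleX(21));
    }
}
```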

The integration with struct is useful for lightweight named parameter emulation and other uses where a named bundle of variables makes sense. Alas, the structs predefined in the .NET library put these niceties out of reach. Take System.Windows.Point:

public struct Point
{
    double _x;
    double _y;
    public double X
    {
        get
        {
            return this._x;
        }
        set
        {
            this._x = value;
        }
    }
    public double Y
    {
        get
        {
            return this._y;
        }
        set
        {
            this._y = value;
        }
    }
}

The fields are private and must be accessed through the properties. Adapting the Opposite() function seen before produces a compile error, and fixing that error also masks the error we wanted the compiler to give us:

Point Opposite(Point p)
{
    Point result;
    result.X = -p.X; // Use of unassigned local variable 'result'
    return result;
}

Because result.X is a property and not a field, setting it amounts to calling an instance method. Of course, instance methods require all arguments, including the implicit this, to be assigned, so result.X can't be written to unless result is initialized when declared:

  Point result = new Point();

But the constructor also initializes result._y, so our original error message is silenced too :(
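To make the loss concrete, here is a sketch using a stand-in struct. PropPoint is a hypothetical type that only mimics the private-fields-plus-properties shape of System.Windows.Point, so the block does not depend on WPF. The SilencedDemo class is likewise made up. The forgotten Y assignment now compiles cleanly and quietly yields zero.

```csharp
using System;

// Hypothetical stand-in with the same shape as System.Windows.Point.
struct PropPoint
{
    double _x;
    double _y;
    public double X { get { return _x; } set { _x = value; } }
    public double Y { get { return _y; } set { _y = value; } }
}

static class SilencedDemo
{
    public static PropPoint Opposite(PropPoint p)
    {
        PropPoint result = new PropPoint();   // zero-initializes _x and _y
        result.X = -p.X;
        // result.Y = -p.Y;   // forgotten, and the compiler no longer notices
        return result;
    }

    static void Main()
    {
        PropPoint p = new PropPoint();
        p.X = 3;
        p.Y = 4;
        PropPoint q = Opposite(p);
        // q.Y is 0 rather than -4: the bug survives compilation.
        Console.WriteLine(q.Y == 0);
    }
}
```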
