C# quirks
No language is free of quirks. Here I have a few C# quirks that I came across recently.
The array indexer is special
Unlike C++, C# doesn’t have a generalized notion of variable references. This means that the indexer of the regular array needs to be part of the language and can’t be simulated by a user defined type.
Consider the example:
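A minimal sketch of such an example, assuming a hypothetical mutable struct `Point`, a regular array `a`, and a `List<Point>` named `b` (the struct and its field are illustrative assumptions, not taken from the original listing):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable value type used only for this illustration.
struct Point
{
    public int X;
}

static class IndexerDemo
{
    public static void Main()
    {
        var a = new Point[1];                      // regular array
        var b = new List<Point> { new Point() };   // List<T> over the same value type

        a[0].X = 1;       // fine: a[0] is a variable, so the element is modified in place
        // b[0].X = 1;    // rejected by the compiler (CS1612): b[0] is a returned copy

        // With List<T> the element has to be copied out, changed, and stored back.
        Point p = b[0];
        p.X = 1;
        b[0] = p;

        Console.WriteLine($"{a[0].X} {b[0].X}");   // prints "1 1"
    }
}
```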
Given an array a and a List<T> b that hold elements of the same value type, a[0] is guaranteed by the language specification to be a variable, while b[0] is translated into a method call that returns a copy of the list element. The distinction is important for value types, because modifying a temporary copy would be pointless.
Remember that the underlying memory layout of both containers is exactly the same. The elements, being of a value type, are laid out in memory sequentially, with no boxing involved. List<T> is actually implemented as a wrapper over a regular array and its specification has this guarantee:
If a value type is used for type T, the compiler generates an implementation of the List<T> class specifically for that value type.
Nonetheless the two container types have different indexers.
Deconstruction is limited
A prominent feature of value tuples in C# 7.0 is the ability to deconstruct them.
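As a quick illustration, here is a small sketch of tuple deconstruction. The helper `MinMax` is a hypothetical name introduced only for this example:

```csharp
using System;

static class TupleDemo
{
    // Hypothetical helper returning a value tuple with named elements.
    public static (int Min, int Max) MinMax(int a, int b) => a < b ? (a, b) : (b, a);

    public static void Main()
    {
        // The tuple is deconstructed straight into two new locals.
        var (min, max) = MinMax(7, 3);
        Console.WriteLine($"{min} {max}");   // prints "3 7"
    }
}
```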
Deconstruction is not limited to value tuples and can also be used for user-defined types, provided they implement a Deconstruct method.
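A minimal sketch of a user-defined type that supports deconstruction might look like this. The `Person` class and its members are assumptions made up for the example:

```csharp
using System;

// Hypothetical type used only to show the Deconstruct pattern.
sealed class Person
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }

    // The compiler looks for a method with this name and out parameters.
    public void Deconstruct(out string name, out int age)
    {
        name = Name;
        age = Age;
    }
}

static class DeconstructDemo
{
    public static void Main()
    {
        var (name, age) = new Person("Ada", 36);
        Console.WriteLine($"{name} {age}");   // prints "Ada 36"
    }
}
```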
Unfortunately, Dictionary<TKey, TValue> doesn’t benefit from the new feature out-of-the-box.
The good news is that it is possible to implement an extension method for KeyValuePair<>.
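One possible shape for such an extension method is sketched below. The class names `KeyValuePairExtensions` and `DictionaryDemo` are arbitrary choices for this example. On a runtime that already provides an instance Deconstruct method, the instance method takes precedence and the extension is simply not used.

```csharp
using System;
using System.Collections.Generic;

static class KeyValuePairExtensions
{
    // Lets a KeyValuePair be deconstructed into its key and value.
    public static void Deconstruct<TKey, TValue>(
        this KeyValuePair<TKey, TValue> pair, out TKey key, out TValue value)
    {
        key = pair.Key;
        value = pair.Value;
    }
}

static class DictionaryDemo
{
    public static void Main()
    {
        var ages = new Dictionary<string, int> { ["Ada"] = 36 };

        // Each KeyValuePair is deconstructed in the foreach header.
        foreach (var (name, age) in ages)
            Console.WriteLine($"{name} {age}");   // prints "Ada 36"
    }
}
```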
There’s a KeyValuePair<K, V>.Deconstruct method available in .NET Core 2.0, but it is yet to reach .NET Framework.
AFAIK there’s no workaround for LINQ. Both from and let forbid deconstructing an object.
Code that attempts to deconstruct in either clause doesn’t compile. One day the LINQ syntax might acquire a pattern matching mechanism that will handle tuple deconstruction as one of its features. Alas, for now LINQ is left behind.
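To make the limitation concrete, here is a small sketch. The rejected forms appear only as comments so that the snippet still compiles. The query that follows is not deconstruction, just manual unpacking through the tuple fields, and the variable names are illustrative:

```csharp
using System;
using System.Linq;

static class LinqDemo
{
    public static void Main()
    {
        var pairs = new[] { (1, 2), (3, 4) };

        // Neither of these forms is accepted by the compiler:
        //   from (a, b) in pairs select a + b
        //   from p in pairs let (a, b) = p select a + b

        // The tuple has to be unpacked field by field instead.
        var sums = from p in pairs
                   let a = p.Item1
                   let b = p.Item2
                   select a + b;

        Console.WriteLine(string.Join(", ", sums));   // prints "3, 7"
    }
}
```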
Closures may capture more state than you expect
Consider the following code:
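A sketch of a program of this kind, built around the names the text uses (Make, f1, f2, x, y, weakX, weakY). The array contents and the use of out parameters are assumptions, chosen so that the two sums come out as 6 and 15:

```csharp
using System;
using System.Linq;

static class ClosureDemo
{
    static void Make(out Func<int> f1, out Func<int> f2,
                     out WeakReference weakX, out WeakReference weakY)
    {
        var x = new[] { 1, 2, 3 };   // assumed contents
        var y = new[] { 4, 5, 6 };   // assumed contents

        f1 = () => x.Sum();          // references x only
        f2 = () => y.Sum();          // references y only

        // Weak references observe the arrays without keeping them alive.
        weakX = new WeakReference(x);
        weakY = new WeakReference(y);
    }

    public static void Main()
    {
        Make(out var f1, out var f2, out var weakX, out var weakY);
        Console.WriteLine($"f1() = {f1()}, f2() = {f2()}");

        // Reset the locals so that nothing references the closures any more.
        f1 = null;
        f2 = null;

        GC.Collect();
        Console.WriteLine($"weakX.IsAlive = {weakX.IsAlive}");
        Console.WriteLine($"weakY.IsAlive = {weakY.IsAlive}");
    }
}
```

What IsAlive reports depends on the garbage collector and on how the code was compiled. An unoptimized debug build may keep objects reachable for longer than an optimized one.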
The output is mundane:
f1() = 6, f2() = 15
weakX.IsAlive = False
weakY.IsAlive = False
As you might expect, Make creates two closures: f1 that references x and f2 that references y. This means the objects referenced by x and y outlive the invocation of Make and aren’t automatically collected when the stack of Make is destroyed.
When Make returns, Main's stack now references the two closures and their corresponding states. Soon every local variable of Main is reset and the closures and their states can be safely collected. Indeed this is exactly what happens.
What might be less obvious is that commenting out one of the two lines that reset the closure variables in Main makes the program produce the following output:
f1() = 6, f2() = 15
weakX.IsAlive = True
weakY.IsAlive = True
Why are both states (arrays) alive? Isn’t there only one closure alive at this point?
The answer is simple. The closures don’t have a separate state each. Both reference the same chunk of memory that contains pointers to both arrays.
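A hand-written approximation of this arrangement is sketched below. The class name `SharedState` and its members are hypothetical stand-ins for the object the compiler generates behind the scenes. The point is only that both delegates target one and the same instance:

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the single compiler-generated state object.
sealed class SharedState
{
    public int[] X;
    public int[] Y;

    public int SumX() => X.Sum();   // plays the role of f1's body
    public int SumY() => Y.Sum();   // plays the role of f2's body
}

static class SharedStateDemo
{
    public static void Main()
    {
        var state = new SharedState
        {
            X = new[] { 1, 2, 3 },
            Y = new[] { 4, 5, 6 }
        };

        Func<int> f1 = state.SumX;
        Func<int> f2 = state.SumY;

        // Both delegates hold the same state object, and through it both arrays.
        Console.WriteLine(object.ReferenceEquals(f1.Target, f2.Target));   // prints "True"
    }
}
```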
Now it’s easy to see why collecting only one of the closures leaves both arrays alive.
In production this behavior is sometimes easy to miss. One closure produced by a factory might outlive other closures and you might only discover this fact after some frustrating time with a profiler.
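When the shared state does matter, one possible way to avoid it is to create each closure in its own scope, so that each one captures only what it needs. The sketch below assumes a hypothetical helper `MakeSum`. It is one option among several, not a recommendation from the original text:

```csharp
using System;
using System.Linq;

static class SeparateStateDemo
{
    // Hypothetical helper: every call captures only its own parameter,
    // so every returned closure gets a state object of its own.
    public static Func<int> MakeSum(int[] values) => () => values.Sum();

    public static void Main()
    {
        Func<int> f1 = MakeSum(new[] { 1, 2, 3 });
        Func<int> f2 = MakeSum(new[] { 4, 5, 6 });

        // The two delegates no longer share a target.
        Console.WriteLine(object.ReferenceEquals(f1.Target, f2.Target));   // prints "False"
    }
}
```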
Potential memory bugs notwithstanding, the current closure implementation is actually appropriate in the majority of cases. Allocating several state objects instead of one, even when possible, would lead to poorer performance.