C# quirks

No language is free of quirks. Here are a few C# quirks that I came across recently.

The array indexer is special

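Unlike C++, C# has traditionally lacked a generalized notion of variable references. This is why the indexer of a regular array is part of the language itself. An ordinary user-defined indexer is a pair of get/set methods that pass elements by value, so it can’t behave the same way, although the ref returns added in C# 7.0 narrow the gap.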

Consider the example:

var a = new[] { (x: 0, y: 0) };
a[0].x = 1; // OK. "The result of an array element access is a variable,
            // namely the array element selected by the indices."

var b = a.ToList();
b[0].x = 1; // error CS1612: Cannot modify the return value of
            // 'List<(int, int)>.this[int]' because it is not a variable

In the code above a[0] is guaranteed by the language specification to be a variable, while b[0] is translated into a method call that returns a copy of the list element. The distinction is important for value types, because modifying a temporary copy would be pointless.
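A common way around CS1612 is to copy the element into a local, modify the local, and store it back through the indexer’s setter. A minimal sketch (the class and method names are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListElementUpdate
{
    // Builds the same list as above and sets x of element 0 to 1.
    public static List<(int x, int y)> SetFirstX()
    {
        var a = new[] { (x: 0, y: 0) };
        List<(int x, int y)> b = a.ToList();

        var tmp = b[0];   // the getter returns a copy of the element
        tmp.x = 1;        // tmp is a local variable, so this is allowed
        b[0] = tmp;       // the setter stores the modified copy back

        return b;
    }
}
```

This costs one extra copy in each direction, which is usually negligible for small structs.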

Remember that the underlying memory layout of both containers is the same. The elements, being of a value type, are laid out in memory sequentially, with no boxing involved. List<T> is implemented as a wrapper over a regular array, and its documentation gives this guarantee:

"If a value type is used for type T, the compiler generates an implementation of the List<T> class specifically for that value type."

Nonetheless the two container types have different indexers.
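Since C# 7.0 a user-defined type can get closer to the array behavior by declaring an indexer that returns by reference. List<T> does not do this, but a hand-written container can. A sketch, where RefList<T> is a hypothetical fixed-size type invented for this example:

```csharp
public sealed class RefList<T>
{
    private readonly T[] items;

    public RefList(int length) { items = new T[length]; }

    public int Count => items.Length;

    // 'ref' return: the caller gets the array slot itself, not a copy.
    public ref T this[int index] => ref items[index];
}

public static class RefListDemo
{
    public static (int x, int y) Run()
    {
        var c = new RefList<(int x, int y)>(1);
        c[0].x = 1;    // compiles, because c[0] is a variable here
        return c[0];   // reads the element back by value
    }
}
```

The trade-off is that a ref-returning indexer exposes the backing storage directly, so the container can no longer intercept writes.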

Deconstruction is limited

A prominent feature of value tuples in C# 7.0 is the ability to deconstruct them.

var a = new[] {
    (1, 2),
    (3, 4),
    (5, 6),
};

foreach (var (x, y) in a) { } // OK.

Deconstruction is not limited to value tuples and can also be used for user-defined types, provided they implement a Deconstruct method.

struct S {
    public int x, y;
    public void Deconstruct(out int x, out int y) {
        x = this.x;
        y = this.y;
    }
}

foreach (var (x, y) in new S[0]) { } // OK.

Unfortunately, Dictionary<TKey, TValue> doesn’t benefit from the new feature out-of-the-box.

Dictionary<int, int> d = null;

// error CS8129: No suitable Deconstruct instance or
// extension method was found for type 'KeyValuePair<int, int>',
// with 2 out parameters and a void return type.
foreach (var (x, y) in d) { }

The good news is that it is possible to implement the missing method as an extension method for KeyValuePair<K, V>:

public static class KeyValuePairExtensions {
    public static void Deconstruct<K, V>(
        this KeyValuePair<K, V> self, out K k, out V v)
    {
        k = self.Key;
        v = self.Value;
    }
}

There’s a KeyValuePair<K, V>.Deconstruct method available in .NET Core 2.0, but it is yet to reach .NET Framework.
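With such an extension in scope, the foreach loop over a dictionary compiles. A small self-contained sketch (the extension is repeated so the snippet stands alone; DictionaryDemo and its method are illustrative names):

```csharp
using System.Collections.Generic;

public static class KeyValuePairExtensions
{
    // On runtimes that already ship an instance Deconstruct,
    // the instance method takes precedence over this extension.
    public static void Deconstruct<K, V>(
        this KeyValuePair<K, V> self, out K k, out V v)
    {
        k = self.Key;
        v = self.Value;
    }
}

public static class DictionaryDemo
{
    public static int SumOfProducts()
    {
        var d = new Dictionary<int, int> { [1] = 10, [2] = 20 };
        int total = 0;
        foreach (var (k, v) in d)   // deconstructs each KeyValuePair
            total += k * v;
        return total;               // 1 * 10 + 2 * 20 = 50
    }
}
```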

AFAIK there’s no workaround for LINQ query syntax. Both the from and let clauses forbid deconstructing an object.

var q = from (x, y) in a
        select x;

var q = from e in a
        let (x, y) = e
        select x;

Neither query compiles. One day the LINQ syntax might acquire a pattern matching mechanism that handles tuple deconstruction as one of its features. Alas, for now LINQ is left behind.
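Short of real deconstruction in the query clauses, two fallbacks seem to work: reading the tuple fields by name inside the query, or switching to method syntax, where a statement lambda can deconstruct in its body. A sketch of both (LinqTupleDemo and its method are illustrative names):

```csharp
using System.Linq;

public static class LinqTupleDemo
{
    public static (int[] firsts, int[] sums) Run()
    {
        var a = new[] { (1, 2), (3, 4), (5, 6) };

        // Query syntax: no deconstruction, so access the fields directly.
        var q1 = from e in a
                 let x = e.Item1
                 select x;

        // Method syntax: deconstruct inside the lambda body.
        var q2 = a.Select(e => { var (x, y) = e; return x + y; });

        return (q1.ToArray(), q2.ToArray());
    }
}
```

Neither form is as compact as the foreach version, but both avoid introducing a named helper type.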

Closures may capture more state than you expect

Consider the following code:

static void Make(out Func<int> f1, out Func<int> f2,
                 out object xout, out object yout)
{
    var x = new[] { 1, 2, 3 };
    var y = new[] { 4, 5, 6 };
    xout = x;
    yout = y;
    f1 = () => x.Sum();
    f2 = () => y.Sum();
}

static void Main() {
    Func<int> f1, f2; object x, y;
    Make(out f1, out f2, out x, out y);
    // f1 and f2 implicitly reference x and y
    var weakX = new WeakReference(x);
    var weakY = new WeakReference(y);
    x = null; // current stack doesn't reference
    y = null; // x and y anymore
    Console.WriteLine($"f1() = {f1()}, f2() = {f2()}");
    f1 = null;
    f2 = null;
    GC.Collect();
    Console.WriteLine($"weakX.IsAlive = {weakX.IsAlive}");
    Console.WriteLine($"weakY.IsAlive = {weakY.IsAlive}");
}

The output is mundane:

f1() = 6, f2() = 15
weakX.IsAlive = False
weakY.IsAlive = False

As you might expect, Make creates two closures: f1, which references x, and f2, which references y. This means the objects referenced by x and y outlive the invocation of Make and aren’t automatically collected when the stack frame of Make is destroyed.

When Make returns, Main’s stack frame references the two closures and their corresponding state. Soon every local variable of Main is reset to null, so the closures and their state can be safely collected. Indeed, this is exactly what happens.

What might be less obvious is that commenting out one of the following two lines, say

f1 = null;
// f2 = null;

makes the program produce the following output:

f1() = 6, f2() = 15
weakX.IsAlive = True
weakY.IsAlive = True

Why are both arrays alive? Isn’t there only one closure alive at this point?

The answer is simple. The closures don’t each have separate state. Both reference the same chunk of memory, which contains pointers to both arrays.

[Figure: capture memory. Both closures point to a single capture object that holds the references to x and y.]

Now it’s easy to see why collecting only one of the closures leaves both arrays alive.
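The shared state can be approximated by hand. The sketch below is not the compiler’s actual output (the real generated class has an unspeakable name, and MakeCaptures and LoweredMake are invented here), but it mirrors the shape: one capture object per method, with each lambda becoming an instance method on it.

```csharp
using System;
using System.Linq;

// Hand-written stand-in for the compiler-generated closure class.
public sealed class MakeCaptures
{
    public int[] x;
    public int[] y;

    public int SumX() => x.Sum();   // body of the first lambda
    public int SumY() => y.Sum();   // body of the second lambda
}

public static class LoweredMake
{
    public static void Make(out Func<int> f1, out Func<int> f2)
    {
        var captures = new MakeCaptures();   // one object for the whole method
        captures.x = new[] { 1, 2, 3 };
        captures.y = new[] { 4, 5, 6 };

        f1 = captures.SumX;   // the delegate's Target is 'captures'
        f2 = captures.SumY;   // same Target, so f2 alone keeps x alive too
    }
}
```

Because both delegates hold the same Target, keeping either one reachable keeps the capture object, and therefore both arrays, reachable.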

In production this behavior is sometimes easy to miss. One closure produced by a factory might outlive other closures and you might only discover this fact after some frustrating time with a profiler.

Potential memory bugs notwithstanding, the current closure implementation is actually appropriate in the majority of cases. Allocating several state objects instead of one, even when possible, would lead to poorer performance.
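When one closure is known to outlive the others, one way to avoid the shared state is to create each closure in its own method, so that each gets its own capture object. A sketch under the assumption that the compiler keeps allocating one capture object per method invocation, which is an implementation detail rather than a documented guarantee (SeparateCaptures and MakeSum are illustrative names):

```csharp
using System;
using System.Linq;

public static class SeparateCaptures
{
    // Each call captures only its own 'values' parameter.
    private static Func<int> MakeSum(int[] values) => () => values.Sum();

    public static void Make(out Func<int> f1, out Func<int> f2)
    {
        var x = new[] { 1, 2, 3 };
        var y = new[] { 4, 5, 6 };
        f1 = MakeSum(x);   // capture object holding only x
        f2 = MakeSum(y);   // a different capture object holding only y
    }
}
```

With this arrangement, dropping f1 leaves nothing that references the first array, at the price of one extra allocation per closure.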