Are you Null?

Within the last couple of days, Microsoft released a proposed update for C# 8, the next major release of the language. Over the past several years, there has been a large debate on the existence and use of null in software development. Allowing null has been called the billion dollar mistake by the inventor of the null reference, Sir Tony Hoare. In response, Microsoft has decided to help the C# community by adding functionality to the C# compiler that points out where a null reference might occur.

With the release of C# 8, anything referencing an object (string, etc.) must explicitly declare itself as possibly being null, and if that variable isn’t explicitly checked before being used, the compiler generates a warning that a null reference might occur. So how does this work? Adding ? to the end of a reference type signifies that the developer acknowledges null might occur.
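As a minimal sketch of the idea, the following uses the #nullable enable directive, which is how the opt-in works in the C# 8 that eventually shipped (a project-level setting is the other route). The class and method names are hypothetical and exist only for illustration.

```csharp
#nullable enable

public static class NullableSketch
{
    // "string?" states that null is an expected value for this parameter.
    public static int SafeLength(string? text)
    {
        // Without this check, reading text.Length would raise a
        // possible-null-reference warning (a warning, not an error).
        if (text == null)
        {
            return 0;
        }

        return text.Length;
    }
}
```

Calling NullableSketch.SafeLength(null) returns 0, and removing the null check keeps the code compiling but adds the warning.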

This looks like it would be a breaking change, and all code written in a previous version will suddenly stop compiling. This would be true except for two things.

  1. You must use a compiler flag to enforce the rule.
  2. The flag will only generate warnings, not errors.

So legacy code is safe in the upgrade process if it’s too difficult to convert.

They are still working out a number of scenarios that prove tricky to handle, such as default array initialization (new string[2]). Their comments about all of these can be found on their blog on MSDN.

I’ve added their code examples below of edge cases they are still working on:

Personally, I hoped the compiler would enforce these rules a little more strongly. Some languages like F# strictly enforce variable immutability unless mutation is explicitly allowed, and other functional languages do not allow mutation at all.

It is possible to turn on “Warnings as errors” and have the compiler stop if it encounters a possible null reference, but this only works if the rest of the code has no other warnings, since those would stop compilation as well. Ideally, no warning should ever remain in code without being fixed, but that is a very difficult standard to follow when dealing with legacy code from years past where no one followed that rule before you. Either way, the C# team was in a tight situation, and they did the best they could. They needed to make strides towards making null references easier to track, but they couldn’t break all of the legacy code using previous versions of C#.

Quirks with Pattern Matching in C# 7

With C# 7, Microsoft added the concept of pattern matching by enhancing the switch statement. Compared to functional languages (both pure and impure), it seems somewhat lacking in a feature-by-feature comparison, but it is still nice in that it allows a cleaner format of code. There are some interesting quirks that you should be aware of before using it. Nothing they’ve added breaks existing rules of the language, and with a thorough understanding of how the language behaves their choices make sense, but there are some gotchas that on the surface look like they should function one way, yet act in a completely different manner.

Consider the following example.


C# 7 now allows the use of a switch statement to determine the type of a variable. It has also expanded the use of is to include constants, including null.

is can show if something is null : shows true

With these two understandings, which line executes in the following code?

Shows default code executed.

Based on the previous examples, it’s a reasonable conclusion that one of the first two case statements would execute, but they don’t.
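The original listing is not reproduced here, so the following is only a sketch of the kind of switch being described, with hypothetical names. It assumes two string cases followed by a default and shows where a null string ends up.

```csharp
public static class SwitchNullSketch
{
    public static string Describe(string value)
    {
        switch (value)
        {
            // A type pattern rejects null before the when clause is
            // evaluated, so this case can never succeed.
            case string s when s is null:
                return "null string";
            case string s:
                return "string";
            default:
                return "default";
        }
    }
}
```

Passing a null string to Describe returns "default", while any non-null string returns "string".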

The is operator

The is operator was introduced in C# 1.0, and its use has been expanded, but none of the existing functionality has changed. Up until C# 7, is has been used to determine if an object is of a certain type like so.

This outputs exactly as expected. The console prints “True”. (Replacing string with var works exactly the same. Remember that the object is still typed. var only tells the compiler to figure out what type the variable should be instead of explicitly telling it.)

Is Operator String: True

What happens if the string is null? The compiler thinks it’s a string. It will prevent you from passing it to methods requiring another reference type even though the value is explicitly null.

Type is null

The is operator is a run-time check, not a compile-time one, and since the value is null, the runtime doesn’t know what type it is. In this example, the compiler could pass information to the runtime saying what type the variable actually is even though it’s null, but this would be difficult if not impossible for all scenarios, so for consistency, it still returns false. Consistency is key.
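A short sketch of that behavior, using hypothetical names: the variable is declared as a string, but because it holds null, the run-time type check fails.

```csharp
public static class IsNullSketch
{
    // The check happens at run time against the actual value.
    public static bool IsString(object candidate) => candidate is string;

    public static bool NullStringIsString()
    {
        // The declared type is string, but the run-time value is null.
        string jennysNumber = null;
        return IsString(jennysNumber);
    }
}
```

IsString("867-5309") returns true, while NullStringIsString() returns false.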

Printing out True and False is nice, but it’s not really descriptive. What about adding text to describe what is being evaluated.

Is Type With Question, Question doesn't appear

Why didn’t the question appear? It has to do with operator precedence. The + has a higher operator precedence than is and is evaluated first. What is actually happening is:

This becomes clear if the clause is flipped, because the compiler doesn’t know how to evaluate string when using the + operator.

Flipping clauses throws error.

Adding parentheses around jennysNumber is string fixes the issue, because a parenthesized expression is evaluated before the + operator is applied to it.
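The two forms can be compared side by side in a small sketch. The method names and the message text are made up for illustration; only the precedence behavior comes from the discussion above.

```csharp
public static class PrecedenceSketch
{
    public static object WithoutParentheses(string jennysNumber)
    {
        // Parsed as ("Is it a string? " + jennysNumber) is string,
        // so the whole expression is a bool and the question text is lost.
        return "Is it a string? " + jennysNumber is string;
    }

    public static object WithParentheses(string jennysNumber)
    {
        // The type check runs first; its result is then appended to the text.
        return "Is it a string? " + (jennysNumber is string);
    }
}
```

The first method returns the boolean true for any input (the concatenated text is always a non-null string), and the second returns "Is it a string? True" for a non-null argument.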

output of is operator and + flipped with parenthesis (shows both question and value)

Pattern Matching with Switch Statements

Null and Dealing with Types

Null is an interesting case, because as shown during the runtime, it’s difficult to determine what type an object is.

Base Example

This code works exactly as how you think it should. Even though the type is string, the runtime can’t define it as such, and so it skips the first case, and reaches the second.
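A sketch of such a base example might look like the following. The names are hypothetical, and the parameter is typed as object so the default branch stays meaningful.

```csharp
public static class NullCaseSketch
{
    public static string Describe(object value)
    {
        switch (value)
        {
            case string s:
                return "string";
            case null:
                return "null";
            default:
                return "something else";
        }
    }
}
```

Passing a string variable that holds null returns "null", because the string case is skipped at run time.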

Adding a type object clause works exactly the same way

shows object case works same way

What about var? Case statements now support var in place of a type in the pattern.

If you mouse over either var or the variable name, the compiler will tell you what type it is.
show compiler knows what type it is.

Shows var case statement doesn't know type

The compiler knows what the type is, but don’t let this fool you into thinking the var case works like the other typed statements. The var statement doesn’t care that the runtime can’t determine the type. A case statement with the var type will always execute, provided there is no condition forbidding null values such as when (o != null). Like before, the type still can’t be determined inside the case statement.
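To make the difference concrete, here is a small sketch (hypothetical names) in which a var case follows a typed case. The typed case skips null, and the var case catches it.

```csharp
public static class VarCaseSketch
{
    public static string Describe(object value)
    {
        string result = "no case matched";

        switch (value)
        {
            case string s:
                result = "string";
                break;
            // A var pattern matches everything, including null.
            case var anything:
                result = anything == null
                    ? "var case, value is null"
                    : "var case, non-null";
                break;
        }

        return result;
    }
}
```

Describe(null) returns "var case, value is null". Adding when (anything != null) to the var case would let null fall through without matching anything.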

Why determine object type at compile time?

At any point in time (barring the use of dynamic), the compiler knows the immediate type of the variable. It could use this to point directly to the correct case for that type. If it did, though, it couldn’t handle the following scenario, or any scenario involving inheritance and child types.

shows is string

Personally, I would like to see either a warning or an error stating that a type case can never match null (for example, case string s when (s is null)), but as long as the code is tested and developers know about this edge case, problems can be minimized.

All the examples can be found on github: https://github.com/kemiller2002/StructuredSight/tree/master/PatternMatchingQuirks_Standard

C# 7 Additions – Pattern Matching

C# 7 has started to introduce Pattern Matching. This is a concept found in functional programming, and although it isn’t fully implemented compared to F#, it is a step in that direction. Microsoft has announced they intend on expanding it in future releases.

Constant Patterns

The is keyword has been expanded to allow all constants on the right side of the operator instead of just a type. Previously, C#’s only valid syntax was similar to:

Now it is possible to compare a variable to anything which is a constant: null, a value, etc.

Behind the scenes, the is statement is converted to calling the Equals function in IL code. The following two functions produce roughly the same code (they call different overloads of the Equals function).

CheckIsNull

CheckEqualsNull

This can also be combined with other features allowing variable assignment through the is operator.
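A brief sketch of both forms, constant patterns and assignment through is, with hypothetical method names:

```csharp
public static class IsPatternSketch
{
    // Constant patterns: the right side of is can be null or a literal.
    public static bool IsNull(object value) => value is null;

    public static bool IsFortyTwo(object value) => value is 42;

    // Type pattern with assignment: number is only set when the test succeeds.
    public static int DoubleIfInt(object value)
    {
        if (value is int number)
        {
            return number * 2;
        }

        return 0;
    }
}
```

DoubleIfInt(21) returns 42, while DoubleIfInt("21") returns 0 because the value is not an int.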

In Visual Studio Preview 4, the scoping rules surrounding variables assigned in this manner are more restrictive than in the final version. Right now, they can only be used within the scope of the conditional statement.

Switch Statements

The new pattern matching extensions have also extended and changed the use of case statements. Patterns can now be used in switch statements.

Like in previous versions, the default statement will always be evaluated last, but the location of the other case statements now matters.

In this example, case int n will never evaluate, because the statement above it will always be true. Fortunately, the C# compiler will evaluate this, determine that it can’t be reached and raise a compiler error.

The variables declared in patterns behave differently than others. Each variable in a pattern can have the same name without running into a collision with other statements. Just as before, in order to declare a variable of the same name inside the case statement, you must still explicitly enforce scope by adding braces ({}).
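The ordering and naming rules can be sketched as follows. This is an illustrative example with hypothetical names, written in the order that compiles; swapping the first two cases would make the guarded one unreachable and produce the compiler error described above.

```csharp
public static class SwitchOrderSketch
{
    public static string Classify(object value)
    {
        switch (value)
        {
            // The more specific case has to come first.
            case int n when n > 100:
                return "large int";
            case int n:
                return "int";
            // The same variable name can be reused by another pattern.
            case string n:
                return "string";
            default:
                return "other";
        }
    }
}
```

Classify(500) returns "large int" and Classify(5) returns "int".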

Pattern matching has a ways to go when compared to its functional language equivalent, but it is still a nice addition and will become more complete as the language evolves.

C# 7 Additions – Literals

A small but nice change in C# 7 is increased flexibility in literals. Previously, large numeric constants had no separator, and it was difficult to read a large number. For example, if you needed a constant for the number of stars in the observable universe (1,000,000,000,000,000,000,000), you’d have to do the following:

If you hadn’t caught the error, the constant is too short, and it’s difficult to tell looking at the numbers without a separator. In C# 7, it’s now possible to use the underscore (_) in between the numbers. So the previous example now becomes much easier to read, and it is easily recognizable the number is off.

The new version adds binary constants too. Instead of writing a constant in hex, or decimal, a constant can now be written like so:
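Both literal forms can be sketched in a few constants. The constant names are hypothetical, and decimal is used for the star count only because the value does not fit in a long.

```csharp
public static class LiteralSketch
{
    // Digit separators make the magnitude easy to verify by eye.
    public const decimal StarsInObservableUniverse = 1_000_000_000_000_000_000_000m;

    // The same value written as a binary literal and as a hex literal.
    public const int Binary = 0b0010_1010;
    public const int Hex = 0x2A;
}
```

Both Binary and Hex evaluate to 42.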

C# 7 Additions – Throw Expressions

In previous versions, there were limitations on where exceptions could be thrown. Although not a serious hindrance, at times it took additional work to validate and throw an exception, and C# 7 has removed much of that developer overhead.

Expressions

Previously to throw an exception in the middle of an expression there were really two options:

or

It is now possible to also throw an exception in the middle of an expression. Instead of checking for null, it is possible to throw as the second condition in the Null Coalescing Operator.

It is also possible in the Conditional Operator as well.
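A minimal sketch of both uses, with hypothetical method names:

```csharp
using System;

public static class ThrowExpressionSketch
{
    // throw as the right-hand side of the null coalescing operator.
    public static string RequireName(string name) =>
        name ?? throw new ArgumentNullException(nameof(name));

    // throw as one branch of the conditional operator.
    public static int RequirePositive(int count) =>
        count > 0 ? count : throw new ArgumentOutOfRangeException(nameof(count));
}
```

RequireName("jenny") returns "jenny", and RequireName(null) throws an ArgumentNullException.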

Expression Bodied Members

C# 6 added the ability to write a method with a single statement with a “fat arrow” (=>) and the statement. What used to be

can now be condensed to:

If you needed a method stub because you didn’t yet know how to complete the method, and it was appropriate to use an Expression Bodied Member, you were left with two possibilities, as throwing an exception there wasn’t allowed by the compiler.

or

The first is error prone, because if a program calls the method, there is no indication that it isn’t functioning properly. (Is null an expected return or an indicator of an error?) The second is better, but it is a little cumbersome that you must convert it to a standard function just to throw the exception. C# 7 solves this inconvenience, and it is now possible to throw exceptions in the Expression Bodied Member.
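As a sketch, a stub written this way could look like the following. The class and method names are hypothetical.

```csharp
using System;

public class CustomerRepositorySketch
{
    // An expression bodied stub that fails loudly instead of returning null.
    public string GetName(int id) => throw new NotImplementedException();
}
```

Any caller now gets a NotImplementedException rather than a silent null.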

C# 7 Additions – ref Variables

C# 7 expands the use of the ref keyword. Along with its previous use, it can now be used in return statements, and local variables can store a reference to the object as well. At first glance, the question is “What is the real difference between returning a ref variable, and setting it through an out parameter?” Previously you could set a variable passed into a function with ref (or out) to a different value. In C# 7, you can return the reference of a property, variable etc. and store that in a local variable for later use.

The following is an example showing its expanded use.

As expected, the PersonInformation object is passed into the GetName function which returns a reference to the string property Name. This is then passed into the MakeCapitalized function which capitalizes the name “jenny” (making it “Jenny”) in the original PersonInformation object. Compare this to the example here showing how the previous version of C# would not allow the modification of the original property in the same scenario.
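The original listing is not shown here, so this is a reconstruction under assumptions rather than the author's code. PersonInformation, GetName and MakeCapitalized are named in the text; their bodies, the Run helper, and the choice to make Name a public field (a ref return has to point at a storage location such as a field) are my own.

```csharp
public class PersonInformation
{
    // A field rather than an auto-property, so a reference to it can be returned.
    public string Name;
}

public static class RefReturnSketch
{
    // Returns a reference to the field itself, not a copy of its value.
    public static ref string GetName(PersonInformation person) => ref person.Name;

    public static void MakeCapitalized(ref string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return;
        }

        text = char.ToUpperInvariant(text[0]) + text.Substring(1);
    }

    public static string Run()
    {
        var person = new PersonInformation { Name = "jenny" };

        // The ref local aliases person.Name.
        ref string name = ref GetName(person);
        MakeCapitalized(ref name);

        return person.Name;
    }
}
```

Run() returns "Jenny", because the assignment inside MakeCapitalized writes through the reference into the original object.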

Classes vs Structs

If the PersonInformation is changed to be a struct (value type) instead of a class (reference type), the following code won’t work without a slight modification, but it is still completely possible.

Structs are passed by value meaning that passing a struct into a method creates a copy of it. Returning a reference to the struct’s property would return a reference to the copied struct and would go out of scope as soon as the method completes. There would be no point, and it would cause errors pointing to properties to objects which didn’t exist.
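One way to sketch the slight modification is to pass the struct itself by ref, so no copy is made. The type and method names below are hypothetical.

```csharp
public struct PersonInformationStruct
{
    public string Name;
}

public static class RefStructSketch
{
    // Taking the struct by ref means the returned reference points
    // into the caller's instance rather than into a temporary copy.
    public static ref string GetName(ref PersonInformationStruct person) => ref person.Name;

    public static string Run()
    {
        var person = new PersonInformationStruct { Name = "jenny" };

        ref string name = ref GetName(ref person);
        name = "Jenny";

        return person.Name;
    }
}
```

Run() returns "Jenny". Without the ref on the parameter, the compiler rejects returning a reference into the copied struct.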

Caveats

These new features come with some restrictions. Consider this. A string can be treated as an array of characters. With the new functionality, it would seem possible to pass back a reference to a character location in that string and update it, because you have the reference to the character location in the string.

Fortunately, this isn’t allowed. The compiler prevents it from being a valid option, because if this were possible, it would break the string’s immutability and cause havoc with C#’s ability to intern strings.
ref string not allowed.

The compiler is also smart enough to not allow references to variables which fall out of scope. The following is also not allowed:

After the method exits someNumber no longer exists, and when another part of the application tries to access it, it won’t be available. (You could say this might not be the case if it were a reference type like a string, but it still wouldn’t matter, because all the reference has is a location to where the object is, not the actual object itself. This causes 2 problems: One, currently there is no way to get the value from the reference. Two, the object isn’t rooted, so it could still be garbage collected at any point in time.)

The compiler is also smart enough to trace the variable use through the calling methods. This is also not allowed:

C# 7 Additions – Out Variables

C# 7 removes the need for out variables to be predeclared before passing them into a function.

It also now allows the use of the var keyword to declare the variable type, because the compiler will infer the type based on the declared parameter type. This is not allowed when the compiler can’t infer the type because of method overloading. It would be nice if the compiler would attempt to infer its type based on the use later on in the method similar to F#’s inferred types, but this isn’t slated to be in the current release.
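A minimal sketch of both declaration styles, using int.TryParse and hypothetical wrapper names:

```csharp
public static class OutVariableSketch
{
    // The out variable is declared inline with an explicit type.
    public static int ParseOrZero(string text)
    {
        if (int.TryParse(text, out int parsed))
        {
            return parsed;
        }

        return 0;
    }

    // The same idea with var; the type is inferred from the parameter.
    public static int ParseOrMinusOne(string text) =>
        int.TryParse(text, out var parsed) ? parsed : -1;
}
```

ParseOrZero("42") returns 42, and ParseOrMinusOne("abc") returns -1.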

compiler confused because of method overloading.

In Visual Studio 15 Preview 4, the out variable isn’t working exactly as it will in the final release. Wild cards will hopefully be added so extraneous variables don’t need to be declared.

The following code won’t work until the scope restrictions on out variables are updated. (They have said they intend on doing this before the release.)

In this example, the scope is limited to the method call where the strings are set. To get it to work currently, the variable scope must be extended like so:

The conditional statement wraps the variables and they can now be used in the Console.WriteLine. This will be corrected in the final release and won’t be necessary.

C# 7 Additions – Deconstructors

C# has a new type of method, the Deconstructor. When a type implements a method named Deconstruct, multiple variables may be assigned directly from an instance of it, much as they would be from a multi-value return.

The method must be named Deconstruct and have a return type of void. The parameters to be assigned must all be out parameters, and because they are out parameters with a return type of void, C# allows the Deconstruct method to be overloaded solely based on these parameters. This is how the new System.ValueTuple allows its properties to be assigned to separate variables without assigning each one individually.

Deconstruct also does not need to be directly attached to the class. C# allows the method to be implemented as an extension method as well.
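Both forms can be sketched together: an instance Deconstruct on one type and an extension Deconstruct on another. All type and member names here are hypothetical.

```csharp
public class PersonName
{
    public string First { get; }
    public string Last { get; }

    public PersonName(string first, string last)
    {
        First = first;
        Last = last;
    }

    // Instance deconstructor: void return type, out parameters only.
    public void Deconstruct(out string first, out string last)
    {
        first = First;
        last = Last;
    }
}

public class Temperature
{
    public double Celsius { get; }

    public Temperature(double celsius)
    {
        Celsius = celsius;
    }
}

public static class TemperatureExtensions
{
    // Extension deconstructor for a type that does not define its own.
    public static void Deconstruct(this Temperature temperature, out double celsius, out double fahrenheit)
    {
        celsius = temperature.Celsius;
        fahrenheit = temperature.Celsius * 9 / 5 + 32;
    }
}

public static class DeconstructSketch
{
    public static string FullName(PersonName person)
    {
        var (first, last) = person;
        return first + " " + last;
    }

    public static double ToFahrenheit(Temperature temperature)
    {
        var (celsius, fahrenheit) = temperature;
        return fahrenheit;
    }
}
```

DeconstructSketch.ToFahrenheit(new Temperature(100)) returns 212, with the values coming from the extension method.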

At the moment it is uncertain if wildcards will be added allowing unneeded variables to be omitted from being assigned. This addition would allow the insertion of the * to indicate a parameter is not needed (similar to _ in F#)

C# 7 Additions – Local Functions

In C# 7 it is now possible to create a function within a function termed a Local Function. This is for instances where a second function is helpful, but it’s not really needed in the rest of the class. It’s created just like regular functions except in the middle of another function.

Just like normal functions, you can create expression bodied members as well

Local variables in the outer functions are accessible, and it’s possible to embed local functions inside other local functions:
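A small sketch showing both points, a local function that uses a variable from the outer method and a second local function nested inside it. The names are hypothetical.

```csharp
public static class LocalFunctionSketch
{
    public static int SumOfSquares(int[] values)
    {
        int total = 0;

        // Local function: it can read and write total from the outer method.
        void AddSquare(int value)
        {
            // A local function nested inside another local function.
            int Square(int x) => x * x;

            total += Square(value);
        }

        foreach (int value in values)
        {
            AddSquare(value);
        }

        return total;
    }
}
```

SumOfSquares(new[] { 1, 2, 3 }) returns 14.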

So how does it work? Looking at the IL code, the compiler has converted the internal function into a private static one inside the class.

IL Code showing private static function

The name is generated at compile time, so it is not accessible to other methods, but it is still possible to access it through reflection with the private and static binding flags.

reflection shows local function.

Someone I know asked what would be a good use case of Local Functions vs. Lambdas. Lambdas can’t be iterators (they can’t contain yield return), and by encasing an enumeration in a local function, other parts of the outer method can be eagerly evaluated. For example, if you have a method which takes a parameter and returns an enumeration, the evaluation of the parameter won’t occur until the program starts to enumerate the collection. Encapsulating the enumeration in a local function allows the other parts of the outer function to be eagerly evaluated. You can find an example of the difference between using one and not using one here.
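This is not the linked example, only a minimal sketch of the pattern with hypothetical names. The argument check runs as soon as the method is called, while the yield return loop stays lazy inside the local function.

```csharp
using System;
using System.Collections.Generic;

public static class EagerValidationSketch
{
    public static IEnumerable<int> CountUp(int start, int count)
    {
        // Runs immediately, because the outer method is not an iterator.
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        return Iterate();

        // The iterator body only runs once the caller starts enumerating.
        IEnumerable<int> Iterate()
        {
            for (int i = 0; i < count; i++)
            {
                yield return start + i;
            }
        }
    }
}
```

If the yield return loop lived directly in CountUp, the exception would not surface until the first MoveNext call.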

C# 7 Additions – Tuples

In C# 7 Microsoft has introduced an updated Tuple type. It has a streamlined syntax compared to its predecessor, making it look more like F#. Instead of declaring it like previous versions, the new Tuple type looks like:

Likewise to declare it as a return type, the syntax is similar to declaring it:

The first thing to note about the new type is that it is not included automatically in a new project. If you immediately use it, you’ll see the following error.

As of VS 15 preview 4 (not to be confused with VS 2015), you must include the System.ValueTuple Nuget package to take advantage of it.

This raises the question of how the new Tuple type and the previous one, included since .NET 4, are related. They’re not. They are treated as two different types and are not compatible with each other. System.Tuple is a reference type and System.ValueTuple is a value type.

So what are the advantages over the previous version? The syntax is simpler, and there are several other advantages.

Named Properties

In the System.Tuple version, properties of the return object were referenced as Item1, Item2 etc. This gets confusing when there are multiples of the same type in the Tuple as you have to know what position had which value type.

Now it’s possible to explicitly name the item types to reduce confusion.

The Item properties (Item1, Item2, etc.) have also been included, allowing methods to be updated to the new type without breaking code based on its predecessor.

It’s also possible to explicitly name the values when creating the object:
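A sketch of named elements in a return type and at creation, with the Item properties still available. The names and values are hypothetical, and on older toolchains this assumes the System.ValueTuple package is referenced.

```csharp
public static class TupleSketch
{
    // Element names declared in the return type.
    public static (string First, string Last) GetName() => ("Jenny", "Smith");

    public static string Describe()
    {
        var name = GetName();

        // Element names supplied when creating the tuple.
        var point = (X: 1, Y: 2);

        // Item2 refers to the same element as Last.
        return name.First + " " + name.Item2 + " " + (point.X + point.Y);
    }
}
```

Describe() returns "Jenny Smith 3".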

Deconstruction

It is now possible to name and assign variable values upon creating (or returning) a tuple. Although not necessary, it reduces the amount of code necessary to pull values out of the type.
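As a short sketch (hypothetical names and values), a returned tuple can be split straight into local variables:

```csharp
public static class TupleDeconstructionSketch
{
    public static (string First, string Last) GetName() => ("Jenny", "Smith");

    public static string Initials()
    {
        // Two locals are declared and assigned in one statement.
        var (first, last) = GetName();

        return first.Substring(0, 1) + last.Substring(0, 1);
    }
}
```

Initials() returns "JS".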

It’s not certain if C# will get wildcards like F# to automatically discard values which aren’t needed. If they are allowed then it’s possible to only create a variable for the name like so:

Updating Values

System.Tuple is immutable.  Once created it’s not possible to update any of the values.  This restriction has been removed in the new version.  From a purely functional perspective this could be considered a step backwards, but in C# many people find this approach more forgiving and beneficial.

Like all value types, when it is passed into a method, a copy of the tuple is created, so modifying it in the method does not affect the original.

However if you compare two different tuples and they have the same values, the Equals method compares the values in each and if they are all equal, it considers them equal.
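The three behaviors (mutable elements, copy on pass, value-based Equals) can be sketched together. The names and values below are hypothetical.

```csharp
public static class TupleValueSketch
{
    // Receives a copy, so this assignment never reaches the caller.
    private static void Rename((string First, string Last) name)
    {
        name.First = "changed";
    }

    public static string MutateCopy()
    {
        var name = (First: "Jenny", Last: "Smith");

        Rename(name);

        // Elements can be updated in place after creation.
        name.Last = "Jones";

        return name.First + " " + name.Last;
    }

    public static bool SameValuesAreEqual()
    {
        var a = (1, "one");
        var b = (1, "one");

        // Equals compares element by element.
        return a.Equals(b);
    }
}
```

MutateCopy() returns "Jenny Jones", and SameValuesAreEqual() returns true.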

Integrations with Other Languages

Unfortunately, C#’s new tuple type doesn’t automatically allow it to translate tuples from F#.

F# can’t deconstruct the values like it can with its native tuples, and to return it, you have to explicitly instantiate the object type and add the values.

Either way, the translation to F# isn’t horrible as it acts like any other object passed to it by C#.