Quirks with Pattern Matching in C# 7

With C# 7, Microsoft added the concept of pattern matching by enhancing the switch statement. Compared to functional languages (both pure and impure), this seems to be somewhat lacking in a feature by feature comparison, however it is still nice in allowing a cleaner format of code. With this, there are some interesting quirks, that you should be aware of before using. Nothing they’ve added breaks existing rules of the language, and with a thorough understanding how the language behaves their choices make sense, but there are some gotchas that on the surface looks like they should function one way, but act in a completely different manner.

Consider the following example.

Shows

C# 7 now allows the use of a switch statement to determine the type of a variable. It as also expanded the use of is to include constants including null.

is can show if something is null : shows true

With these two understandings, which line executes in the following code?

Shows default code executed.

Based on the previous examples, its a reasonable conclusion that the one of the first two case statements would execute, but they don’t.

The is operator

The is operator was introduced in C# 1.0, and its use has been expanded, but none of the existing functionality has changed. Up until C# 7, is has been used to determine if an object is of a certain type like so.

This outputs exactly as expected. The console prints “True” (Replacing string with var works the exactly the same. Remember that the object is still typed. var only tells the compiler to figure out what type the variable should be instead of explicitly telling it.)

Is Operator String: True

What happens if the string is null? The compiler thinks its a string. It will prevent you from being able to pass it to methods requiring another reference type even though the value is explicitly null.

Type is null

The is operator is a run time check not a compile time one, and since it is null, the runtime doesn’t know what type it is. In this example, the compiler could give flags to the runtime saying what type it actually is even though it’s null, but this would be difficult if not impossible for all scenarios, so for consistency, it still returns false. Consistency is key.

Printing out True and False is nice, but it’s not really descriptive. What about adding text to describe what is being evaluated.

Is Type With Question, Question doesn't appear

Why didn’t the question appear? It has to do with operator precedence. The + has a higher operator precedence than is and is evaluated first. What is actually happening is:

This becomes clear if the clause is flipped, because the compiler doesn’t know how to evaluate string when using the + operator.

Flipping clauses throws error.

Adding parenthesis around the jennysNumber is string fixes the issue, because parenthesis have a higher operator precedence than the + operator.

output of is operator and + flipped with parenthesis (shows both question and value)

Pattern Matching with Switch Statements

Null and Dealing with Types

Null is an interesting case, because as shown during the runtime, it’s difficult to determine what type an object is.

Base Example

This code works exactly as how you think it should. Even though the type is string, the runtime can’t define it as such, and so it skips the first case, and reaches the second.

Adding a type object clause works exactly the same way

shows object case works same way

What about var. Case statements now support var as a proposed type in the statement.

If you mouse over either var or the variable name, the compiler will tell you what type it is.
show compiler knows what type it is.

Shows var case statement doesn't know type

It knows what the type is, but don’t let this fool you into thinking it works like the other typed statements though. The var statement doesn’t care that the runtime can’t determine the type. A case statement with the var type will always execute provided there is no condition forbidding null values when (o != null). Like before, it still can’t determine the type inside the case statement statement.

Why determine object type at compile time?

At any point in time (baring the use of dynamic), the compiler knows the immediate type of the variable. It could use this to directly point the correct case concerning the type. If that were true, it couldn’t handle the following scenario, or any concerning inheritance of child types.

shows is string

Personally, I would like to see either a warning or an error, that it’s not possible for type cases to determine if the variable is null case string s when (s is null), but as long as the code is tested and developers knows about this edge case, problems can be minimized.

All the examples can be found on github: https://github.com/kemiller2002/StructuredSight/tree/master/PatternMatchingQuirks_Standard

Compile Time Stored Procedures

(code for this project can be found here)

One of the largest problems with interacting with databases is the lack of compile time assurance that the application code and the SQL code interact correctly. Testing helps with finding the errors, but as a system grows in size and complexity it becomes both difficult and costly to find every mistake. There are pre-made solutions such as F# Type Providers or Entity Framework, but these present their own challenges such as not having the project in F# or avoiding the heavy handedness of Entity Framework.

Fortunately SQL server provides enough information about Stored Procedures in the Metadata tables that makes it possible to create a from scratch compile time safe mapping between stored procedures and application code. Combining this with
T4 Text Templates it’s possible to automatically generate all the code necessary to handle stored procedure inputs and returns.

The first part is to pull all the stored procedure data from the sys.procedures view.

I remove all the procedures starting with sp_ as this prefix is meant to designate that it is a system procedure for use by SQL Server. The three important pieces from the table are the Procedure Name (p.name), Object Identifier (Object_Id), and the Schema Name (s.name).

Select Sys Procedures

Armed with the object id it's now possible to pull all the data for a procedures parameters from the sys.parameters view.

With this and the procedure's name it's possible to construct code to call SQL Server stored procedures which is compile time safe. Simply generate a method which takes in the appropriate parameters to call the stored procedure. The last step is to generate a lookup table which translates the SQL datatypes to ones which are valid C# syntax.

Idenfitfing which parameters have a default value and be not passed to the procedure

Since stored procedure parameters can have default values and therefore aren't necessary to populate, it would be nice to be able to know which one's are available but aren't required. This is a little trickier. There is a field has_default_value which looks like it would tell you if the parameter has a default value and therefore is optional, but there is a problem in the SQL Server metadata table, and it is only correct for CLR parameters, and not SQL ones. In order to actually find out if parameter has a default value, there are two options.

  1. Use Sql Management Objects (SMO)
  2. Pull the stored procedure code using something like SELECT OBJECT_DEFINITION (PROCEDURE_OBJECT_ID)

Using SMO to pull the definition is actually relatively simple assuming you have access to the appropriate dlls from Sql Server Management Studio.

Code:

If you don’t have access to the SMO dlls, you can still use something akin to a regular expression (or some other parsing method) to pull the parameter information out from the Object_Definition. As with parsing anything, there are a number of gotchas to watch out for, but it is possible with a little bit of work.

User Defined Tables

Starting in SQL Server 2008 it is possible to pass tables as parameters into stored procedures. There is view called sys.table_types which lists all of the user defined tables and has the necessary columns to link it back to which stored procedures use it as a parameter, and retrieve its column definition.

Stored Procedure Output

To automatically generate C# code for SQL Server output, you’ll really need SQL Server 2012 and later for it to be effective. In that edition they added, the dm_exec_describe_first_result_set_for_object procedure

This procedure takes a stored procedure Object_Id and returns the column information concerning the first result set it finds.

Result Set

The great thing about this procedure is that not only can you generate classes which can automatically update their types and names based on a result set that changes during development, but the procedure will return nothing if the stored procedure is non functional because of a changes in the database. By running this this and having it map all the result sets, the ones which come up blank which should return data indicate the procedure will not work with the current schema and needs to be fixed. It quickly helps eliminate the difficulties of knowing which procedures will break during program execution.

Putting It All Together

With these procedure, all that is really necessary is something that opens a connection to the database, pulls the necessary data and create the appropriate C# files.

As an example and a quick starter I’ve created an example solution. The SQL metadata retrieval is in the Classes.tt file, and all the code C# generation is in the StoredProcedures.tt file. The database I used to generate the example code is found here, and uses the Adventure Works Database.

Language Comparison Euler Problem One

Question:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

F#

F# Example One

or

F# Example Two

C#

C# Example One

or

C# Example Two

JavaScript

JavaScript Example One

SQL

or

PowerShell

or

Non Failing Scripts

One of the most frustrating parts of database development is double applying scripts. If you are working directly on the production server, this problem is alleviated by making alterations to the SQL commands along the way. With doing development in a test environment first this becomes much more difficult. If you are creating a new stored procedure, the first time that it’s run you must use the CREATE command, and subsequently the ALTER command every time after. This makes handing the scripts off to a database team to run on the production servers difficult, because you have to account for the environment you are developing becoming out of sync with the environment your scripts will ultimately be run in.

Ideally, you could refresh your database environment to mirror production frequently to keep synchronicity, but in reality this is time consuming and many times not possible. The key becomes to write scripts so they can be applied to any environment either once or multiple times without causing errors. In my previous post I talked about how to use generators to make data scripts re-runnable without accidentally double inserting records. Using SQL Server’s system tables it’s relatively easy to do the same for tables, columns, stored procedures etc.

For stored procedures, most places just prepend a DROP PROCEDURE command and then create the stored procedure each time. The problem with this is that it removes permissions, and it’s easy to miss recreating them. I prefer to create a dummy procedure immediately before, and then using the alter command to change it to what it should be. If the procedure is already in the environment, it skips the step and immediately moves to the alter portion of the script.

The same thing can be done with adding or dropping columns

Virtually everything in MSSQL Server can be queried by the system tables:

  • views : sys.Views
  • constraints: sys.SysConstraints
  • schemas: sys.Schemas
    (you can link to this in your queries if your tables etc. aren’t under the default)

These are just a few of the system tables for querying data. If you need additional information about a database object, you can always link back to sys.objects. For Example, sys.SysConstraints doesn’t contain the constraint name, so to get it, all you need do is create a join.

The Book Of F#

The Book of F#

The book comes out this month, by Dave Fancher F# MVP. It’s a great book! It has an easy to read narrative and a fun style, and is designed for .Net Object Oriented people who want an introduction to both functional programming and to learn F#. It’s a great reference book and learning tool. Even if you don’t want to program F#, the functional approach to programming will teach you a lot, with great examples and easy to read explanations. I can honestly say the book never gets old. (I was the technical reviewer and got to read each chapter about 4 times. Think about how many technical books you would read 4 times without contemplating putting a fork in your eye. Honestly, with his I didn’t, and the inside sci-fi references are fun too!)

On top of this, the introduction is awesome, and it explains exactly why you want to learn functional programming and F#.

The book can be found on the No Starch Press site.