Things you Should do with Strings While Your Coworkers are on Holiday and No One is Checking the Production Code Branch

Hi everyone! This is part of the really cool new CS Advent Calendar run by Matthew Groves! Go check out all the really great articles by everyone!

In a non infrequent basis, interviewers ask the question, “What is a string?” and they are looking for a quick answer similar to, “It is an immutable reference type.” This normally, sparks follow up questions such as, “explain what immutable means in this scenario,” or “so are there any examples where you can change a string?” The most common answers is, “No,” and with good reason. Adding two Strings together creates a new third String. Calling methods like ToUpper() doesn’t modify the one being operated on. It creates a new string, and although strings can be treated like an array of characters, the compiler prevents the modification of those characters in their specific positions.

Technically, the more correct answer is, “It depends.” Under most circumstances, it is not possible by design, and rightfully so. There are several factors dealing with efficiency and predictability that rely on this fundamental idea, but this doesn’t encompass the “allow unsafe code” compiler option. This is in a sense cheating, as it goes against established ideas of how most .NET applications work, but with this, it is possible to mutate a string using the fixed statement, and exploring it exposes some interesting behaviors of the .NET runtime.

To elucidate this, I created an assembly project and a unit test project to show various scenarios using the fixed statement and what happens. In these examples, the unit tests don’t actually test for validity. They merely bootstrap the test methods and print the results.

So what is happening with this code? The first necessity is to understand what the fixed statement does. According to the C# Language Reference:

The fixed statement sets a pointer to a managed variable and “pins” that variable during the execution of the statement. Without fixed, pointers to movable managed variables would be of little use since garbage collection could relocate the variables unpredictably. The C# compiler only lets you assign a pointer to a managed variable in a fixed statement.

With the fixed statement, it is possible to change a string in place which breaks its concept of immutability. The unit test:

  • prints the public readonly string “Bah Humbug!!!!!”
  • runs the method which alters that string
  • prints the same string which is now “Happy Holidays!”
  • show output of unit test.

    Now what happens when a local string is modified that is the exact same as the class level string?

    At first glance, the local string (localSeasonsGreetings) should be modified, and the class level string (SeasonsGreetings) should be unchanged.

    In this example, the unit test runs the method which prints out the values of the local string and the class level string, and then the unit test prints out the value of the class level string.

    copy of string results

    The local string is modified, and the class level string is also changed. Why did this happen? The answer lies in String Interning. When a literal string becomes accessible by the program, it is checked against the intern pool (a table which houses a unique instance of each literal string or ones that have been programmatically added). If the literal already exists within that table, a reference to the string in the table is returned instead of creating a new instance. Since the two string entries in the example are the same (Bah Humbug!!!!!), the runtime actually creates one reference for both of them, and hence, when one is modified, the other is affected.

    So what happens if we piece together the string at runtime from two constants?

    Notice in the example code above, the localSeasonsGreetings literal is changed to:

    mutate pieced together copy of string.  Shows different results.

    Since the local variable instance of Bah Humbug!!!!! was created when the method was run (and is not a literal), the CLR created a new instance of this string. When this local instance was modified, the class level variable instance was not differing from the previous example.

    What happens when the same string value is in different assemblies?

    Mutate local copy of bah humbug value in test assembly.  Is modified.

    Based on the previous examples, it works how you would expect it to. Since String Interning is controlled by the CLR and not during compile time, which assembly the string is located in doesn’t matter. All literals loaded into memory are added to the same pool, so modifying the value in one assembly affects all other instances in the entire application.

    Up until this point, we’ve only seen the effects of String Interning on instances of a string. What happens if we return a literal from a static method? To test this, I added a method to return “Bah Humbug!!!!!” to the ImmutableStringsExample.

    static method single call - no change

    The static method was called after the modification method ran, and it did not change. We could assume that since the method creates a new string instance, and the static method after we modified the interned “Bah Humbug!!!!!” string reference that it couldn’t find it and created a new instance. Now the question is, “Is this method deterministic?” Will this method always return a new instance of “Bah Humbug!!!!!!”?

    call static method twice change

    Clearly the answer is no. The time when the application calls the static method, determines its behavior. Now what happens with a non-static method? Are the same methods in different objects the same?

    Second instantiated object - different

    Non static methods work the same as static ones in this regard. Once ran, the CLR will make updates and return a reference to the same object.

    With the above examples, we see that Strings in .NET are really a lot more complicated than they initially let on. The runtime handles a lot of complicated optimizations, and there is a lot of work that goes on behind the scenes to ensure that efficiency. With those efficiencies come certain restrictions, such as immutability, but in the whole scope, those small restrictions can be managed and used to benefit the application.

    The code for this post can be found on GitHub

    Are you Null?

    Within the last couple of days Microsoft released a proposed update for the next major release of C# version 8.  Over the past several years, there has been a large debate on the existence and use of null in software development.  Allowing null has been heralded as the billion dollar mistake by the null reference inventor, Sir Tony Hare. With this, Microsoft has decided to help the C# community by adding functionality to the C# compiler to help point out where a null reference might occur.

    With the release of C# 8, anything referencing an object (string, etc.) must explicitly declare itself as possibly being null, and if that variable isn’t explicitly checked before being used, the compiler generates a warning that a possible null reference might occur. So how does this work? By using the ? at the end of a reference type, it signifies the developer acknowledges null might occur.

    This looks like it would be a breaking change, and all code written in a previous version will suddenly stop compiling. This would be true except for two things.

    1. You must use a compiler flag to enforce the rule.
    2. The flag will only generate warnings not errors.

    So legacy code is safe in the upgrade process if it’s too difficult to convert.

    With this, they are still working out a number of scenarios that prove tricky to handle. These are things like default array initialization (new string[2]). Their comments about all of these can be found on their blog on MSDN

    I’ve added their code examples below of edge cases they are still working on:

    Personally, I hoped the compiler would enforce these rules a little stronger. Some languages like F# strictly enforce variable immutability unless explicitly allowed, and other functional languages do not allow it at all.

    It is possible to turn on “Warnings as errors” and have the compiler stop if it encounters a possible null exception, but this assumes the rest of the code has no other warnings that won’t stop compilation. Ideally, no warning flags should ever appear in code without being fixed, but that is a very difficult standard follow when dealing with legacy code from years past where no one followed that rule before you. Either way, the C# team was in a tight situation, and they did the best they could. They needed to make strides towards making null references easier to track, but they couldn’t break all of the legacy code using previous versions of C#.

    Functional Languages in the Workplace

    On a semi regular basis, people question why I choose to use F# to implement projects. They question why use a lesser known language when one like C# has a larger developer pool and is more widely documented. I explain to them my rational behind it, siting personal experience, and documented cases about others success stories as well. There is significant evidence showing functional languages can reduce commonly occurring defects due to their inherent nature of immutability provide easier support for scalability, and have a stronger type system allowing for more expressive code. There are numerous testimonials on the use of functional languages and their benefit, but after hearing all of this, they are still doubtful about even considering a change. Assuming this evidence is correct, the question of “Why isn’t this a serious choice for the majority of organizations?” continues to appear.
    During discussions about switching to a functional language, I repeatedly hear the several common questions and arguments for resisting change. Most of these embody fear, uncertainty, and doubt. Several can be applied to moving to any technology, and although they should be considered, they are nothing which cannot be overcome. Here are my responses to the most common arguments against change I receive.

    Our code is already written in language X, and it will be hard to make a change

    There will always be legacy code, and it probably deviates from the standards used today. Was it written in a previous version of the currently used language? Does it contain libraries that are no longer supported? Was it written in such a way that converting it to current standards is difficult or impossible? If the answers these questions is yes, that doesn’t mean that other projects suffer the same fate.
    Legacy code can’t hold you back from technological advancements, and it most likely doesn’t now. Over the last several years many software vendors have made sweeping changes to languages and technologies leaving them looking only vaguely like what they did when first created. The introduction of Generics, the inclusion of Lambda Expressions, and asynchronous additions made huge advancements in several different languages and greatly changed common approaches to solving problems. These enormous changes didn’t stop organizations from modernizing their many of their applications to take advantage of new features even though code written with them is radically different than in previously created applications.
    Radical shifts in technology happen all the time, and almost every organization shifts its strategies based on trends in the industry. Organizations which defer changes to their current approach often find greater difficulty in migrating the longer they wait due to the fact that they continue to implement solutions using their current approach. Mindlessly shifting from one approach to another is never a wise decision. That introduces chaos, but neglecting trying new approaches due to legacy concerns can only end in repeating the same mistakes.

    Our developers don’t know language Y. It will be too hard and costly for them to learn and migrate.

    A developer’s job is to learn every day. There are new features to understand, new architecture patterns to master, and new languages to learn. The list is endless. The belief that at any stage in one’s career the road to deeper understanding ends, is myopic and ultimately an exit ramp to another profession or a stagnant career. Developer’s should be challenged. Organizations should push their staff to understand new things, and compared to the opportunity cost of repeating the same mistakes, the amount of time and money required to train people is often negligible, especially with tools like books, video learning, computer base training, etc.
    There are some people that have no desire to continue learning, and that’s ok. New development isn’t for everyone, and going back to the previous point, there are always applications in need of support that won’t or can’t be converted. Organizational migration to a technology is almost never an all or nothing approach, and some solutions should be left exactly how they are, because of the cost of converting them will outweigh the benefits. There will be people to maintain those in the long term, and these solutions cannot be the bedrock against advancing how other projects progress.

    What if we fail and we are stuck with a language we can’t use?

    If an organization takes the leap of faith and switches to a functional language what is the probability of some failure during the process? The initial answer is, 100%. Everyone fails every day at something. Failure is inevitable. With this in mind, you’re already failing at something, so the question is what are you going to do to try and fix it? You’re going to create other problems too, but with planning, retrospective analysis, and learning from those mistakes, those will be solved as well, but ultimately the position you end at will be further along than where you started.
    A few years ago, I had a discussion with an organization about their development practices. They were extremely adept at knowing where their time was allocated: support, feature enhancements, refactoring, etc. When asked about their breakdown, they explained on average 30% of their time went to fixing production defects from previous releases. They were perplexed about why they were missing deadlines despite becoming stringent on code quality. I asked about their plan to fix it, and they responded with a few ideas, but their final answer distilled to, “write better code.” When confronted with the question, “What are you going to change?” they said, “Nothing. Changing the development process is too time consuming and costly. If we update our practices, we’ll fall further behind on our releases.” The definition of insanity is doing the same thing and expecting a different result, yet several organizations believe they can break the cycle simply by standing still. If changing how an organization develops isn’t feasible, then changing what they develop with is one of the only few viable options remaining. It is much easier to change a technology than it is to change an ingrained culture, which is exactly why using languages and tools that enforce practices which reduce errors is a much more efficient approach than convincing everyone to work in a certain way.
    Most organizations resistant to change perceive technology migrations as a revolutionary approach. They firmly believe all use of a certain technology immediately stops and the new one begins, because it is much easier to think in terms of black and white (one vs. the other) when change is a rare and uncomfortable occurrence. Change to anything new should be a cautious approach and take practice. It should be evolutionary. Organizations should try several smaller variations of an approach, learning from each and refining their ideas on gradually larger projects. Embracing a adaptation and “failure leads to a stronger recovery” approach ultimately leads to a better outcome.
    It is almost certain moving from to a functional language from an unrelated paradigm is going to be difficult and confusing, but the fault does not lay to the language itself. As with anything new, the concepts are unfamiliar to those starting to use it. There will be mistakes during the learning process, and some projects will probably take longer than expected, but basing the long-term benefits on the first attempt to implement anything will show biased result against it, and with time moving to an approach which aids developers to make fewer mistakes and write better and cleaner code will save both time and money.

    It’s not widely used enough for us to find people to support it

    My coworker recently attended two meetups concerning functional programming, each having approximately 25 attendees. After the first one, he decided to do an experiment at the second. He asked people at the meetup, “How many of you use a functional language at work?” and the result was astounding. Only one person admitted to it, and it was only part time. At a minimum, there are 25 people at each location that are excited enough about functional programming to attend a meetup on their own time on a topic which has nothing to do with the tools they use at work, and these people are only a representation of the larger workforce. There are many others that were either unable to attend, or were unaware of the event.
    There is almost no place in the United States that isn’t a competitive market for development staff. Large companies are able to pay higher rates and have better benefits which means they will pull the majority of the highest qualified candidates. Smaller organizations can’t offer the enormous benefits packages placing them in a difficult situation to fill needed positions. Picking a technology where there are fewer people to fill the role would seem to place those organizations at a disadvantage, but this is actuality in comparison to overall demand for those type of people. Looking solely at the number of potential applicants, the pool of functional programmers is smaller, but organizations using functional languages aren’t nearly as widespread, so they suffer less completion when searching for candidates. Furthermore, assuming the statistics surrounding the benefits of functional languages are correct, organizations will require fewer programmers accommodating the constraint of a smaller pool of applicants.


    Functional languages can be an excellent fit for organizations, both ones starting development and others which have been established for a considerable length of time. Most resilience in using them comes from misunderstanding the benefits compared to the cost of changing languages. It is neither difficult nor time consuming to attempt to better the development process by focusing on tools to better aid the process.

    Quirks with Pattern Matching in C# 7

    With C# 7, Microsoft added the concept of pattern matching by enhancing the switch statement. Compared to functional languages (both pure and impure), this seems to be somewhat lacking in a feature by feature comparison, however it is still nice in allowing a cleaner format of code. With this, there are some interesting quirks, that you should be aware of before using. Nothing they’ve added breaks existing rules of the language, and with a thorough understanding how the language behaves their choices make sense, but there are some gotchas that on the surface looks like they should function one way, but act in a completely different manner.

    Consider the following example.


    C# 7 now allows the use of a switch statement to determine the type of a variable. It as also expanded the use of is to include constants including null.

    is can show if something is null : shows true

    With these two understandings, which line executes in the following code?

    Shows default code executed.

    Based on the previous examples, its a reasonable conclusion that the one of the first two case statements would execute, but they don’t.

    The is operator

    The is operator was introduced in C# 1.0, and its use has been expanded, but none of the existing functionality has changed. Up until C# 7, is has been used to determine if an object is of a certain type like so.

    This outputs exactly as expected. The console prints “True” (Replacing string with var works the exactly the same. Remember that the object is still typed. var only tells the compiler to figure out what type the variable should be instead of explicitly telling it.)

    Is Operator String: True

    What happens if the string is null? The compiler thinks its a string. It will prevent you from being able to pass it to methods requiring another reference type even though the value is explicitly null.

    Type is null

    The is operator is a run time check not a compile time one, and since it is null, the runtime doesn’t know what type it is. In this example, the compiler could give flags to the runtime saying what type it actually is even though it’s null, but this would be difficult if not impossible for all scenarios, so for consistency, it still returns false. Consistency is key.

    Printing out True and False is nice, but it’s not really descriptive. What about adding text to describe what is being evaluated.

    Is Type With Question, Question doesn't appear

    Why didn’t the question appear? It has to do with operator precedence. The + has a higher operator precedence than is and is evaluated first. What is actually happening is:

    This becomes clear if the clause is flipped, because the compiler doesn’t know how to evaluate string when using the + operator.

    Flipping clauses throws error.

    Adding parenthesis around the jennysNumber is string fixes the issue, because parenthesis have a higher operator precedence than the + operator.

    output of is operator and + flipped with parenthesis (shows both question and value)

    Pattern Matching with Switch Statements

    Null and Dealing with Types

    Null is an interesting case, because as shown during the runtime, it’s difficult to determine what type an object is.

    Base Example

    This code works exactly as how you think it should. Even though the type is string, the runtime can’t define it as such, and so it skips the first case, and reaches the second.

    Adding a type object clause works exactly the same way

    shows object case works same way

    What about var. Case statements now support var as a proposed type in the statement.

    If you mouse over either var or the variable name, the compiler will tell you what type it is.
    show compiler knows what type it is.

    Shows var case statement doesn't know type

    It knows what the type is, but don’t let this fool you into thinking it works like the other typed statements though. The var statement doesn’t care that the runtime can’t determine the type. A case statement with the var type will always execute provided there is no condition forbidding null values when (o != null). Like before, it still can’t determine the type inside the case statement statement.

    Why determine object type at compile time?

    At any point in time (baring the use of dynamic), the compiler knows the immediate type of the variable. It could use this to directly point the correct case concerning the type. If that were true, it couldn’t handle the following scenario, or any concerning inheritance of child types.

    shows is string

    Personally, I would like to see either a warning or an error, that it’s not possible for type cases to determine if the variable is null case string s when (s is null), but as long as the code is tested and developers knows about this edge case, problems can be minimized.

    All the examples can be found on github:

    C# 7 Additions – Pattern Matching

    C# 7 has started to introduce Pattern Matching. This is a concept found in functional programming, and although it isn’t fully implemented compared to F#, it is a step in that direction. Microsoft has announced they intend on expanding it in future releases.

    Constant Patterns

    The is keyword has been expanded to allow all constants on the right side of the operator instead of just a type. Previously, C#’s only valid syntax was similar to:

    Now it is possible to compare a variable to anything which is a constant: null, a value, etc.

    Behind the scenes, the is statement is converted to calling the Equals function in IL code. The following two functions produce roughly the same code (they call different overloads of the Equals function).



    This can also be combined with other features allowing variable assignment through the is operator.

    In Visual Studio Preview 4, the scoping rules surrounding variables assigned in this manner are more restrictive than in the final version. Right now, they can only be used within the scope of the conditional statement.

    Switch Statements

    The new pattern matching extensions have also extended and changed the use of case statements. Patterns can now be used in switch statements.

    Like in previous versions, the default statement will always be evaluated last, but the location of the other case statements now matter.

    In this example, case int n will never evaluate, because the statement above it will always be true. Fortunately, the C# compiler will evaluate this, determine that it can’t be reached and raise a compiler error.

    The variables declared in patterns behave differently than others. Each variable in a pattern can have the same name without running into a collision with other statements. Just as before, in order to declare a variable of the same name inside the case statement, you must still explicitly enforce scope by adding braces ({}).

    Pattern matching has a ways to go when compared to its functional language equivalent, but it is still a nice addition and will become more complete as the language evolves.

    C# 7 Additions – Literals

    A small, but nice chance in C# 7 is increased flexibility in literals. Previously, large numeric constants had no separator, and it was difficult to easily read a large number. For example, if you needed a constant for the number of stars in the observable universe (1,000,000,000,000,000,000,000), you’d have to do the following:

    If you hadn’t caught the error, the constant is too short, and it’s difficult to tell looking at the numbers without a separator. In C# 7, it’s now possible to use the underscore (_) in between the numbers. So the previous example now becomes much easier to read, and it is easily recognizable the number is off.

    The new version adds binary constants too. Instead of writing a constant in hex, or decimal, a constant can now be written like so:

    C# 7 Additions – ref Variables

    C# 7 expands the use of the ref keyword. Along with its previous use, it can now be used in return statements, and local variables can store a reference to the object as well. At first glance, the question is “What is the real difference between returning a ref variable, and setting it through an out parameter?” Previously you could set a variable passed into a function with ref (or out) to a different value. In C# 7, you can return the reference of a field, variable etc. and store that in a local variable for later use.

    The following is an examples showing its expanded use.

    As expected, the PersonInformation object is passed into the GetName function which returns a reference to the string field Name. This is then passed into the MakeCapitalized function which capitalizes the name “jenny” (making it “Jenny”) in the original PersonInformation object. Compare this to the example here showing how the previous version of C# would not allow the modification of the original field in the same scenario.

    Classes vs Structs

    If the PersonInformation is changed to be a struct (value type) instead of a class (reference type), the following code won’t work without a slight modification, but it is still completely possible.

    Structs are passed by value meaning that passing a struct into a method creates a copy of it. Returning a reference to the struct’s field would return a reference to the copied struct and would go out of scope as soon as the method completes. There would be no point, and it would cause errors pointing to fields to objects which didn’t exist.


    With these new features there are some restrictions to it. Consider this. A string can be treated as an array of characters. With the new functionality, it should be possible to pass back a reference to a character location in that string and update it, because you have the reference to the character location in the string.

    Fortunately, this isn’t allowed. The compiler prevents from it being a valid option, because if this were possible, it would break the string’s immutability and cause havoc with C#’s ability to intern strings.
    ref string not allowed.

    The compiler is also smart enough to not allow references to variables which fall out of scope. The following is also not allowed:

    After the method exits someNumber no longer exists, and when another part of the application tries to access it, it won’t be available. (You could say this might not be the case if it were a reference type like a string, but it still wouldn’t matter, because all the reference has is a location to where the object is, not the actual object itself. This causes 2 problems: One, currently there is no way to get the value from the reference. Two, the object isn’t rooted, so it could still be garbage collected at any point in time.)

    The compiler is also smart enough to trace the variable use through the calling methods. This is also not allowed:

    C# 7 Additions – Out Variables

    C# 7 removes the need for out variables to be predeclared before passing them into a function.

    It also now allows the use of the var keyword to declare the variable type, because the compiler will infer the type based on the declared parameter type. This is not allowed when the compiler can’t infer the type because of method overloading. It would be nice if the compiler would attempt to infer it’s type based on the use later on in the method similar to F#’s inferred types, but this isn’t slated to be in the current release.

    compiler confused because of method overloading.

    In Visual Studio 15 Preview 4, the out variable isn’t working exactly as it will in the final release. Wild cards will hopefully be added so extraneous variables don’t need to be declared.

    The following code won’t work until the scope restrictions on out variables is updated. (They have said they intend on doing this before the release.)

    In this example, the scope is limited to the method call where the strings are set. To get it work currently, variable scope must be extended and can be like so:

    The conditional statement wraps the variables and they can now be used in the Console.WriteLine. This will be corrected in the final release and won’t be necessary.

    C# 7 Additions – Local Functions

    In C# 7 it is now possible to create a function within a function termed a Local Function. This is for instances where a second function is helpful, but it’s not really needed in the rest of the class. It’s created just like regular functions except in the middle of another function.

    Just like normal functions, you can create expression bodied members as well

    Local variables in the outer functions are accessible, and it’s possible to embed local functions inside other local functions:

    So how does it work? Looking at the IL code, the compiler has converted the internal function into a private static one inside the class.

    IL Code showing private static function

    The name is generated at compile time, so it is not accessible to other methods, but it is still possible to access it through reflection with the private and static binding flags.

    reflection shows local function.

    Someone I know asked what would be a good use case of Local Functions vs. Lambdas. Lambdas can’t contain enumerators, and by encasing an enumerations in a local function it allows others parts of the outer method to be eagerly evaluated. For example, if you have a method which takes a parameter and returns an enumeration, the evaluation of the parameter won’t occur until program starts to enumerate the collection. Encapsulating the enumeration in a local function allows the other parts of the outer function to be eagerly evaluated. You can find an example of the difference between using one and not using one here.

    It’s OK, My eval is Sandboxed (No It’s Not)

    The idea of using eval has always been in interesting debate. Instead of writing logic which accounts for possibly hundreds of different scenarios, creating a string with the correct JavaScript and then executing it dynamically is a much simpler solution. This isn’t a new approach to programming and is commonly seen in languages such as SQL (stored procedures vs. dynamically generating statements). On one hand it can save a developer an immense amount of time writing and debugging code. On the other, it’s power is something which can be abused because of it’s high execution privileges in the browser.

    The question is, “should ever be used?” It technically would be safe if there is a way of securing all the code it evaluates, but this limits its effectiveness and goes against its dynamic nature. So with this, is there a balance point where using it is secure, but also flexible enough to warrant the risk?

    For example purposes, we’ll use the following piece of code to show the browser has been successfully exploited: alert(‘Be sure to drink your Ovaltine.’); If the browser is able to execute that code, then restricting the use of eval failed.

    In the most obvious example where nothing is sanitized executing the alert is trivial:

    eval will treat any input as code and execute it. So what if eval is restricted to only execute which will correctly evaluate to a complete statement?

    Nope, this still successfully executes. In JavaScript all functions return something, so calling alert and assigning undefined to total is perfectly valid.

    What about forcing a conversion to a number?

    This still executes also, because the alert function fires when it is parsed and its return value is converted to a string and then parsed.

    The following does stop the alert from firing,

    But this is rather pointless, because eval isn’t necessary. It’s much easier to assign the value to the total variable directly.

    What about overriding the global function alert with a local function?

    This does work for the current scenario. It overrides the global alert function with the local one but doesn’t solve the problem. The alert function can still be called explicitly from the window object itself.

    With this in mind, it is possible to remove any reference to window (or alert for that matter) in the code string before executing.

    This works when the word ‘window’ is together, but the following code executes successfully:

    Since ‘win’ and ‘dow’ are separated, the replacement does not find it. The code works by using the first eval to join the execution code together while the second executes it. Since replace is used to remove the window code, it’s also possible to do the same thing to eval like so:

    That stops the code from working, but it doesn’t stop this:

    It is possible to keep accounting for different scenarios whittling down the different attack vectors, but this gets extremely complicated and cumbersome. Further more, using eval opens up other scenarios besides direct execution which may not be accounted for. Take the following example:

    This code bypasses the replace sanitations, and it’s goal wasn’t to execute malicious code. It’s goal is to replace the JSON.parse with eval and depending on the application might assume that malicious code is blocked, because JSON.parse doesn’t natively execute rogue code.

    Take the following example:

    The code does throw an exception at the end due to invalid parsing, but that isn’t a problem for the attacker, because eval already executed the rogue code. The eval statement was used to perform a lateral attack against the functions which are assumed not to execute harmful instructions.

    Server Side Validation

    A great extent of the time, systems validate user input on the server trying to ensure harmful information is never stored in the system. This is a smart idea, because removing before storing it tries to ensure everything accessing potentially harmful code doesn’t need to make certain it isn’t executing something it shouldn’t (you really shouldn’t and can’t make this assumption, but it is a good start in protecting against attacks). With eval, this causes a false sense of security, because code like C# does not handle strings the same way that JavaScript does. For example:

    In the first example, the C# code successfully removed the word ‘window’, but in the second, it was unable to interpret this when presented with Unicode characters which JavaScript interprets as executable instructions. (In order to test the unicode characters, you need to place an @ symbol in front of the string so that it will treat it exactly as it is written. Without it, the C# compiler will convert it.) Worse yet, JavaScript can interpret strings which are a mixture of text and Unicode values making it more difficult to search and replace potentially harmful values.

    Assuming the dynamic code passed into eval is completely sanitized, and there is no possibility of executing rogue code, it should be safe to use. The problem is that it’s most likely not sanitized, and at best it’s completely sanitized for now.