Hi everyone! This is part of the really cool new CS Advent Calendar run by Matthew Groves! Go check out all the really great articles by everyone!

Not infrequently, interviewers ask the question, “What is a string?” and they are looking for a quick answer similar to, “It is an immutable reference type.” This normally sparks follow-up questions such as, “Explain what immutable means in this scenario,” or “So are there any examples where you can change a string?” The most common answer is, “No,” and with good reason. Adding two strings together creates a new third string. Calling methods like ToUpper() doesn’t modify the string being operated on; it creates a new one. And although strings can be treated like an array of characters, the compiler prevents the modification of those characters in their specific positions.

    //Does not work
    public void TreatStringAsChars()
    {
        //Cannot be assigned to. It is read only.
        SeasonsGreetings[0] = 'B';
    }
    //Does not work
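The safe-code behavior described above can be checked with reference comparisons. This is only a minimal sketch: the class and method names (ImmutabilityDemo, ToUpperLeavesOriginalAlone, ConcatenationCreatesThirdString) are made up for illustration and are not part of the project for this post.

```csharp
using System;

// Hypothetical demo class; the names here are illustrative only.
public static class ImmutabilityDemo
{
    // True when ToUpper() returns a different object and the source text is untouched.
    public static bool ToUpperLeavesOriginalAlone()
    {
        string greeting = "Bah Humbug";
        string shouted = greeting.ToUpper();
        return !ReferenceEquals(greeting, shouted) && greeting == "Bah Humbug";
    }

    // True when concatenating two variables yields a third object
    // and neither operand has changed.
    public static bool ConcatenationCreatesThirdString()
    {
        string left = "Bah ";
        string right = "Humbug";
        string combined = left + right;
        return !ReferenceEquals(combined, left)
            && !ReferenceEquals(combined, right)
            && left == "Bah "
            && right == "Humbug";
    }

    public static void Main()
    {
        Console.WriteLine(ToUpperLeavesOriginalAlone());
        Console.WriteLine(ConcatenationCreatesThirdString());
    }
}
```

Both checks should print True, since each operation hands back a new string object instead of editing an existing one.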

Technically, the more correct answer is, “It depends.” Under most circumstances, it is not possible by design, and rightfully so. There are several factors dealing with efficiency and predictability that rely on this fundamental idea, but this doesn’t encompass the “allow unsafe code” compiler option. This is in a sense cheating, as it goes against established ideas of how most .NET applications work, but with this, it is possible to mutate a string using the fixed statement, and exploring it exposes some interesting behaviors of the .NET runtime.

To elucidate this, I created an assembly project and a unit test project to show various scenarios using the fixed statement and what happens. In these examples, the unit tests don’t actually test for validity. They merely bootstrap the test methods and print the results.

    public class ImmutableStringsExample
    {
        public readonly string SeasonsGreetings = "Bah Humbug!!!!!";

        public unsafe void MutateSeasonsGreetingsString()
        {
            var happyHolidays = "Happy Holidays!";

            fixed (char* seasonsGreetingsLocation = SeasonsGreetings)
            {
                for (var ii = 0; ii < happyHolidays.Length; ii++)
                {
                    seasonsGreetingsLocation[ii] = happyHolidays[ii];
                }
            }
        }
    }

    [TestMethod]
    public void MutateSeasonsGreetingsString()
    {
        var immutableStrings = new ImmutableStringsExample();

        Console.WriteLine(immutableStrings.SeasonsGreetings);

        immutableStrings.MutateSeasonsGreetingsString();

        Console.WriteLine(immutableStrings.SeasonsGreetings);
    }

So what is happening with this code? The first necessity is to understand what the fixed statement does. According to the C# Language Reference:

The fixed statement sets a pointer to a managed variable and “pins” that variable during the execution of the statement. Without fixed, pointers to movable managed variables would be of little use since garbage collection could relocate the variables unpredictably. The C# compiler only lets you assign a pointer to a managed variable in a fixed statement.

With the fixed statement, it is possible to change a string in place which breaks its concept of immutability. The unit test:

prints the public readonly string “Bah Humbug!!!!!”

runs the method which alters that string

prints the same string which is now “Happy Holidays!”

Now what happens when we modify a local string that has exactly the same value as the class-level string?

    public unsafe void MutateCopyofSeasonsGreetings()
    {
        string localSeasonsGreetings = "Bah Humbug!!!!!";

        Console.WriteLine
            ($"Local Seasons Greetings: {localSeasonsGreetings}");
        Console.WriteLine
            ($"Class Variable Seasons Greetings: {SeasonsGreetings}");
        Console.WriteLine
            ($"Are the two variables equal: " +
            $"{localSeasonsGreetings.Equals(SeasonsGreetings)}");

        var happyHolidays = "Happy Holidays!";

        fixed (char* seasonsGreetingsLocation = localSeasonsGreetings)
        {
            for (var ii = 0; ii < happyHolidays.Length; ii++)
            {
                seasonsGreetingsLocation[ii] = happyHolidays[ii];
            }
        }

        Console.WriteLine("Modification has run.");

        Console.WriteLine
            ($"Are the two variables equal: " +
            $"{localSeasonsGreetings.Equals(SeasonsGreetings)}");
    }

At first glance, the local string (localSeasonsGreetings) should be modified, and the class level string (SeasonsGreetings) should be unchanged.

    [TestMethod]
    public void MutateCopyOfSeasonsGreetings()
    {
        var immutableStrings = new ImmutableStringsExample();

        immutableStrings.MutateCopyofSeasonsGreetings();

        Console.WriteLine
            ($"Class Variable Seasons Greetings After Method Run {immutableStrings.SeasonsGreetings}");
    }

In this example, the unit test runs the method which prints out the values of the local string and the class level string, and then the unit test prints out the value of the class level string.

The local string is modified, and the class-level string is also changed. Why did this happen? The answer lies in string interning. When a literal string becomes accessible by the program, it is checked against the intern pool (a table which holds a unique instance of each literal string, plus any strings that have been added programmatically). If the literal already exists within that table, a reference to the string in the table is returned instead of creating a new instance. Since the two literals in the example are the same (Bah Humbug!!!!!), the runtime actually creates a single instance that both variables reference, and hence, when it is modified through one variable, the other sees the change too.
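The sharing described above can be observed in safe code, without mutating anything. Here is a minimal sketch; the class and method names (InternPoolDemo, LiteralsShareOneInstance, LiteralIsInPool) are hypothetical, while string.IsInterned and ReferenceEquals are the standard .NET calls.

```csharp
using System;

// Hypothetical demo class; the names here are illustrative only.
public static class InternPoolDemo
{
    // Two identical literals in the same assembly should refer to one object.
    public static bool LiteralsShareOneInstance()
    {
        string first = "Bah Humbug!!!!!";
        string second = "Bah Humbug!!!!!";
        return ReferenceEquals(first, second);
    }

    // string.IsInterned returns the pooled reference, or null when the
    // value is not in the pool. For a literal it should be the literal itself.
    public static bool LiteralIsInPool()
    {
        string literal = "Bah Humbug!!!!!";
        return ReferenceEquals(string.IsInterned(literal), literal);
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareOneInstance());
        Console.WriteLine(LiteralIsInPool());
    }
}
```

Both checks should print True. This shows why writing through a pointer to one "copy" of the literal is visible everywhere that literal is used.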

So what happens if we piece together the string at runtime from two constants?

    public unsafe void MutatePiecedTogetherSeasonsGreetings()
    {
        string exclamations = "!!!!!";
        string localSeasonsGreetings = "Bah Humbug" + exclamations;

        Console.WriteLine($"Local Seasons Greetings: {localSeasonsGreetings}");
        Console.WriteLine($"Class Variable Seasons Greetings: {SeasonsGreetings}");
        Console.WriteLine
            ($"Are the two variables equal: " +
            $"{localSeasonsGreetings.Equals(SeasonsGreetings)}");

        var happyHolidays = "Happy Holidays!";

        fixed (char* seasonsGreetingsLocation = localSeasonsGreetings)
        {
            for (var ii = 0; ii < happyHolidays.Length; ii++)
            {
                seasonsGreetingsLocation[ii] = happyHolidays[ii];
            }
        }

        Console.WriteLine("Modification has run.");

        Console.WriteLine
            ($"Are the two variables equal: " +
            $"{localSeasonsGreetings.Equals(SeasonsGreetings)}");
    }

Notice in the example code above, the localSeasonsGreetings literal is changed to:

    string exclamations = "!!!!!";
    string localSeasonsGreetings = "Bah Humbug" + exclamations;

Since the local instance of Bah Humbug!!!!! was built when the method ran (and is not a literal), the CLR created a new instance of this string. When this local instance was modified, the class-level variable was left unchanged, unlike in the previous example.
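The difference between a literal and a string assembled at run time can also be seen in safe code. This is a sketch under a couple of assumptions: the class and method names (RuntimeStringDemo, RuntimeStringIsSeparateInstance, InternMapsBackToLiteral) are made up, and new string('!', 5) stands in for the exclamations variable so the suffix is unmistakably created at run time.

```csharp
using System;

// Hypothetical demo class; the names here are illustrative only.
public static class RuntimeStringDemo
{
    // A string built at run time has the same text as the literal
    // but should live in its own object.
    public static bool RuntimeStringIsSeparateInstance()
    {
        string literal = "Bah Humbug!!!!!";
        string exclamations = new string('!', 5);
        string built = "Bah Humbug" + exclamations;
        return built == literal && !ReferenceEquals(built, literal);
    }

    // string.Intern looks the value up in the pool and should hand back
    // the instance that the literal already refers to.
    public static bool InternMapsBackToLiteral()
    {
        string literal = "Bah Humbug!!!!!";
        string built = "Bah Humbug" + new string('!', 5);
        return ReferenceEquals(string.Intern(built), literal);
    }

    public static void Main()
    {
        Console.WriteLine(RuntimeStringIsSeparateInstance());
        Console.WriteLine(InternMapsBackToLiteral());
    }
}
```

Both checks are expected to print True. The second one shows that a runtime-built string only joins the shared instance when it is explicitly passed through string.Intern.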

What happens when the same string value is in different assemblies?

    [TestMethod]
    public void MutateTestCopyOfSeasonsGreetings()
    {
        string testSeasonsGreetings = "Bah Humbug!!!!!";

        Console.WriteLine($"Before running method: {testSeasonsGreetings}");

        var immutableStrings = new ImmutableStringsExample();

        immutableStrings.MutateSeasonsGreetingsString();

        Console.WriteLine($"After running method: {testSeasonsGreetings}");
    }

Based on the previous examples, it works how you would expect it to: the string in the test assembly changes as well. Since string interning is handled by the CLR at run time rather than at compile time, it doesn’t matter which assembly the string is located in. All literals loaded into memory are added to the same pool, so modifying the value in one assembly affects all other uses of that literal in the entire application.

Up until this point, we’ve only seen the effects of String Interning on instances of a string. What happens if we return a literal from a static method? To test this, I added a method to return “Bah Humbug!!!!!” to the ImmutableStringsExample.

    public static string ReturnLocalCopyOfSeaonsGreetingsFromStaticMethod()
    {
        return "Bah Humbug!!!!!";
    }

    [TestMethod]
    public void MutateStaticString()
    {
        var immutableStrings = new ImmutableStringsExample();

        immutableStrings.MutateSeasonsGreetingsString();

        Console.WriteLine(
            $"Static method variable: " +
            $"{ImmutableStringsExample.ReturnLocalCopyOfSeaonsGreetingsFromStaticMethod()}");
    }

The static method was called after the modification method ran, and the value it returned did not change. We could assume that, because the static method was first called after we modified the interned “Bah Humbug!!!!!” string, the runtime couldn’t find a matching entry in the pool and created a new instance. Now the question is, “Is this method deterministic?” Will this method always return a new instance of “Bah Humbug!!!!!”?

    [TestMethod]
    public void MutateStaticStringCallMethodTwice()
    {
        var immutableStrings = new ImmutableStringsExample();

        Console.WriteLine(
            $"Static method variable first call: " +
            $"{ImmutableStringsExample.ReturnLocalCopyOfSeaonsGreetingsFromStaticMethod()}");

        immutableStrings.MutateSeasonsGreetingsString();

        Console.WriteLine(
            $"Static method variable second call: " +
            $"{ImmutableStringsExample.ReturnLocalCopyOfSeaonsGreetingsFromStaticMethod()}");
    }

Clearly the answer is no. The point at which the application first calls the static method determines its behavior. Now what happens with a non-static method? Are the same methods in different objects the same?

    [TestMethod]
    public void InstaniateSecondObjectAndCheckSeasonsGreetingString()
    {
        var immutableStrings = new ImmutableStringsExample();

        Console.WriteLine(immutableStrings.ReturnLocalCopyOfSeaonsGreetings());

        immutableStrings.MutateSeasonsGreetingsString();

        var immutableStringsSecondInstance = new ImmutableStringsExample();

        Console.WriteLine(immutableStringsSecondInstance.ReturnLocalCopyOfSeaonsGreetings());
    }

Non-static methods work the same as static ones in this regard. Once the method has run, later calls return a reference to the same string object, even when they are made on a different instance of the class.

With the above examples, we see that strings in .NET are a lot more complicated than they initially let on. The runtime performs a number of optimizations behind the scenes, such as interning, to keep strings efficient. With those efficiencies come certain restrictions, such as immutability, but in the whole scope, those small restrictions can be managed and used to benefit the application.

The code for this post can be found on GitHub


