Very boring disclaimer: I’ve decided not to use 3rd party providers for code highlighting and table inlining, because I wanted the article to be in “plain Medium” format. No wait time for code loading! So, basically as long as Medium lives, this article is safe!

But with that approach comes some inconveniences. For example, in several places the code is hard to understand because of default Medium formatting for code blocks. Also, Medium doesn’t natively support Markdown tables yet, so all tables are in ASCII format.

Also, any suggestions and proposals are welcome!

Inception

Let’s assume we have the following type

public struct Person

{

public int Age { get; set; }

public string Name { get; set; }

}

If we write the following lines, we’ll get a compile-time error

var john = new Person { Age = 20, Name = "John" };

var ann = new Person { Age = 22, Name = "Ann" }; var isSamePerson = john == ann; // Compile-time error!

By default, struct can't be compared by == , but class can. Basically, in order to be able to compare struct with == operator you must explicitly implement it. Which is a no-brainer at all, but the approach you actually choose to do it may be tricky. I want to show you several ways of doing it and describe pros and cons of each approach.

Let’s start with a couple of available options

[Option 1] implement via Object.Equals(object obj) , but it boxes struct value which is not good

implement via , but it boxes value which is not good [Option 2] implement directly with custom method Equals(T value) , no boxing

implement directly with custom method , no boxing [Option 3] implement via IEquatable<T> to be fully compliant with some of .NET APIs (more on that later), no boxing

Note: for every Equals() override you must also supply GetHashCode() . It's purposely omitted for this article, as I want to focus on the subject at hand.

[Option 1] Object.Equals(object obj)

The most simple and straightforward approach, but also the most expensive one.

public struct Person

{

// ... properties public static bool operator ==(Person first, Person second)

{

return Equals(first, second);

} public static bool operator !=(Person first, Person second)

{

// or !Equals(first, second), but we want to reuse the existing comparison

return !(first == second);

}

}

Ok, so what actually happens behind the scenes? Basics first!

All class types inherit from object , all struct types inherit from ValueType object which in turn inherits from object .

object

/ \

class ValueType

\

struct

So far, so good. Each object has virtual Equals(object obj) method. So, Equals for class types uses referential comparison, but for ValueType uses structural comparison. By default, for class we compare references, but for struct we compare actual values. Let's dig a little bit deeper into structural comparison.

I will take the direct excerpt from MSDN documentation

The Equals(Object) implementation provided by ValueType performs a byte-by-byte comparison for value types whose fields are all value types, but it uses reflection to perform a field-by-field comparison of value types whose fields include reference types.

It’s a very interesting statement, but it’s not very true. The point is, that byte-to-byte comparison happens only if declared struct is tightly packed. Let's briefly outline it

struct comparison via Equals(object)

/ \

tightly packed not tightly packed

/ \

byte-to-byte reflection (field-to-field)

Oh, mine.. What does it mean at all? Simply speaking (I’m not an expert in inner layout of memory in .NET ) it means that "if struct contains only value types of the same type (not always true), then it will do a fast (byte-to-byte) comparison". Ok, I'm lost! Some examples are desperately needed here!

// tightly packed

public struct PersonV1

{

public int Age { get; set; }

public int Salary { get; set; }

}



// not tightly packed

public struct PersonV2

{

public int Age { get; set; }

public double Salary { get; set; }

}

But how can we test the speed and allocation\deallocation correctly? For this task I use BenchmarkDotNet library.

[MemoryDiagnoser, ShortRunJob]

public class StructComparisonBenchmark

{

private const int RUNS = 100000;



[Benchmark]

public void TightlyPacked()

{

for (var i = 0; i < RUNS; i++)

{

var p1 = new PersonV1();

var p2 = new PersonV1();

p1.Equals(p2);

}

}



[Benchmark]

public void NotTightlyPacked()

{

for (var i = 0; i < RUNS; i++)

{

var p1 = new PersonV2();

var p2 = new PersonV2();

p1.Equals(p2);

}

}

}

Note: tests are run in not a sterile environment. It means that other programs are not closed and may interfere with the final results. The actual processor you run the tests on is also crucial.

A couple of words about the attributes

[Benchmark] - actual code to test

- actual code to test [MemoryDiagnoser] - memory usage statistics

- memory usage statistics [ShortRunJob] - short runs (by default tests are run longer)

On my machine it gives the following results

+------------------+-----------+-----------+-------------+---------+

| Method | Mean | Error | Gen 0 | Allocated

+------------------+-----------+-----------+-------------+---------+

| TightlyPacked | 6.154 ms | 5.618 ms | 2031.2500 | 3.05 MB

| NotTightlyPacked | 73.137 ms | 29.226 ms | 7833.3333 | 11.83 MB

+------------------+-----------+-----------+-------------+---------+

Mean : Arithmetic mean of all measurements

Error : Half of 99.9% confidence interval

Gen 0 : GC Generation 0 collects per 1k Operations

Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B) 1 ms : 1 Millisecond (0.001 sec)

Note: I want to point out that Allocated is a cumulative value between all benchmark runs. It’s not get garbage collected!

Lyrical digression: I’ve excluded StdDev from the results, because Medium doesn’t support tables natively. Sigh..

What is important here is the lesser a value the better. So, we can clearly see that tightly packed struct takes much lesser time (also consumes much lesser memory) to do the comparison, than not tightly packed struct .

But you can’t always make a struct tightly packed. Yes, you can try to play around with StructLayoutAttribute, but it's not recommended by Microsoft. Which leads us to [Option 2] .

[Option 2] T.Equals(T obj)

Ok, let’s write the next ‘Person’ version by directly passing a value into Equals(T value) .

public struct PersonV3

{

public int Age { get; set; }

public int Salary { get; set; } public bool Equals(PersonV3 person)

=> person.Age == Age && person.Salary == Salary;

}

Here, we explicitly implemented struct comparison and it means that no default Equals method is called, therefore there is no boxing

o reflection

o byte-to-byte comparison.

Let’s add a test for this case

// method added to 'StructComparisonBenchmark' class

[Benchmark]

public void CustomEquals()

{

for (int i = 0; i < RUNS; i++)

{

var p1 = new PersonV3();

var p2 = new PersonV3();

p1.Equals(p2);

}

}

The results

+------------------+-----------+-----------+-------------+---------+

| Method | Mean | Error | Gen 0 | Allocated

+------------------+-----------+-----------+-------------+---------+

| TightlyPacked | 6.154 ms | 5.618 ms | 2031.2500 | 3.05 MB

| NotTightlyPacked | 73.137 ms | 29.226 ms | 7833.3333 | 11.83 MB

| CustomEquals | 51.43 us | 18.26 us | - | 0 B

+------------------+-----------+-----------+-------------+---------+

Note: it’s highly unrealistic to receive the same results for TightlyPacked and NotTightlyPacked tests between different runs, so I just grabbed previous ones and merged them with CustomEquals .

Wow! Just plain 0 for Gen 0 and Allocated fields. It seems like a win! But can we say that we’re done? Not if you try to use the current code with existing .NET API.

[Option 3] .NET land with IEquatable.Equals(T value)

Firstly, let’s create another version of ‘Person’.

public struct PersonV4 : IEquatable<PersonV4>

{

public int Age { get; set; }

public int Salary { get; set; } public bool Equals(PersonV4 person)

=> person.Age == Age && person.Salary == Salary;

}

The only difference between PersonV3 and PersonV4 is that the latter implements IEquatable<T> interface.

Now, suppose we have a list of persons and want to check if the given list contains a certain ‘Person’ by using List<T>.Contains(T obj) method.

Let’s write another benchmark for this case

[MemoryDiagnoser, ShortRunJob]

public class ListContainsBenchmark

{

private const int RUNS = 10000; private List<PersonV3> _personV3s;

private List<PersonV4> _personV4s; [GlobalSetup]

public void GlobalSetup()

{

_personV3s = Enumerable

.Range(0, RUNS)

.Select(i => new PersonV3 { Age = i, Salary = i })

.ToList();

_personV4s = Enumerable

.Range(0, RUNS)

.Select(i => new PersonV4 { Age = i, Salary = i })

.ToList();

} [Benchmark]

public void ListContainsWithoutIEquatable()

{

for (var i = 0; i < RUNS; i++)

_personV3s.Contains(new PersonV3 { Age = 0, Salary = 0});

} [Benchmark]

public void ListContainsWithIEquatable()

{

for (var i = 0; i < RUNS; i++)

_personV4s.Contains(new PersonV4 { Age = 0, Salary = 0});

}

}

Note: the point of benchmark is just to show allocations that can happen in real app. It’s not for a production use!

A new part of benchmark is [GlobalSetup] attribute. It'll be called only once for all tests and supply us with initial values.

Now, let’s the tests be commenced!

+-------------------------------+------------+----------+----------+

| Method | Mean | Gen 0 | Allocated

+-------------------------------+------------+----------+----------+

| ListContainsWithoutIEquatable | 8,891.4 us | 781.2500 | 1240086 B

| ListContainsWithIEquatable | 133.6 us | - | 0 B

+-------------------------------+------------+----------+----------+

Wow! This is really interesting. ListContainsWithIEquatable test doesn't allocate anything! The only part that is different between PersonV3 and PersonV4 classes its that the latter implements IEquatable<T> interface. What's going on here?

Going deeper into .NET API

Let’s get a little bit deeper into the actual List<T>.Contains(T obj) implementation.

[__DynamicallyInvokable]

public bool Contains(T item)

{

if ((object) item == null)

{

for (int index = 0; index < this._size; ++index)

{

if ((object) this._items[index] == null)

return true;

}

return false;

}

var equalityComparer = EqualityComparer<T>.Default;

for (int index = 0; index < this._size; ++index)

{

if (equalityComparer.Equals(this._items[index], item))

return true;

}

return false;

}

At first glance it's really difficult to see why the behaviour between the runs are so different, but there is one interesting call in this method - EqualityComparer<T>.Default . So, let's see what's inside of it.

Note: all code are run under .NET Framework 4.7.1 and Resharper decompiler is used to get the source code. .NET Core implementation may vary.

public abstract class EqualityComparer<T> : IEqualityComparer, IEqualityComparer<T>

{

private static readonly EqualityComparer<T> defaultComparer

= EqualityComparer<T>.CreateComparer();



[__DynamicallyInvokable]

public static EqualityComparer<T> Default

{

[__DynamicallyInvokable] get

{

return EqualityComparer<T>.defaultComparer;

}

}

}

So, for each EqualityComparer<T>.Default call, CreateComparer() method gets executed. New comparer is created and cached for each generic T type into a private static readonly defaultComparer field. Let's now see what's happening inside CreateComparer() .

public abstract class EqualityComparer<T> : IEqualityComparer, IEqualityComparer<T>

{

[SecuritySafeCritical]

private static EqualityComparer<T> CreateComparer()

{

RuntimeType genericParameter = (RuntimeType) typeof (T);

if (typeof (IEquatable<T>).IsAssignableFrom((Type) genericParameter))

return (EqualityComparer<T>) RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType) typeof (GenericEqualityComparer<int>), genericParameter);



// further implementation



// if no implementation were chosen

return (EqualityComparer<T>) new ObjectEqualityComparer<T>();

}

}

I haven’t included the full implementation, as it really doesn’t matter much. What really matters is several lines

if (typeof (IEquatable<T>).IsAssignableFrom((Type) genericParameter))

return (EqualityComparer<T>) RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType) typeof (GenericEqualityComparer<int>), genericParameter);

and

return (EqualityComparer<T>) new ObjectEqualityComparer<T>();

CreateComparer() decides what comparer to return. If T type doesn't implement IEquatable<T> interface, then ObjectEqualityComparer<T> will be created. Let's look inside.

[Serializable]

internal class ObjectEqualityComparer<T> : EqualityComparer<T>

{

[Pure]

public override bool Equals(T x, T y) {

if (x != null) {

if (y != null) return x.Equals(y);

return false;

}

if (y != null) return false;

return true;

}

}

Note: only important parts are shown

So, it uses default object comparison which boxes the value. That’s why PersonV3 generates so much garbage. If type implements IEquatable<T> then RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter() will be called. You know the drill, let's go inside!

[SecurityCritical]

[MethodImpl(MethodImplOptions.InternalCall)]

internal static extern object CreateInstanceForAnotherGenericParameter(RuntimeType type, RuntimeType genericParameter);

Oh.. Mkay. Yes, it seems we've reached the end. Alrighty, let's try to go deeper nontheless. When we see extern keyword, it means that implementation is usually in C++ . Thanks to .NET Core all C++ code is open! Here the file which contains CreateInstanceForAnotherGenericParameter implementation.

Note: before .NET Core era, all implementation behind extern calls was called SSCLI.

To be honest I’m not an expert of C++ and can't reliably tell what's happening in CreateInstanceForAnotherGenericParameter , but with this call we don't blindly use Equals(object obj) which boxes the value, but IEquatable<T>.Equals(T value) .

Epilogue

That concludes our journey into the lands of struct equality comparison.