The Last Chapter

With the release of Delphi XE6 and Embarcadero’s emphasis on Quality, Performance, and Stability (QPS), I wanted to see for myself the level of improvement, especially in performance. Delphi XE6 is definitely faster and more responsive than the last few versions, especially in FMX, but I wanted to see if I could quantify the performance improvement. Before starting, I made some predictions about what I would see when comparing Delphi 2010 through XE6. Let’s see how I did:

EXE size will probably increase with every version of Delphi. This turned out to be mostly true. Each version of Delphi has increased the EXE size over the one before, with one exception: the proverbial elephant in the room, Delphi XE3. Delphi XE3 massively increased EXE sizes, both in VCL and FMX. Thankfully, Delphi XE4 reduced XE3’s excessive weight gain. However, EXE size has been growing back toward XE3 levels with every version since then. It should be noted that VCL EXE sizes have been growing more slowly than FMX EXE sizes.

FMX executables will be larger than VCL executables. This was definitely true, and completely expected: FMX controls are non-native controls (i.e., they don’t use OS-level equivalents), so all the drawing and interaction code must be compiled into the executable.

FMX executables will be slower than VCL executables, though each new version of Delphi for a platform should improve. This was mostly true. Except where the VCL version uses a software renderer for heavy graphics drawing versus the FMX hardware renderer, the FMX versions are slower than their VCL counterparts. There was one outlier: when strings are added to a TMemo and BeginUpdate/EndUpdate have not been called, the Delphi XE6 FMX version was faster than the VCL version. However, the discrepancy is easily explained, as the VCL TMemo draws every string as it is added while the FMX TMemo does not. The part that is not completely true is that each new version of Delphi would improve FMX speed. Speeds were all over the place, and earlier versions were sometimes significantly faster. Delphi XE6 is definitely faster than XE5, though.

Win32 and Win64 compilation should be faster than other platforms. This was true as well. The Win32 and Win64 compilers (written by Embarcadero) are fastest. However, the OSX compiler (also written by Embarcadero) is very close. The iOS and Android compilers are much slower, even when not counting the deployment step.

Windows FMX executables will be faster than other platforms’ FMX executables. This was somewhere between mostly true and true. Since the test machines were not comparable, it was difficult to really test this. However, when drawing the RSCL directly to the canvas, the OSX implementation was comparable to the Win32 implementation. With the exception of the Clock, the OSX versions cleaned the Win32 versions’ clocks, running anywhere from 1.2x to 4.2x faster. It must be noted that the Windows box (Windows 7 64-bit, Intel i7 930 @ 2.8 GHz CPU, ATI HD 5700 graphics card, and 6 GB RAM) and the Mac Mini (OSX 64-bit, Intel i7 @ 2.3 GHz, integrated graphics, and 4 GB RAM) are not exactly comparable. Admittedly, the ATI HD 5700 is almost 5 years old now, but the Mac Mini uses integrated graphics. A close look at the outputs shows slight differences (see the dark lines in the Car). Generally, however, the output from the OSX versions is excellent. This hypothesis would need to be more rigorously investigated.

Comparing the competitors…

To make my final evaluation, I wanted to be as objective as possible and use concrete numbers. I decided to ignore EXE size (Delphi 2010 wins! :-)) and compilation speed (more mixed, but generally earlier versions of Delphi win). I concentrated solely on performance metrics. This means I ignored features (Delphi XE6 wins!), number of platforms (Delphi XE6 wins), and stability (???? wins!).

For each test application, I averaged the scores (e.g., the Hello World ListBox performance score is an average of the TListBox scores). This does mean that the winner of an exponential test (e.g., 1000s of items in a TListBox) can win even if its performance with fewer items was not the best. I then normalized the averages (e.g., the winner is equal to 1 and every other version is between 0 and 1 depending on its average). Finally, I added the normalized test scores together to get a total score per platform. For example, the VCL 32-bit score includes the normalized performance scores from the Hello World project (ListBox and Memo), the IECS Advanced Console, and the IECS Basic Console (max of 4 points). The OSX score includes the normalized performance scores from the Hello World project (ListBox and Memo), the IECS Advanced Console, the IECS Basic Console, the RSCL Drawing project, and the RSCL Primitives project (max of 6 points).

In the graphs below, you can easily see how the normalized scores contributed to each Delphi version’s score. Note that my scores are just for the tests in this series of blog posts. Other tests could give vastly different results. This blog post is not meant to be a recommendation of one Delphi version over another.
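The average, normalize, and sum procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the timings are made-up placeholders (not measurements from this series), the version names are hypothetical, and the normalization formula (best average divided by each version's average, for timings where lower is better) is one plausible reading of "the winner is equal to 1"; the posts do not spell out the exact formula.

```python
from statistics import mean

# Placeholder timings in milliseconds (lower is better).
# These are NOT results from the blog series; they only illustrate the method.
RAW_TIMINGS = {
    # "Version B" is faster on the two small runs but loses the large one.
    "ListBox": {"Version A": [30.0, 170.0, 1000.0],
                "Version B": [10.0, 90.0, 2300.0]},
    "Memo": {"Version A": [40.0, 80.0],
             "Version B": [30.0, 60.0]},
}


def average_per_version(runs_by_version):
    """Step 1: average all runs of one test for each version."""
    return {version: mean(runs) for version, runs in runs_by_version.items()}


def normalize_lower_is_better(averages):
    """Step 2 (assumed formula): the fastest version scores 1.0,
    every other version scores best / its own average, so 0 < score <= 1."""
    best = min(averages.values())
    return {version: best / avg for version, avg in averages.items()}


def platform_scores(raw_timings):
    """Step 3: add the normalized scores of every test on a platform."""
    totals = {}
    for runs_by_version in raw_timings.values():
        normalized = normalize_lower_is_better(average_per_version(runs_by_version))
        for version, score in normalized.items():
            totals[version] = totals.get(version, 0.0) + score
    return totals


if __name__ == "__main__":
    max_points = len(RAW_TIMINGS)  # one point available per test
    for version, score in sorted(platform_scores(RAW_TIMINGS).items()):
        print(f"{version}: {score:.2f} out of {max_points}")
```

With these placeholder numbers, Version A scores 1.75 and Version B scores 1.50 out of 2. Version B is faster on both small ListBox runs, yet the single large run dominates its average, which is exactly the caveat about exponential tests mentioned above.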

Now for the awards ceremony…

Surprisingly, Delphi XE6 wins the gold award in my performance tests for Win32 VCL. The differences are extremely minor between all versions of Delphi so this coveted award could have gone to any of them 🙂 Delphi XE3 manages second place and a silver medal. Delphi XE scores the third place bronze award.

For Win64 VCL (dropping Delphi 2010 and XE from the running), Delphi XE2 manages to barely eke out the gold award. Delphi XE6 wins silver and XE5 bronze.

Moving to FMX, the performance differences get much larger. Delphi XE3, despite its huge EXE sizes and poor Memo performance, produces the fastest FMX executables on Windows in these tests to win a gold medal. Its ListBox performance is what really stands out. Delphi XE4 makes a very strong second place showing for silver. Bronze goes to Delphi XE6.

Going to 64-bit, Delphi XE3 again wins the gold, with Delphi XE4 getting the silver. Delphi XE6 again comes in third place, though its score improved noticeably over 32-bit.

Switching to the Mac, Delphi XE4 wins its first gold medal. Delphi XE6 sneaks past XE3 to win the silver.

Shifting to mobile, we are down to Delphi XE4, XE5, and XE6. Delphi XE4 wins another gold medal for the iOS platform on its solid memo performance. Delphi XE6 gets second place for the silver and XE5 trails for the bronze.

For our last platform, Android, Delphi XE6 wins handily, easily beating XE5.

Conclusion

So what does this mean? How did Delphi XE6 perform on its mantra of “Quality, Performance, and Stability”? As far as this series of performance blog posts is concerned, Delphi XE6 did neither as well as I had hoped nor as badly as I had feared. It only outright won a platform twice (and one of those was by a whisker on Win32 VCL). However, Delphi XE6 was a solid performer and competitive on all platforms. Compared to its immediate predecessor (Delphi XE5), it is a huge advance in performance and nearly brings it back up to the levels of Delphi XE3 and XE4, both of which performed surprisingly well in FMX.

If you do a medal count across the different platforms and give 3 points for Gold, 2 points for Silver, and 1 point for Bronze, Delphi XE6 racks up the most points. Even if you ignore the Android platform, Delphi XE6 still does 1 point better than XE4. If you then also ignore the iOS platform, Delphi XE6 still ties with XE3 at 9 points. Considering that it adds both the iOS and Android platforms, seems to have fewer issues than Delphi XE3 and XE4, and has vastly superior performance compared to Delphi XE5, I would say that Embarcadero achieved their goal.

Almost a month and over 12,000 words later, I have finally finished this series of blog posts on the performance comparison from Delphi 2010 to Delphi XE6. I hope you have enjoyed them. It was a ton of work, but I think it was worth it. Happy CodeSmithing!
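For anyone who wants to double-check the medal count above, here is a small Python sketch that tallies the awards using 3 points for Gold, 2 for Silver, and 1 for Bronze. The podiums are transcribed from the awards ceremony, with two assumptions that the text only implies: XE3 takes the OSX bronze (XE6 "sneaks past XE3" for silver), and XE5 takes the Android silver as the only other Android-capable version. Neither assumption affects the totals quoted in the conclusion.

```python
from collections import Counter

POINTS = {"gold": 3, "silver": 2, "bronze": 1}

# Podiums as announced in the awards ceremony.
MEDALS = {
    "Win32 VCL": {"gold": "XE6", "silver": "XE3", "bronze": "XE"},
    "Win64 VCL": {"gold": "XE2", "silver": "XE6", "bronze": "XE5"},
    "Win32 FMX": {"gold": "XE3", "silver": "XE4", "bronze": "XE6"},
    "Win64 FMX": {"gold": "XE3", "silver": "XE4", "bronze": "XE6"},
    # Assumption: XE3 gets the OSX bronze (implied, not stated outright).
    "OSX": {"gold": "XE4", "silver": "XE6", "bronze": "XE3"},
    "iOS": {"gold": "XE4", "silver": "XE6", "bronze": "XE5"},
    # Assumption: XE5 gets the Android silver (only two versions competed).
    "Android": {"gold": "XE6", "silver": "XE5"},
}


def medal_points(medals, skip=()):
    """Sum medal points per Delphi version, optionally skipping platforms."""
    totals = Counter()
    for platform, podium in medals.items():
        if platform in skip:
            continue
        for medal, version in podium.items():
            totals[version] += POINTS[medal]
    return totals


if __name__ == "__main__":
    scenarios = {
        "All platforms": (),
        "Without Android": ("Android",),
        "Without Android and iOS": ("Android", "iOS"),
    }
    for label, skip in scenarios.items():
        ranking = medal_points(MEDALS, skip).most_common()
        print(label + ": " + ", ".join(f"{v}={p}" for v, p in ranking))
```

Across all platforms this gives XE6 14 points, XE4 10, and XE3 9. Dropping Android leaves XE6 at 11 against XE4's 10, and dropping iOS as well leaves XE6 and XE3 tied at 9, matching the figures above.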