Introduction

One of the things I really enjoy while writing variations of algorithms is comparing the total run-time between them. I find it rewarding and informative to start with a minimal number of iterations, say 10 merges, where QuickFind (which has a complexity of O(n²)) and WeightedQuickUnion (O(m log n)) look like they have almost the same performance. From there I like to add more and more iterations to see how the algorithms perform. The time savings really start to become apparent with just a few hundred merges.
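To make that gap concrete, here is a minimal, self-contained sketch of QuickFind's union operation (an illustration, not the code from the repository, which we'll see below). Each union scans the entire id[] array, which is what drives the quadratic total for n unions:

```csharp
using System;

public static class QuickFindDemo
{
    // id[i] holds the component id of element i. Union(p, q) rewrites every
    // entry equal to id[p] to id[q] -- that full-array scan is what makes
    // each QuickFind union O(n), and n unions O(n^2) overall.
    public static void Union(int[] id, int p, int q)
    {
        int pid = id[p];
        int qid = id[q];
        if (pid == qid) return;

        for (int i = 0; i < id.Length; i++)
        {
            if (id[i] == pid) id[i] = qid;
        }
    }

    public static void Main()
    {
        int[] id = { 0, 1, 2, 3, 4 };
        Union(id, 0, 1);
        Union(id, 1, 2);
        Console.WriteLine(string.Join(",", id)); // 2,2,2,3,4
    }
}
```

The weighted variants avoid that full scan by following parent links instead, which is why they pull ahead so quickly as the merge count grows.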

Note: if you aren't familiar with Big-O notation in computer science, check out this primer from Rob Bell.

What I decided to do for today's demo was include four algorithms from Robert Sedgewick's Algorithms in Java, Third Edition. We are going to compare the performance of QuickFind, QuickUnion, and WeightedQuickUnion, and include our demo from Friday, WeightedQuickUnionWithPathCompression.

I have updated the repository so that we send a single HTTP request to an HTTPTrigger. The HTTPTrigger writes a message to a Service Bus Topic and creates a Subscription for each of our algorithms. The four algorithms are ServiceBusTopicTriggers that listen for a message on their Subscription, perform the merge, write the input and output to Blob Storage, and finally write the performance data to Azure Table Storage.

It's pretty amazing what I was able to create with such a minimal amount of code. That being said, there are some limitations that can be frustrating, and we'll cover those in the rant section.

Errata

The MS Dev Show featured a conversation about serverless and Logic Apps with Jeff Hollan. Jeff is a Program Manager on the Azure Logic Apps team. He provides a great 100-level overview of how Azure Functions operate, as well as some insightful comparisons between Functions and Logic Apps. The discussion gets interesting when Jeff answers the question of whether "serverless" would mean the end of Docker. Check it out as you follow along with today's demos!

Jeff Hollan also talks about serverless here. I haven't quite finished the episode, but the interview with Carl and Richard is different enough to warrant a separate listen.

A Quick Rant

One of the limitations I hit right off the bat was the maximum size of a ServiceBus message, which is limited to 64 KB. Normally this would be more than enough. However, because I am trying to merge some fairly sizable arrays, I hit that limit simply by including the arrays to merge from and merge to in the message.

However, what was really frustrating was the exception messaging returned to the Function. I wrapped the call to write to Table Storage in a try/catch (which we would do anyway if this were production code) to further inspect the message sent back. The error message was something about not being able to write element[0] to Table Storage. Not exactly helpful; the messaging should convey to the developer exactly which column is too large.

The optimal solution, which would allow me to bypass this entirely, would be to have the MergeTrigger (which I will introduce shortly) take in a JSON message specifying how many elements I want to merge, create the arrays, write the arrays to Blob Storage as text files, and write the URIs of those blobs to the ServiceBus Topic. The ServiceBus Topic Triggers would then get the message, use the URIs to locate and read the input, and go about doing the merge. We would still write the same output to Blob Storage and Table Storage, so no changes would be necessary there. In fact, let's bookmark that and use it for a future post on refactoring.
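That refactoring could be sketched like this. Everything below is hypothetical (the MergeRequest type, the builder, and the blob URIs are placeholders, and the actual blob writes are omitted); it just shows how the queued message shrinks to a count and two URIs:

```csharp
using System;

// Hypothetical lightweight message: instead of carrying the arrays,
// the Topic message carries only the element count and blob URIs.
public class MergeRequest
{
    public int ElementCount { get; set; }
    public string FromBlobUri { get; set; }
    public string ToBlobUri { get; set; }
}

public static class MergeRequestBuilder
{
    public static MergeRequest Build(int elementCount, string fromBlobUri, string toBlobUri)
    {
        // Generate the arrays the subscribers will merge...
        var rng = new Random();
        int[] from = new int[elementCount];
        int[] to = new int[elementCount];
        for (int i = 0; i < elementCount; i++)
        {
            from[i] = rng.Next(elementCount);
            to[i] = rng.Next(elementCount);
        }

        // ...write `from` and `to` to Blob Storage as text files (omitted here),
        // then return a message that stays far below the 64 KB limit.
        return new MergeRequest
        {
            ElementCount = elementCount,
            FromBlobUri = fromBlobUri,
            ToBlobUri = toBlobUri
        };
    }
}
```

No matter how large the arrays grow, the message size stays constant.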

Ultimately this would allow me to send in millions, even billions, of elements (limited only by the size of a 32-bit integer) and merge them. At that point I would expect to see WeightedQuickUnionWithPathCompression emerge as the true champion it really is.

Prerequisites

Today's demo will use our first "paid" service in Azure (okay, sure, Functions cost money too, but they're really inexpensive to run). Azure ServiceBus Topics aren't expensive, but they aren't free: this will cost about $10/month to run. If you want to stay in the "free" tier, you can use multiple ServiceBus Queues or Storage Queues instead. Either way, there are multiple ways to do this.

The source for today's demo is located on, and deployed from, GitHub. If you want to save some copy/pasting, grab it here: AzureFunctionsBlogDemos. Feel free to clone the repository and use it as a reference as we go through the demo.

Adding the Topic

Let's start by adding the Service Bus Topic to our solution.

First, head over to the Azure Portal and log in with your Microsoft Account.

Next, click the "+" in the upper left-hand corner, search for "Service Bus", then click "Create".

Next, give the ServiceBus a Name and choose your pricing tier. Note: Topics are only available with the Standard pricing tier and will cost ~$10 per month. Choose your Subscription, Resource Group, and Location. Once that is done, click the "Create" button and your ServiceBus namespace will be created.

Once the ServiceBus is created we can access it through the Resource Group we assigned it to. We don't need to create our Topic as the code for the MergeTrigger will actually do this for us. For now, all we need to do is get the access key to write to our namespace.

Under settings click "Shared access policies".

We are going to grab the "CONNECTION STRING - PRIMARY KEY". Copy it to Notepad or Notepad++, as we are going to use it shortly.

Once we've retrieved the Connection String to our ServiceBus Namespace we are going to need to add it to our Functions App. So, click on over to your Functions App and open it. Click "Platform features" and find "Application settings" under "GENERAL SETTINGS".

By now, this should be a familiar sight.

Under the Application Settings we are going to add a setting for our ServiceBus endpoint. Under Key add "AzureWebJobsServiceBus" and under Value add the connection string we copied to Notepad/Notepad++. Once that is done, be sure to click the "Save" button to save your new setting!

Lastly, we will need to update the local.settings.json for our solution using the Azure Functions CLI. The values are encrypted, so we can't just add the key/value pair to the file itself:

func azure login (optional)
func azure account list (optional)
func azure account set
func azure functionapp list
func azure functionapp fetch-app-settings [name]
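After fetch-app-settings runs, local.settings.json should contain an entry along these lines. This is an illustrative shape only (the values in the real file are stored encrypted, and the connection string below is a placeholder, not a real key):

```json
{
  "IsEncrypted": true,
  "Values": {
    "AzureWebJobsStorage": "<encrypted storage connection string>",
    "AzureWebJobsServiceBus": "<encrypted ServiceBus connection string>"
  }
}
```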

Adding the Code

MergeTrigger.cs

First, let's add the code that will write to our ServiceBus Topic. The code does all of the work explicitly, which I could have accomplished via an output binding instead. Since I was learning how to write to a ServiceBus Topic myself, I decided to go this way rather than use the binding configuration. Either way is essentially the same.

using Microsoft.Azure;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;
using Newtonsoft.Json;
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using static AzureFunctionsBlogDemos.Shared.Enums;

namespace AzureFunctionsBlogDemos.Merging
{
    public class MergeTrigger
    {
        public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
        {
            log.Info("MergeTrigger has processed a request.");

            // Parse request input
            string jsonContent = await req.Content.ReadAsStringAsync();
            var inputArrays = JsonConvert.DeserializeObject<MergingArray>(jsonContent);

            log.Info($"Inputs: {string.Join(",", inputArrays.NumberToUnionFrom)}, {string.Join(",", inputArrays.NumberToUnionTo)}");

            if (inputArrays.NumberToUnionFrom.Length != inputArrays.NumberToUnionTo.Length)
            {
                return req.CreateResponse(HttpStatusCode.OK, new { message = "Number of items to union do not match." });
            }

            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            AddMessageToTopic(inputArrays);
            stopwatch.Stop();

            var performance = new MergePerformance();
            performance.Runtime = stopwatch.Elapsed;

            return req.CreateResponse(HttpStatusCode.OK, new { message = JsonConvert.SerializeObject(inputArrays) });
        }

        public static void AddMessageToTopic(MergingArray input)
        {
            // Create the topic if it does not exist already.
            string connectionString = CloudConfigurationManager.GetSetting("AzureWebJobsServiceBus");
            var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
            var topicName = "AlgorithmsMerge";

            // Configure Topic settings.
            TopicDescription td = new TopicDescription(topicName);
            td.MaxSizeInMegabytes = 5120;
            td.DefaultMessageTimeToLive = new TimeSpan(0, 1, 0);

            if (!namespaceManager.TopicExists(topicName))
            {
                namespaceManager.CreateTopic(td);
            }

            // Create a subscription for each algorithm in the enum.
            foreach (string algorithmName in Enum.GetNames(typeof(MergeAlgorithms)))
            {
                if (!namespaceManager.SubscriptionExists(topicName, algorithmName))
                {
                    namespaceManager.CreateSubscription(topicName, algorithmName);
                }
            }

            TopicClient client = TopicClient.CreateFromConnectionString(connectionString, topicName);
            client.Send(new BrokeredMessage(input));
        }
    }
}

function.json

{
  "bindings": [
    {
      "authLevel": "function",
      "direction": "in",
      "methods": [ "post" ],
      "name": "req",
      "type": "httpTrigger",
      "webHookType": "genericJson"
    },
    {
      "direction": "out",
      "name": "$return",
      "type": "http"
    }
  ],
  "disabled": false,
  "entryPoint": "AzureFunctionsBlogDemos.Merging.MergeTrigger.Run",
  "scriptFile": "..\\bin\\AzureFunctionsBlogDemos.dll"
}
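For comparison, the function.json for one of the algorithm subscribers would look roughly like the sketch below. This is an assumption about shape, not code from the repository: the subscription name matches the enum value, and the entry point class name is hypothetical.

```json
{
  "bindings": [
    {
      "direction": "in",
      "name": "input",
      "type": "serviceBusTrigger",
      "topicName": "AlgorithmsMerge",
      "subscriptionName": "QuickFind",
      "connection": "AzureWebJobsServiceBus"
    }
  ],
  "disabled": false,
  "entryPoint": "AzureFunctionsBlogDemos.Merging.QuickFindTrigger.Run",
  "scriptFile": "..\\bin\\AzureFunctionsBlogDemos.dll"
}
```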

I have refactored "Arrays.cs" a bit since last week's post. In fact, I even changed the name to "MergingArray.cs" to avoid a namespace conflict with System.Array, as that was getting annoying.

MergingArray.cs

using System.Linq;
using static AzureFunctionsBlogDemos.Shared.Enums;

namespace AzureFunctionsBlogDemos.Merging
{
    /// <summary>
    /// MergingArray takes an array of integers and then uses the two other arrays to union.
    /// </summary>
    public class MergingArray
    {
        public int[] NumberToUnionFrom { get; set; }
        public int[] NumberToUnionTo { get; set; }
        public int[] Output { get; set; }

        public static void Merge(MergingArray input, MergeAlgorithms algorithmName)
        {
            int max = input.NumberToUnionFrom.Max() > input.NumberToUnionTo.Max()
                ? input.NumberToUnionFrom.Max()
                : input.NumberToUnionTo.Max();

            // Set up the Output array.
            input.Output = new int[max + 1];
            for (int i = 0; i < input.Output.Length; i++)
            {
                input.Output[i] = i;
            }

            switch (algorithmName)
            {
                case MergeAlgorithms.QuickFind:
                    QuickFind(input);
                    break;
                case MergeAlgorithms.QuickUnion:
                    QuickUnion(input);
                    break;
                case MergeAlgorithms.WeightedQuickUnion:
                    WeightedQuickUnion(input, max);
                    break;
                case MergeAlgorithms.WeightedQuickUnionWithPathCompression:
                    WeightedQuickUnionWithPathCompression(input, max);
                    break;
            }
        }

        private static void QuickFind(MergingArray quickFind)
        {
            for (int i = 0; i < quickFind.NumberToUnionTo.Length; i++)
            {
                int fromIndex = quickFind.NumberToUnionFrom[i];
                int toIndex = quickFind.NumberToUnionTo[i];
                int hold = quickFind.Output[fromIndex];

                for (int j = 0; j < quickFind.Output.Length; j++)
                {
                    if (quickFind.Output[j] == hold)
                    {
                        quickFind.Output[j] = quickFind.Output[toIndex];
                    }
                }
            }
        }

        private static void QuickUnion(MergingArray quickUnion)
        {
            for (int i = 0; i < quickUnion.NumberToUnionTo.Length; i++)
            {
                int numberToUnionFrom = quickUnion.NumberToUnionFrom[i];
                int numberToUnionTo = quickUnion.NumberToUnionTo[i];
                int j, k;

                // Walk each element up to its root.
                for (j = numberToUnionFrom; j != quickUnion.Output[j]; j = quickUnion.Output[j]) ;
                for (k = numberToUnionTo; k != quickUnion.Output[k]; k = quickUnion.Output[k]) ;

                quickUnion.Output[j] = k;
            }
        }

        private static void WeightedQuickUnion(MergingArray weightedQuickUnion, int max)
        {
            // Set up the depth array.
            int[] depth = new int[max + 1];
            for (int i = 0; i < weightedQuickUnion.Output.Length; i++)
            {
                depth[i] = 1;
            }

            for (int i = 0; i < weightedQuickUnion.NumberToUnionTo.Length; i++)
            {
                int j, k;
                int numberToUnionFrom = weightedQuickUnion.NumberToUnionFrom[i];
                int numberToUnionTo = weightedQuickUnion.NumberToUnionTo[i];

                for (j = numberToUnionFrom; j != weightedQuickUnion.Output[j]; j = weightedQuickUnion.Output[j]) ;
                for (k = numberToUnionTo; k != weightedQuickUnion.Output[k]; k = weightedQuickUnion.Output[k]) ;

                // Hang the smaller tree off the larger one.
                if (depth[j] < depth[k])
                {
                    weightedQuickUnion.Output[j] = k;
                    depth[k] += depth[j];
                }
                else
                {
                    weightedQuickUnion.Output[k] = j;
                    depth[j] += depth[k];
                }
            }
        }

        private static void WeightedQuickUnionWithPathCompression(MergingArray input, int max)
        {
            // Set up the depth array.
            int[] depth = new int[max + 1];
            for (int i = 0; i < input.Output.Length; i++)
            {
                depth[i] = 1;
            }

            for (int i = 0; i < input.NumberToUnionTo.Length; i++)
            {
                int j, k;
                int numberToUnionFrom = input.NumberToUnionFrom[i];
                int numberToUnionTo = input.NumberToUnionTo[i];

                // While walking to the root, point each node at its grandparent.
                for (j = numberToUnionFrom; j != input.Output[j]; j = input.Output[j])
                {
                    input.Output[j] = input.Output[input.Output[j]];
                }
                for (k = numberToUnionTo; k != input.Output[k]; k = input.Output[k])
                {
                    input.Output[k] = input.Output[input.Output[k]];
                }

                if (depth[j] < depth[k])
                {
                    input.Output[j] = k;
                    depth[k] += depth[j];
                }
                else
                {
                    input.Output[k] = j;
                    depth[j] += depth[k];
                }
            }
        }
    }
}
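As a sanity check on the path-compression loop, here is a standalone sketch of the same "path halving" idea with hypothetical names (this is not the class above): while walking to the root, each visited node is pointed at its grandparent, flattening the chain for future lookups.

```csharp
using System;

public static class PathCompressionDemo
{
    // Walk x up to its root; along the way, point each visited node at its
    // grandparent so the chain is shorter the next time it is traversed.
    public static int Find(int[] parent, int x)
    {
        while (x != parent[x])
        {
            parent[x] = parent[parent[x]]; // compress: skip one level
            x = parent[x];
        }
        return x;
    }

    public static void Main()
    {
        // A degenerate chain 0 -> 1 -> 2 -> 3 -> 4, where 4 is the root.
        int[] parent = { 1, 2, 3, 4, 4 };
        Console.WriteLine(Find(parent, 0));          // 4
        Console.WriteLine(string.Join(",", parent)); // 2,2,4,4,4
    }
}
```

A single Find has already halved the path, which is why repeated merges on large inputs stay nearly flat.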

MergePerformance.cs

using System;

namespace AzureFunctionsBlogDemos.Merging
{
    public class MergePerformance
    {
        public TimeSpan Runtime { get; set; }
        public string AlgorithmName { get; set; }
        public int NumberOfElements { get; set; }
        public string PartitionKey { get; set; }
        public string RowKey { get; set; }
    }
}

Enums.cs

namespace AzureFunctionsBlogDemos.Shared
{
    public class Enums
    {
        public enum MergeAlgorithms
        {
            QuickFind,
            QuickUnion,
            WeightedQuickUnion,
            WeightedQuickUnionWithPathCompression
        }
    }
}

Run and Test MergeTrigger

Now we should be able to hit F5 and run our new MergeTrigger which will write our messages to our ServiceBus Topic.

Grab the URL and we will launch Postman to send a test locally. First, change the HTTP verb to "POST". Then, change the "Content-Type" under "Headers" to "application/json".

Note: the two images below are reused from last week, so the URI points to WeightedQuickUnionWithPathCompression.

Next, we click into the "Body" tab and switch to "raw".
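The raw body should be JSON whose property names match MergingArray. The values below are just a small illustrative input, not the payload from the repository:

```json
{
  "NumberToUnionFrom": [ 0, 1, 2, 3 ],
  "NumberToUnionTo":   [ 1, 2, 3, 4 ]
}
```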

Now, we should be able to send a message to our MergeTrigger and see the response.

Commit to Git

Now, we're ready to make a commit from PowerShell, which will trigger another deployment.

git add -A
git commit -m "Added MergeTrigger."
git push origin master

It should take a minute or so and then your function will be visible. Grab the URL just like we did yesterday. We will now be able to update the local address in Postman and send the same request to the Function hosted in Azure.

View Messages Waiting for Dequeue

Lastly, we can head on over to our ServiceBus Topic and see the various messages waiting for dequeue in their respective subscriptions.

Conclusion

Today we added a ServiceBus namespace and topic so that we can send one message to our MergeTrigger HTTP Trigger and have the output of that message queued to be picked up in my next post. By doing this we are keeping our services small; ideally each one does only one thing. This sets us up to alter our WeightedQuickUnionWithPathCompression HTTP Trigger to pull the message from its subscription.

What's next: