Introduction

SEOs have been getting into our industry from all sorts of past careers: web designers, developers, marketers, business people and those that “just fell into SEO”. Some of these past positions may have required data analysis with Microsoft Excel, but a good many of them did not. Excel was not a big part of my past jobs, and I would guess that many SEOs’ past careers did not require anything more than adequacy in the program.

Over the last few years our field has become even more data-driven than in the past thanks to tools like Open Site Explorer, improved Google Webmaster Tools and Analytics, Majestic SEO, Raven, and many others. Additionally, our clients have become wiser and more SEO capable, having been burned in the past by snake oil SEOs. They want reliable SEO advice with a strong verifiable foundation, and who could blame them?

Many SEOs are now finding themselves faced with the task of doing fairly complex data analysis to improve their search strategies, and Excel adequacy is not quite enough. This was the position I found myself in not too long ago, and while I’m far from the deadliest ninja in the dojo today, I’ve picked up a thing or two from the brilliant minds around me.

With this document I intend to share some of the most valuable aspects of Microsoft Excel for the SEO. It is far from an exhaustive look at everything that can be done with Excel, but hopefully a strong foundation for the SEO’s toolkit. I’ll be including real world SEO tasks, ranging from the relatively simple to rather complex, so I hope there’s something for everyone. So if pivot tables, IF statements, absolute references and nested functions make you scratch your head, read on, Aspiring Ninja!

Oh! Before we begin, we must give credit where credit is due. This guide could not have been prepared without the help of some of Richard Baxter’s awesome blog posts and the official Microsoft Excel Help site.
Lesson 1: Basic Tasks

In this lesson we’ll cover some of the simpler functions available in Excel, and how they’re used in the SEO’s day-to-day tasks. The functions we’ll cover:

CONCATENATE

Text to Columns

COUNTIF

IFERROR

Concatenate

Microsoft Excel definition: Joins several text strings into one text string.

Syntax: CONCATENATE(text1,text2,…)

Concatenate is a pretty self-explanatory function, but that doesn’t make it any less useful. It is most often used to combine two cells into one. You may also use the formula to insert text strings before, after, or between other cells. Insert a text string by putting it within quotation marks.

Text to Columns

Microsoft Excel definition: Distribute the contents of one cell across separate columns.

Text to Columns’ functionality is a bit limited, but I find myself using it almost every time I open Excel. As an SEO, this is the go-to feature to separate subfolders or divide up subdomains, root domains, and/or TLDs. Unlike CONCATENATE, we won’t need a formula to carry out this function.

(Screenshots: Text to Columns as seen in Windows Excel 2007 and in Mac Excel 2011.)

For our real-world SEO example, let’s take an OpenSiteExplorer.org Top Pages report and suppose that we want to find which subfolders receive the most links.

1. The “http://” is not necessary. This can be removed with a find and replace now, or dealt with afterward.
2. Select “Delimited”.
3. Choose “Other” and type the “/” key.
4. Format cells (not necessary for this example) and select a destination (the default destination is usually fine).

Voila! We can now manipulate this data however we see fit. This is great for eCommerce sites with a nice URL structure of /category/subcategory/product/.

COUNTIF

Microsoft Excel definition: Counts the number of cells within a range that meet the given criteria.

Syntax: COUNTIF(range,criteria)

COUNTIF is your go-to function for getting a count of the number of instances of a particular string, for instance a given anchor text. Looking to get the count of empty anchor text instances? COUNTIF does the trick.

IFERROR

Microsoft Excel definition: Returns a value that you specify if a formula evaluates to an error; otherwise, it returns the result of the formula. Use IFERROR to trap and handle errors in a formula.

Syntax: IFERROR(value,value_if_error)

IFERROR is really simple and will become an important piece of most of our formulas as things get more complex. IFERROR is your method to turn those pesky #N/A, #VALUE! or #DIV/0! messages into something a bit more presentable. Divide by Zero? No Problem!

Thus concludes Lesson 1 of your Excel for SEO training. Congratulations, this makes you an orange belt. Here’s an orange belt that you can print, cut out, and wear around town.
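The Lesson 1 functions can be sketched with a few sample formulas. These are illustrations only: the cell references (A2, B2, C2:C500, D2, E2) and the “click here” anchor text are assumptions about how a sheet might be laid out, not taken from the original screenshots.

```excel
=CONCATENATE(A2,B2)
=CONCATENATE("http://",A2,"/",B2)
=COUNTIF(C2:C500,"click here")
=COUNTIF(C2:C500,"")
=IFERROR(D2/E2,0)
```

The first formula joins two cells; the second inserts quoted text strings before and between them. The two COUNTIF formulas count the cells in an assumed anchor text column that equal “click here” and the cells that are empty. The last formula returns 0 instead of #DIV/0! when E2 is zero or blank.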

Keep Your Data Organized With Tables! Turning your range into a table will give you banded rows for easier readability: Your data. Table-ified. Automatically calculated new columns AND structured/named references: Ooooooh. How intuitive! Now we just hit enter and our column is created and calculated. Table headers that remain atop data when scrolling: Reference tables by name in other formulas: To convert your range to a table use CTRL+T on Mac OS X or CTRL+L on a PC.
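As a rough sketch of what structured references look like once a range is a table, assume a hypothetical table named TopPages with columns called “External Links” and “Linking Root Domains” (these names are made up for illustration):

```excel
=TopPages[[#This Row],[External Links]]/TopPages[[#This Row],[Linking Root Domains]]
=SUM(TopPages[External Links])
```

The first formula would sit in a new column inside the table and is filled down automatically; the second can live anywhere in the workbook and refers to the whole column by name, so it keeps working as rows are added.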

Lesson 2: More Functions - Text Manipulation

The functions on which we’ll be focusing in this lesson are useful for dealing with text manipulation. As we’ll see from the examples, there are quite a few scenarios wherein the SEO has to manipulate a text string. Some of the formulas we’ll talk about are pretty simple to grasp individually, but can get a bit confusing when used together. We’ll touch on:

LEN

SEARCH/FIND

LEFT, RIGHT, MID

LEN

Microsoft Excel Definition: Returns the number of characters in a text string.

Syntax: LEN(text)

I doubt this requires much explanation. LEN alone is fairly useless. Sorry LEN.

SEARCH/FIND

Microsoft Excel Definition:

SEARCH — Returns the number of the character at which a specific character or text string is first found, reading left to right (not case-sensitive). FIND — Returns the starting position of one text string within another text string. FIND is case-sensitive. Syntax: SEARCH(find_text,within_text,start_num) and FIND(find_text,within_text,start_num) There are two differences between SEARCH and FIND. SEARCH is not case-sensitive, FIND is. SEARCH allows the use of wildcards, FIND does not. Under most circumstances, SEARCH is all you need, but it helps to know that FIND is always there if you’ve got to deal with pesky capital letters in URLs or something similar. Another reason to choose FIND is if you’re dealing with URLs that contain parameters. Without properly escaping question marks, they will act as wild cards, which may cause some frustration. In our example below, we’ve pulled out the character number at which the “/blog/” string begins. Much like LEN, this function is a bit silly on its own, but can be combined with some of our other functions to do some cool things. Now Class, we remember what we do with those #VALUE!s, don’t we? That’s right! Wrap an IFERROR around the formula! Nested Formulas — Don’t Be Scared! Also of note in the 2nd example above: This is the first time we’ve used what’s called a nested formula. We have these when a function is placed within another function, which can be placed in another function, and another, and so on. The more complex the nested formula becomes, the easier it becomes to break down. Whether you’re reviewing your own formulas for errors, or looking at someone else’s work, you should start with the middle of a nested formula and work your way out. Additionally, the F9 key is your friend! Trying to debug a formula that keeps breaking? To see the nested interior formula’s results, highlight and hit F9 Once you’re satisfied, hit ESC, otherwise the calculated result will remain. LEFT, RIGHT, MID Microsoft Excel Definition:

LEFT — Returns the specific number of characters from the start of a text string. RIGHT — Returns the specific number of characters from the end of a text string. MID — Returns the characters from the middle of a text string, given a starting position and length. Syntax: LEFT(text,num_chars) RIGHT(text,num_chars) MID(text,start_num,num_chars) Both LEFT and RIGHT return the characters from a given position in a text string starting from either side of a string. MID is great for extracting a portion of a text string. I’ve lumped the three together because they are often used in conjunction with each other (along with a few of the earlier functions). Let’s dive into an example: Bringing It All Together – Example 1 Let’s say we’ve been given a list of URLs, and we want to extract just the domain. This formula will do the job. Let’s break down this nested formula, and see how it pulls just the domain out of our URL. Starting from the middle we see SEARCH, which uses the syntax: SEARCH(find_text,within_text,start_num) In plain terms, this formula finds the first instance of “/” in the cell to the left, starting at the 8th character from the beginning, which is done to start past the double slash in http://. As we see below, the result for the first row of data is 22. The same formula with the inner function calculated Now we are left with a simple LEFT formula. The syntax for LEFT is LEFT(text,num_chars). In plain terms: Give us the first 22 characters starting from the beginning. We now have a nice listing of just root domains. Our list of root domains. The formula reflects the change to a table format from the simple range used previously. Example 2 Let’s use SEARCH (with wildcards) and MID together to extract a portion of a URL: Let’s assume we want to pull the descriptive piece out of each of these URLs for reporting purposes We’ll definitely be making use of MID, as the text we want is in the MIDdle of our string. 
We’ll need to determine how many characters make up the “-tXXX.html” bit at the end of each URL. Since the length of this portion of the URL varies, but the format doesn’t (that is, “-t” + “numbers” + “.html”), we can use wildcards to find where it begins. Again, the syntaxes for these 2 functions:

MID(text,start_num,num_chars)
SEARCH(find_text,within_text,start_num)

Let’s break down the formula for the first URL in our list.

Cell A2: http://www.example.com/lamp-maintenance-t83.html

=MID(A2,SEARCH("/",A2,8),SEARCH("-t*.html",A2)-SEARCH("/",A2,8))
=MID(A2,23,SEARCH("-t*.html",A2)-23)

We’ve calculated the first instance of a “/” starting from the 8th character. This gives us our start_num value. We’re also using the * wildcard to find the position at which the right-most chunk of text begins.

=MID(A2,23,SEARCH("-t*.html",A2)-23)
=MID(A2,23,40-23)

We can easily calculate the number of characters for our MID once we know where our non-descriptive characters begin.

=MID(A2,23,17)
/lamp-maintenance

Hooray!

Example 2.5

Let’s make a small adjustment to our original URL to demonstrate how we can use LEN in this formula.

Cell A2: http://www.example.com/t1521-lamp-maintenance.html

=MID(A2,SEARCH("-",A2)+1,LEN(A2)-SEARCH("-",A2)-5)
=MID(A2,29+1,50-29-5)
=MID(A2,30,16)
lamp-maintenance

The additional +1 and -5 are necessary to make minor adjustments to the final outcome: the +1 moves the starting point past the hyphen, and the -5 trims the five characters of “.html” from the end. Without them, our final result would have kept the leading hyphen and most of the file extension (“-lamp-maintenance.htm”).

This completes lesson 2. If you’ve made it this far, you’re fit to carry these around. Be careful though, you’ll take someone’s eye out!
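The screenshots for the earlier Lesson 2 formulas are not reproduced here, so the following is a minimal sketch of what they might look like, assuming the URL sits in cell A2 (the cell reference and the “No match” text are assumptions):

```excel
=LEFT(A2,SEARCH("/",A2,8))
=LEFT(A2,SEARCH("/",A2,8)-1)
=IFERROR(SEARCH("/blog/",A2),"No match")
=SEARCH("~?",A2)
=FIND("?",A2)
```

For http://www.example.com/lamp-maintenance-t83.html the first formula returns http://www.example.com/ and the second drops the trailing slash. Starting the search at character 8 assumes an http:// prefix; for https:// URLs the start would need to be 9, and a URL with no slash after the domain makes SEARCH return #VALUE!, which is where the IFERROR wrapper in the third formula helps. The last two formulas both locate a literal question mark: SEARCH needs the tilde to escape the wildcard, while FIND treats it as a plain character.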

Save Some Time With This Pro Tip! Sometimes you want to grab a range of calculated data and place it elsewhere. However, if that data relies on a formula in place it will fall apart once pasted. In this case, you’d like to copy just the values created by these formulae. The slow way is to copy your data, and paste as values using Excel’s ‘Paste Special’. Here’s an even quicker way: Highlight your range of data and hover over the black line until you see the above cursor. Now, right click and drag to your desired destination. You can actually hover away from the source, then back over it to replace your formulas, or place it elsewhere. Release your right click, select ‘Copy Here as Values Only’ and you’re done!

Lesson 3: IF, OR/AND

Let’s slow things down a bit and learn some new helper functions. These guys aren’t going to uncover any gold data nuggets on their own, but they can help to organize and present data in your tables, pivot tables and charts.

Let’s start with a list of internal URLs, perhaps from the top content report in Google Analytics. We’d like to find out which type of page is seeing the most traffic: the homepage (/), the blog home (/blog/), blog posts (/blog/{post}), category pages (/blog/{category}/), a services page (there are 4), or other.

Ultimately, there are probably a ton of different ways to go about getting this information from our raw data, but almost every method will involve IF.

IF

Microsoft Excel Definition: Checks whether a condition is met, and returns one value if TRUE, and another value if FALSE.

Syntax: IF(logical_test,value_if_true,value_if_false)

IF by itself is quite simple. It can get a bit confusing when nested (see below) and when combined with other functions (later). In a nested IF, the value_if_false of each IF statement is replaced with another IF statement.

Back to our Example

Using a nested IF, we should be able to create a “Page Type” column in our spreadsheet and apply it to our URLs. We’ll take it one step at a time. Let’s determine if the URL is for the blog or the home page, since those two URLs are quite simple to identify and only occur once. Since the URL for the homepage is “/” and the blog home is “/blog/”, this formula will work:

Note: The table in the following formulas has been named TC, which is reflected in the prefix before the cell references.

=IF(TC[[#This Row],[Page]]="/","Home",IF(TC[[#This Row],[Page]]="/blog/","Blog Home","Other"))

Now let’s find some way to classify a URL as a blog post or a blog category page.
All of the blog post URLs on the Distilled domain follow this format:

/blog/{category-name}/{optional-sub-category}/{post-name}/

And all of the blog categories follow this format:

/blog/category/{category-name}/{optional-sub-category}/

So we’ll need to add to our growing IF formula. In plain English, our formula must check to see if the URL contains /blog/category/, in which case it is a blog category page. If not, we’ll get less specific and check to see if it contains /blog/, in which case it is a blog post. Our formula now looks like this:

=IF(TC[[#This Row],[Page]]="/","Home",IF(TC[[#This Row],[Page]]="/blog/","Blog Home",IF(ISNUMBER(SEARCH("/blog/category/",TC[[#This Row],[Page]])),"Blog Category",IF(ISNUMBER(SEARCH("/blog/",TC[[#This Row],[Page]])),"Blog Post","Other"))))

In the above, after Excel has checked that the given URL isn’t the home page or blog home, it does a SEARCH for “/blog/category/”. We then make use of the ISNUMBER function to see if that inner SEARCH function is returning a number or a #VALUE! error (#VALUE! being the result if the string is not found). ISNUMBER simply checks to see if the given value is a number, and returns TRUE or FALSE.

Almost there! For our final piece of the equation we want to classify 4 of our URLs as “Services”. Those 4 URLs are /pay-per-click.html, /online-reputation.html, /search-engine-optimisation.html, and /web-design.html. If we can do this properly, every other URL that hasn’t been classified will become “Other” and our work will be done! We’ll be adding the OR function to our formula now, so a brief intro, including AND:

OR/AND

Microsoft Excel Definition:

OR: Checks whether any of the arguments are TRUE, and returns TRUE or FALSE. Returns FALSE only if all arguments are FALSE.

AND: Checks whether all arguments are TRUE, and returns TRUE if all arguments are TRUE.

Syntax:

OR(logical1,logical2,…)
AND(logical1,logical2,…)

In the context of our example, neither OR nor AND is strictly necessary, but both help keep our formula under control. Without these functions we could certainly add 4 separate IF(ISNUMBER(SEARCH’s, one for each URL, but our formula is getting long enough already. Using OR we can simplify slightly while also keeping the formula a bit closer to how we would verbally communicate the formula’s steps.

Back to our Example

Now, we’ll pick up where we left off by replacing the value_if_false of the last IF statement with the next step. The new part of our formula looks like this:

IF(OR(ISNUMBER(SEARCH("/web-design.html",TC[[#This Row],[Page]])),ISNUMBER(SEARCH("/search-engine-optimisation.html",TC[[#This Row],[Page]])),ISNUMBER(SEARCH("/online-reputation.html",TC[[#This Row],[Page]])),ISNUMBER(SEARCH("/pay-per-click.html",TC[[#This Row],[Page]]))),"Services","Other")

As is hopefully evident, the four ISNUMBER(SEARCH’s serve as the logical1, logical2, logical3, and logical4 of the OR function. As the OR function goes, if any of those logical arguments are true, the entire statement returns TRUE, and the IF statement returns the value_if_true value.

I won’t be giving a specific AND example, but hopefully its utility is fairly obvious.

Our mini-project is done! We’ll discuss how to go about getting the totals for each page classification in our pivot table lesson.

This lesson is complete, congratulations. These ninja stars will go well with your new IF statement abilities!
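Putting the pieces of this lesson together, the complete “Page Type” formula might look like the sketch below. It simply swaps the new OR section in for the final "Other" of the earlier formula; the line breaks are only there for readability (Excel generally accepts line breaks and spaces between arguments, entered with ALT+ENTER in the formula bar).

```excel
=IF(TC[[#This Row],[Page]]="/","Home",
 IF(TC[[#This Row],[Page]]="/blog/","Blog Home",
 IF(ISNUMBER(SEARCH("/blog/category/",TC[[#This Row],[Page]])),"Blog Category",
 IF(ISNUMBER(SEARCH("/blog/",TC[[#This Row],[Page]])),"Blog Post",
 IF(OR(ISNUMBER(SEARCH("/web-design.html",TC[[#This Row],[Page]])),
       ISNUMBER(SEARCH("/search-engine-optimisation.html",TC[[#This Row],[Page]])),
       ISNUMBER(SEARCH("/online-reputation.html",TC[[#This Row],[Page]])),
       ISNUMBER(SEARCH("/pay-per-click.html",TC[[#This Row],[Page]]))),
    "Services","Other")))))
```

For an idea of where AND could fit, here is a hypothetical variation (not part of the original walkthrough) that labels a URL as a blog post only when it contains /blog/ and does not contain /category/:

```excel
=IF(AND(ISNUMBER(SEARCH("/blog/",TC[[#This Row],[Page]])),NOT(ISNUMBER(SEARCH("/category/",TC[[#This Row],[Page]])))),"Blog Post","Other")
```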

Killer Keyboard Shortcuts!

As an Excel ninja, sometimes you really should just leave the mouse out of it. Besides, there’s no such thing as a slow ninja. That’s where keyboard shortcuts come in. Below are some of the most useful shortcuts for the Excel ninja.

Note: These shortcuts are based on the Windows Excel 2007 version, and may (likely) be different for Mac and/or Excel 2010.

Shortcut: Function
F2: Edit the selected field
F4: (When editing a field) Toggle between absolute and relative references (i.e. A1, $A$1, A$1, $A1)
F9: Debug/evaluate highlighted piece of formula
ALT,A,R,A: Refresh all. Most useful when changing source data of pivot table
ALT,H,D,L: Delete table row
CTRL+PageUp/Down: Move between work sheets
CTRL+SHIFT+*: Select the current region around the active cell. Great for selecting a whole table when only one cell within the table is selected
SHIFT+F3: In a formula, display the Insert Function dialog box
CTRL+HYPHEN: Delete selected cells
CTRL+SPACEBAR: Select the entire column
SHIFT+SPACEBAR: Select the entire row
CTRL+L: Create a table
ALT,H,V,V: Paste as values

Lesson 4: VLOOKUP, OFFSET, INDEX/MATCH

Here in lesson 4 we’ll be talking about some lookup and reference functions. These guys are great for matching up data between two sources, for example, combining keyword volume (data from the Google Keyword Tool) with current rankings (data from your rank checker of choice). In fact, this is the exact example we’ll use to demonstrate our first function…

VLOOKUP

Microsoft Excel Definition: Looks for a value in the leftmost column of a table, and then returns a value in the same row from a column you specify. By default, the table must be sorted in an ascending order.

SYNTAX: VLOOKUP(lookup_value,table_array,col_index_num,[range_lookup])

Combining search volume and current ranking of a keyword list is a great way to prioritize upcoming SEO efforts, and can be done at any stage in a website’s life. VLOOKUP makes this a painless process.

With our targeted keyword list in hand, we head over to the Google Keyword Tool to grab search volumes. After cleaning up the output from the GKT, we’ve got our list of keywords and search volume.

Next we’ll run rankings for our site using whichever ranking tool we choose, and export that into Excel. What’s that they say about a company not eating their own brand of dog food? Yeesh.

After turning our ranges into tables named “Volume” and “Rank” we can combine our two datasets into one pretty table with VLOOKUP. In our “Volume” table, we’ve created a new column and used the following formula to pull in the appropriate data from the “Rank” table.

SYNTAX: VLOOKUP(lookup_value,table_array,col_index_num,[range_lookup])

=VLOOKUP(Volume[[#This Row],[Keyword]],Rank,2,FALSE)

In plain English: Find this exact (range_lookup) keyword (lookup_value) in the “Rank” table (table_array), and return what is in the second column (col_index_num) of that “Rank” table. Remember, in that “Rank” table, the actual rank is in the 2nd column.
The [range_lookup] portion of the VLOOKUP function is an optional value that specifies whether we want to find an exact match or an approximate match. If range_lookup is TRUE or omitted (for an approximate match), the values in the first column of table_array must be sorted in ascending order. If range_lookup is FALSE (for an exact match), the table_array does not need to be sorted.

In my personal SEO + Excel history, I’ve never set range_lookup to TRUE. In other words, I’m really only ever looking for an exact match. This simplifies things when it comes to that nasty ascending sort order requirement. If you see a bunch of #N/As when using VLOOKUP, check to see if your range_lookup is either omitted or set to TRUE.

OFFSET

This function is another doozy that I end up using over and over for one specific SEO task: analyzing compared date ranges from a Google Analytics CSV export. That’s not to say this function is a one trick pony; I’d be surprised if you don’t find a need for it once it’s in your toolbox.

Microsoft Excel Definition: Returns a reference to a range that is a given number of rows and columns from a given reference.

SYNTAX: OFFSET(reference,rows,columns,[height],[width])

After cleaning out the extraneous data we have our compared date range data from Google Analytics. In particular, we have referring keyword traffic from one week compared to the previous. Now, if we’re just comparing these 4 (gibberish) keywords we’d be ok, but since we’re actually looking at thousands of keywords it won’t be easy to see through the clutter. Using OFFSET we can turn that raw export into a tidy table with one row per keyword.

Getting our first row of ‘Keyword’, ‘Week 1’, and ‘Week 2’ is simple enough, but how can we get Excel to populate the rest? Let’s break the formula down piece by piece, starting with how we’re relocating our ‘Keyword’ column.
SYNTAX: OFFSET(reference,rows,columns,[height],[width])

=OFFSET($A$2,(ROW($A3)-2)*4,0)
=OFFSET($A$2,(3-2)*4,0)

ROW is a simple function that just returns the number of the row a cell is in. You’ll notice that only the column (A) is set as an absolute reference. We want the row number to increase as we fill the formula down each row.

=OFFSET($A$2,4,0)
UNMATRIMONIALLY HEMATOLOGIC EVENNESS

In our example, OFFSET outputs the cell that comes 4 rows beyond the reference point.

=OFFSET($A$2,(ROW($A4)-2)*4,0)
=OFFSET($A$2,8,0)
SUPERSERAPHICALLY UNCALIBRATED ITINERARY

The next time the row is calculated, our references will incrementally rise, allowing the formula to adjust according to our raw data. With some fancy math, we’ve been able to make the row value jump 4 each time the formula moves to the next row. For our ‘Week 1’ and ‘Week 2’ columns, the formula has changed slightly, but the basic premise remains the same.

INDEX/MATCH

INDEX/MATCH is like a more powerful version of VLOOKUP, but it is a bit harder to wrap one’s head around. In most cases they are interchangeable, but where VLOOKUP falls short, INDEX/MATCH gets the job done.

VLOOKUP’s Shortcomings

The biggest issue with VLOOKUP lies in the first few words of the Microsoft Excel definition: Looks for a value in the leftmost column of a table, and then returns a value in the same row from a column you specify. In instances where the key field is not in the left-most column, VLOOKUP will not work.

Let’s say we have a list of the referring keywords of the past 90 days from Google Analytics. For budgeting and reporting purposes, we’ve classified each keyword as branded, head, mid, or long-tail in a “Category” column in the left-most position of our table. If we were attempting to budget traffic for the next 90 days on another sheet, VLOOKUP would disappoint.

Now, we could certainly go back and edit our source table to make VLOOKUP work by moving our keyword column to the left-most column, but INDEX/MATCH is really all we need.
Let’s first cover what INDEX and MATCH actually do when used alone.

Microsoft Excel Definition:

INDEX: Returns a value or reference of the cell at the intersection of a particular row and column, in a given range.

MATCH: Returns the relative position of an item in an array that matches a specified value in a specified order.

SYNTAX:

INDEX(reference,row_num,column_num,[area_num])
MATCH(lookup_value,lookup_array,match_type)

Both of these functions are actually quite simple alone, but do something really powerful when used together. The beauty of Excel!

With INDEX, Excel simply returns the value that is found at the intersection of the row and column you’ve supplied. For example: Return the value at the intersection of the 3rd row and the 4th column. The value returned is “16”.

With MATCH, we can find the relative row that matches our lookup_value. For example: Find the row in which “14” is found within this single column. The value of “3” is returned.

Moving back to our earlier example, we can combine INDEX and MATCH to pick up the slack where VLOOKUP fails. In plain English: Return the value found at the intersection of the 4th column and the row where the given lookup_value can be found.

For the first row of our table: Find “seo” in the Keyword column of the Keyword_Table and report its row (this is the MATCH portion). Now, return the value found in the 4th column of that table’s row (this is the INDEX portion).

There you have it! Sure, VLOOKUP gets the job done 90% of the time, but INDEX/MATCH does 100% of the time, so why not master the one that works all the time!

You’re getting Oh So Close to becoming a certified Excel Ninja! For completing this lesson, you’ve been awarded this completely awesome ninja sword!

Why Leave Excel When We Have Excellent Analytics!?
If you’re doing some analysis in Microsoft Excel that calls for some Analytics data you can easily open up your browser, login to Google Analytics, create the report you’d like, export that report into a CSV, copy that data from the newly created spreadsheet, and finally paste into a new sheet in your current workbook… Oops, hmm, that wasn’t really all that easy, was it? There’s Got to Be a Better Way! Well there is, friends, and it’s called Excellent Analytics (http://excellentanalytics.com). With Excellent Analytics, you can automatically import Analytics data into your current spreadsheet without leaving Excel. There are a few quirks, but once you’ve gotten the hang of using the Excel plugin, you’ll save tons of time. Unfortunately, at the time of this report’s creation, Excellent Analytics only works with Excel 2007 for Windows.
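Looking back at Lesson 4, the lookup formulas discussed there can be sketched as follows. The first line is the VLOOKUP from the lesson wrapped in IFERROR; the “Not ranking” text is an assumption about what you might want to show for unmatched keywords. The INDEX/MATCH lines follow the plain-English description of the Keyword_Table example; the “Visits” column name and the A2 lookup cell are hypothetical.

```excel
=IFERROR(VLOOKUP(Volume[[#This Row],[Keyword]],Rank,2,FALSE),"Not ranking")
=INDEX(Keyword_Table,MATCH("seo",Keyword_Table[Keyword],0),4)
=INDEX(Keyword_Table[Visits],MATCH(A2,Keyword_Table[Keyword],0))
```

In the second formula, MATCH (with a match_type of 0 for an exact match) reports the row in which “seo” appears in the Keyword column, and INDEX returns the value in the 4th column of that row. The third formula does the same thing but points INDEX at a single named column, which avoids counting columns and keeps working if the table’s column order changes.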