A while ago we told you to quit worrying about SEO and optimise your website for your users, not machines.

While you want to develop your website with the end user (and not Google) in mind, there are things you can do to make it easier for search engines to find and return your pages for relevant search queries, helping your target audience to find you.

One way to do this is to ensure that search engines “understand” what they are indexing. This comes down to how your developers describe the information they output onto your web pages.

In this article we will discuss the basics of semantic markup and give you the knowledge you need to discuss your semantic requirements with your development team.

What is semantic markup?

According to Wikipedia, the underlying concept of semantic markup is:

“…to make a distinction between the actual meaning of a document, and how this meaning is presented to its readers.”

“Languages” like CSS and JavaScript are used to shape the look, feel and interactions of a website, whereas semantic markup adds the meaning that browsers and search engines need in order to understand and process the information on web pages.

“Rich snippets” as an example

How often do you come across search results with extra information, such as reviews for products you’re looking to buy, or preparation time for recipes, right there in the search results? This extra information is known as a “rich snippet”.

This search result for “pancake recipe” shows the searcher a photo, average rating, preparation time and calorie count. How did BBC Good Food get their search result to look like this?

Using Google’s interpretation of semantic markup to fully explain the recipe has made it possible for Google to understand the information on the page and pull it into search results.

Compare the BBC Good Food search result to this one for Sofeminine’s pancake recipe…

Which one would you be more likely to click on?

Google supports Rich Snippets for various types of content and accepts the markup for them in RDFa, Microformats and Microdata. If you feature any of the following information on your website, it is worth taking a good look at them:

We’ll take a look at what these formats actually are soon… But first, let’s talk about accessibility…

Accessibility

Screen readers, which translate text into speech for visually impaired web users, are better able to communicate web pages to users if the pages are semantically well structured. For example, a recipe page marked up for Rich Snippets helps browsers, search engines and accessibility software to understand that the page is a recipe with 9 ingredients that takes 30 minutes to prepare.

Now that we understand the importance of the semantic web, let’s take a look at a few ways that your developers can create your pages to be semantically correct…

Types of semantic markup

Standard HTML tags

The most basic form of semantic markup is the standard semantic HTML tags you are (hopefully) already using.

For example, using <h1> tags lets search engines easily identify a top level title, the most important heading(s) on your page, so they can return your page for relevant search queries. A <table> tag lets search engines, browsers and accessibility software know that the information within the tag is tabular data (like a spreadsheet) and should be displayed or communicated accordingly.
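As a rough illustration of the two tags just described, a page might be marked up along these lines. The recipe title and the figures in the table are invented for the example, not taken from any real page.

```html
<!-- Illustrative content only: the title and values are made up -->
<h1>Easy pancakes</h1>

<table>
  <caption>Nutrition per serving</caption>
  <thead>
    <tr>
      <!-- <th> marks header cells, so software can tell labels from data -->
      <th scope="col">Nutrient</th>
      <th scope="col">Amount</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Calories</td>
      <td>60 kcal</td>
    </tr>
  </tbody>
</table>
```

The same content could be laid out with generic <div> tags and look identical on screen, but the heading and the tabular structure would then be invisible to machines.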

You can find more in-depth information on semantic elements in HTML5 here.

While HTML can be used to specify a section of a web page and its structure, it still doesn’t explain exactly what the information means. For example, your title might be <h1>Batman</h1>. Search engines and browsers will be able to understand that this is a web page about Batman, but they won’t know whether you’re talking about the superhero in general, the TV series, a movie, Batman merchandise, etc. To fully describe the information on your web pages, you will need to use an additional language…

Microformats, RDFa and Microdata

Aside from basic semantic HTML, there are 3 languages your developers can use to describe the context of your web pages to machines like search engines, browsers and accessibility software – Microformats, RDFa and Microdata.

Microformats

“Microformats are a collection of vocabularies for extending HTML with additional machine-readable semantics.” – HTML5doctor.com

Microformats are the longest established of the three languages, working seamlessly in a website’s existing HTML code.

Head over to Google Webmaster Tools to take a look at Microformats in action.
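To give a feel for how this looks in practice, here is a minimal sketch using hCard, the Microformat for contact details. The class names (vcard, fn, org, tel, url) are defined by hCard; the person, company and phone number are placeholders.

```html
<!-- hCard sketch: class names come from the Microformat,
     the contact details are placeholders -->
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Ltd</span>
  <span class="tel">+44 20 7946 0000</span>
  <a class="url" href="http://www.example.com/">www.example.com</a>
</div>
```

Because Microformats reuse the ordinary class attribute, the page looks no different to visitors; only the class names tell a machine that this block describes a person.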

Resource Description Framework in Attributes (RDFa)

RDFa was developed to be a bit more flexible than Microformats. Where Microformats are predefined, with each Microformat dealing with a defined set of data (e.g. the hCard Microformat deals with data about people), RDFa allows developers to choose or define their own vocabularies for the properties on a web page.

“Whereas microformats specify both a syntax for embedding structured data into HTML documents and a vocabulary of specific terms for each microformat, RDFa specifies only a syntax and relies on independent specification of terms (often called vocabularies or taxonomies) by others. RDFa allows terms from multiple independently-developed vocabularies to be freely intermixed and is designed such that the language can be parsed without knowledge of the specific vocabulary being used.” – W3.org

Check out RDFa in action over at Google Webmaster Tools.
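As a hedged sketch of the difference, the snippet below describes a person with RDFa, borrowing its terms from FOAF, an independently published vocabulary. It assumes the RDFa 1.1 attribute syntax (prefix, typeof, property); the person and URL are placeholders.

```html
<!-- RDFa 1.1 sketch: the prefix attribute declares which vocabulary
     the "foaf:" terms belong to; the details are placeholders -->
<div prefix="foaf: http://xmlns.com/foaf/0.1/" typeof="foaf:Person">
  <span property="foaf:name">Jane Doe</span>
  <a property="foaf:homepage" href="http://www.example.com/">www.example.com</a>
</div>
```

Swapping in a different vocabulary, or adding a second prefix alongside this one, changes the terms but not the syntax, which is the flexibility the W3C quote above describes.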

Microdata and Schema.org

Microdata is the newest of the three languages.

“Using attributes, we can define nestable groups of name-value pairs of data, called microdata, which are generally based on the page’s content. It gives us a whole new way to add extra semantic information and extend HTML5.” – html5doctor.com

In 2011, Google announced Schema.org, a joint initiative by Google, Microsoft and Yahoo! to standardise semantic markup. The idea is that a shared markup vocabulary “makes it easier for webmasters to decide on a markup schema and get maximum benefit from their efforts”. Schema.org uses Microdata and is understood by all of the participating search engines, meaning webmasters do not have to worry about whether their markup is supported by a specific search engine.

Since the announcement of Schema.org, Google has recommended using Microdata over Microformats and RDFa…

“Historically we’ve supported three different standards for structured data markup: microdata, microformats, and RDFa. We’ve decided to focus on just one format for schema.org to create a simpler story for webmasters and to improve consistency across search engines relying on the data. There are arguments to be made for preferring any of the existing standards, but we’ve found that microdata strikes a balance between the extensibility of RDFa and the simplicity of microformats, so this is the format that we’ve gone with. If you’ve already done markup on your pages using microformats or RDFa, we’ll continue to support it. One caveat to watch out for: while it’s OK to use the new schema.org markup or continue to use existing microformats or RDFa markup, you should avoid mixing the formats together on the same web page, as this can confuse our parsers.”
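To tie this back to the recipe example from earlier, here is a minimal sketch of a recipe marked up with Microdata and the Schema.org vocabulary. The attributes (itemscope, itemtype, itemprop) are Microdata, and the property names (name, prepTime, recipeIngredient, aggregateRating) come from Schema.org’s Recipe type; the recipe, rating and review count are invented for illustration.

```html
<!-- Microdata sketch using Schema.org's Recipe type; all values are made up -->
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Easy pancakes</h1>

  <!-- The datetime value is an ISO 8601 duration: PT30M means 30 minutes -->
  <p>Ready in <time itemprop="prepTime" datetime="PT30M">30 minutes</time></p>

  <ul>
    <li itemprop="recipeIngredient">100g plain flour</li>
    <li itemprop="recipeIngredient">2 eggs</li>
    <li itemprop="recipeIngredient">300ml milk</li>
  </ul>

  <!-- A nested item: the rating is its own Schema.org type -->
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    Rated <span itemprop="ratingValue">4.5</span>/5 from
    <span itemprop="reviewCount">120</span> reviews
  </div>
</div>
```

Markup like this gives a search engine what it needs to build a rich snippet, although whether one actually appears in the results is up to the search engine.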

Summary

While you should create your web pages with the end user in mind, you need to make sure that the information on your pages can be read and understood by search engines, browsers and accessibility software. This will help people searching for your content to find it and improve the user experience of your pages for people with disabilities.

Although Google recommends using Microdata to add meaning to your pages, it also recognises Microformats and RDFa as suitable approaches. Consistency is key, so make sure your development team picks one format and sticks to it.

Make sure you work with a development team you trust and who understand the importance of the semantic web. Not every developer will write semantically correct code without being asked, so make sure you discuss your requirements before getting started on your project. It will save you a lot of development time in the long run!