A Kaidez Article

HTML5, SEO and Microdata

TOPIC: HTML5

MUCH thanks to Oli Studholme at HTML5 Doctor for helping me understand this!

Update February 21, 2011: Oli looked at this post and suggested some code and semantics changes. Simply put, there are a lot of semantic mistakes in this article. The code below was changed as per his suggestions, but the semantic issues were many; so many that it was easier to create a new post listing them than to edit this article. Review the code below, as it contains his edits, then read my post listing his semantic suggestions. – k

I’ve learned a few things about how HTML5 handles search engine optimization, or SEO. The main thing I’ve learned is that we all need to fully understand microdata, since Google uses it to collect detailed information about your web page.

While I’m still learning about microdata, I understand 95% of it…and let me be clear from the beginning about what I do understand:

  • Microdata’s main job is to provide extra information about your site to search engines and machine readers so they can better understand the site content.
  • Microdata MUST be written in 100% pure HTML5.
  • Microdata does not make page content more meaningful or more keyphrase-rich.
  • As with all SEO best practices, Microdata does not guarantee you a high Google site ranking.

Hope I was clear. Now let’s move on…

Often called “HTML5’s best kept secret,” microdata allows you to place a custom vocabulary of data onto your web page. “If the microdata uses a Google ‘rich snippet’ vocabulary, it may also be used by Google.”

Let’s see it in action:

I recently created this test page with the following code:

<!--IMPORTANT POINT: On my About page, the code below is placed into
a <div> tag that's placed into another <div> tag which contains all the
page copy.  If my About page was a properly-formatted HTML5 page, the
copy would go into an <article> tag, the microdata would go into
an <aside> tag, and all of this would go into a <section> tag.-->

  <section itemscope itemtype="http://www.data-vocabulary.org/Person">

  <img itemprop="photo" class="me" width="80" height="80" src="http://en.gravatar.com/userimage/4528928/87cc8430c1f9a5c3b809cdde885f565a.jpg"  alt="[Kai Gittens, circa 2010]">

  <h1 class="entry-title">About Kai Gittens, AKA Kaidez</h1>
  <br />
  <h2>Posted by Kai Gittens on January 24th</h2>
  <br />
  <h2 class="updated">
 

Let’s break down the code…

  • Note that the opening <section> tag has two attributes: itemscope and itemtype.
  • itemscope tells the browser that everything within the <section> tags is microdata and should be treated as such.
  • itemtype attaches the microdata to the “Person” vocabulary library stored at data-vocabulary.org.
  • For every piece of data within the <section> tag (name, address, etc.) an itemprop attribute needs to be applied to it. Look at the code and assign the values the same way I did.
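
Since the code above is cut short, here’s a minimal, self-contained sketch of the same pattern. The property names (name, nickname, photo, title, url) are the ones I understand the “Person” vocabulary to define, and the itemtype URL is written the same way as in my code above. The image path, job title and link are placeholder values made up for illustration, so swap in your own.

```html
<!-- A minimal sketch of a microdata "Person" item; all values are placeholders -->
<section itemscope itemtype="http://www.data-vocabulary.org/Person">

  <!-- photo: on an <img>, the property value is taken from the src attribute -->
  <img itemprop="photo" src="me.jpg" width="80" height="80" alt="[Kai Gittens]">

  <!-- name and nickname: the value is the text content of each <span> -->
  <h1>About <span itemprop="name">Kai Gittens</span>,
    AKA <span itemprop="nickname">Kaidez</span></h1>

  <!-- title: a job title (hypothetical value) -->
  <p itemprop="title">Web Developer</p>

  <!-- url: on an <a>, the property value is taken from the href attribute -->
  <p><a itemprop="url" href="http://example.com/">My site</a></p>

</section>
```

Notice that each itemprop sits on the smallest element that wraps the value. That’s why the name gets its own <span> instead of the attribute going on the whole <h1>.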

All the data is then sent to Google and if it comes up in their search results, it will look like this:

Kaidez Microdata Screenshot

If you need more proof of this result, see what information comes back when my test page is plugged into Google’s Rich Snippets Testing Tool.

I’ve done a variety of Google searches trying to get this snippet to come up…no luck yet. But I’m confident that it will eventually and know that the microdata is still doing things behind the scenes.

Microdata isn’t really a new concept: it’s similar to existing technologies such as RDFa and microformats. But RDFa needs to be written in XHTML, which is headed for W3C deprecation, while microformats rely on CSS class names, meaning you’ll have to write extra code. Getting microdata to work requires writing non-deprecated HTML5 code and nothing else.
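
To make the comparison concrete, here’s a rough sketch of the same two facts (a name and a nickname) written in all three syntaxes. The microformats version uses the hCard class names. The RDFa version uses the namespace and property names as I remember them from Google’s rich snippets documentation, so treat those as assumptions and check them before copying.

```html
<!-- microformats (hCard): meaning is carried by agreed-upon class names -->
<div class="vcard">
  <span class="fn">Kai Gittens</span>
  (<span class="nickname">Kaidez</span>)
</div>

<!-- RDFa: meaning is carried by typeof/property plus a namespace prefix
     (namespace URL is an assumption; verify against the docs) -->
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  <span property="v:name">Kai Gittens</span>
  (<span property="v:nickname">Kaidez</span>)
</div>

<!-- microdata: meaning is carried by itemscope/itemtype/itemprop -->
<div itemscope itemtype="http://www.data-vocabulary.org/Person">
  <span itemprop="name">Kai Gittens</span>
  (<span itemprop="nickname">Kaidez</span>)
</div>
```

All three describe the same person. The difference is only in which attributes carry the meaning.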

Speaking of microformats using CSS classes, here’s a quick FYI: placing the above code into a Twenty Ten-themed WordPress page will still give you a positive result in the Rich Snippets tool, but it will generate a warning saying that the author class is missing…see the results with the warning here. That’s because Twenty Ten, which is HTML5-ready, uses a lot of the same CSS classes as the ‘hatom’ feed format, which is similar to an RSS feed.

I plugged the code into this blog’s About Page and got that warning. Since this blog design is based on Twenty Ten and uses hatom classes like entry-title and entry-content, the presence of these classes is forcing the Snippet tool to look for hatom feed content in my About page. And as the lack of an author class makes the hatom data incomplete, the error shows up. I could fix this by putting a tag with a class named author somewhere on my post pages, but I’m happy with my design so I’m not going to do this.
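
For anyone who does want to clear that warning, here’s a sketch of what the missing piece might look like. The hAtom format expects the author to be marked up as an hCard, which is why the vcard and fn classes appear alongside author. Where exactly this goes in a Twenty Ten template is my assumption: it should sit somewhere inside the element that already carries the entry’s hatom classes.

```html
<!-- Hypothetical fix: an hAtom author marked up as an hCard.
     Place it inside the same entry that has entry-title and entry-content. -->
<span class="author vcard">
  Posted by <span class="fn">Kai Gittens</span>
</span>
```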

If you want to get a feel of how much microdata is out there, check out the Operator plug-in for Firefox. It inserts a toolbar that detects microdata along with RDFa and Microformats, showing you what data is being collected and its potential use…especially for e-commerce. Plug in Operator, then do some general web surfing while paying attention to the toolbar…you’d be surprised what you’ll find out.

For further reading, the awesome HTML5 microdata tutorial at HTML Goodies and HTML5 Doctor’s microdata article are great starting points on the subject. After that, read what Mark Pilgrim has to say about microdata and definitely read Google’s microdata documentation.

Some other HTML5 SEO things…

  • Microdata only works if placed into a page with the bare minimum of HTML5 formatting. Read my tutorial on this.
  • If you try to hide microdata on your page (i.e. putting it into a page tag set to display:none), Google will totally ignore it.
  • Microdata must be placed in the main content of your web page; if you place it among header or footer content, Google will totally ignore it.
  • (Struck through; see the updates below.) Bing and Yahoo! use microdata as well. To be fair, Bing was the first search engine to use it…go and read more about this.
    • Update (Jan 28, 2011): this may or may not be true…am in the middle of verifying this.
    • Update (Feb 02, 2011): Have to strike through this line…see this comment below.
  • “Person” is just one of nine vocabularies currently stored over at data-vocabulary.org. The complete list as of this post is: Person, Event, Organization, Product, Review, Review-aggregate, Breadcrumb, Offer and Offer-aggregate. Go and read about all of them.
  • Note that one of the vocabularies is named “Product.” So if you’re selling stuff online, microdata can help you.
  • If you want reviews of your products to appear in search results, take note of the “Review” and “Review-aggregate” vocabularies.
  • One of the best uses of microdata is to create a well-designed digital business card, or a vCard. Many web designers use this tactic; see some of their work. Heck, I may do this some day!
  • Even though I use the <section> tag in my example, don’t read too much into this. Google has adopted a “wait-and-see” policy in terms of applying HTML5 elements to their search algorithm. Many believe that the <article> tag will eventually get a lot of SEO weight since it’s the main spot for page content.
  • Microdata does not diminish the importance of the old rules of SEO. Commit the info in Google’s Webmaster Tools documentation to memory and remember that good, relevant content is always the best way to get a good site rank. Also remember that the meta keywords tag is completely worthless, just like Google says it is.
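
To give a feel for the review vocabularies mentioned in the list above, here’s a small sketch of a “Review-aggregate” item. The property names (itemreviewed, rating, count) are as I recall them from Google’s rich snippets documentation, so double-check them there. The product name and numbers are invented placeholders, not real data.

```html
<!-- A sketch of aggregate review markup; product and figures are placeholders -->
<div itemscope itemtype="http://www.data-vocabulary.org/Review-aggregate">
  <!-- itemreviewed: the thing the reviews are about -->
  <span itemprop="itemreviewed">Example Widget</span>:
  <!-- rating: the average score; count: how many reviews it is based on -->
  <span itemprop="rating">4.5</span> stars,
  based on <span itemprop="count">12</span> reviews
</div>
```

Keep in mind the earlier points from the list: this markup should be visible and sit in the main content of the page, not hidden with display:none.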

In closing, remember that microdata works if used properly. So let’s all now take a blood oath and promise not to use it to create spam bait and ruin the party for everyone.

 
 
 

9 Responses to HTML5, SEO and Microdata

  1. Peter says:

    Factual correction: I’m not aware that either Yahoo or Bing supports microdata. On the other hand, if you mark up your page using microformats or RDFa, you have a chance that it will work in all three search engines.

  2. kaidez says:

    @Peter: thanks for stopping by!!!!

    I did read about other search engines supporting microdata over at this link, but to be fair, this was from a company soliciting its services and not an SEO authority. I’ve since updated this post to say that I’m checking it out.

    I may email the folks over at HTML5 Doctor and ask for clarification on this, but I’ve emailed them a few times about some other microdata questions I had. I don’t want them to be annoyed by my constant emails so I’ll email them next week but do research on my own until I hear from them.

    Again, thanks for stopping by!

  3. I’m very interested to know if you find out if Bing and Yahoo support rich snippets via Microdata. I am headlining an event on HTML5 Microdata March 31 in San Francisco and I am still gathering all the information for the presentation. Let us know what you find. I will cite you in my presentation.

    If anyone is in San Francisco on March 31 please attend the HTML5 User Group Event. You can RSVP here: http://www.sfhtml5.org/events/15930619/

  4. kaidez says:

    Hey Michael! Thanks for stopping by!

    I just emailed HTML5 Doctor…hoping for a response shortly. Let’s stand by!!!

    Good luck with your event! Please let me know how it goes!

  5. kaidez says:

    Peter may be right, guys. After doing some veeeery thorough research, I cannot find a viable source to confirm that Bing and/or Yahoo support microdata.

    Along with emailing HTML5 Doctor and getting no response, I also asked this question on Twitter. I got a response from Klemen Slavic who created a jQuery script for testing and browsing microdata. He doesn’t think that Bing uses it, but says that it doesn’t hurt to use it, which is 100% true. Because of the jQuery code he wrote, I view him as a viable source.

    Although HTML5 Doctor didn’t get back to me, they do say here that the big three search engines (Google, Yahoo & Bing) haven’t provided a lot of information on how/if they use microdata. And while Google hasn’t told them everything about microdata, they have confirmed that they use it in this article.

    All this being said, I have to do a mea culpa and have put a strike-through on the Bing/Yahoo line-item in this article…sorry for the inconvenience.

    -k

  6. Thanks for your investigation. I will specify that Google is the only search engine who says they support Microdata.

    One thing to note is that I’ve heard Bing is stealing Google’s search results. Check out this article from Search Engine Land: http://searchengineland.com/google-bing-is-cheating-copying-our-search-results-62914

    Because Bing is Yahoo and Bing is stealing from Google, Bing might very well use Microdata through Google. That is all.

    I will try and livestream the event I give on Microdata, so I’ll keep you all informed.

  7. kaidez says:

    I saw that and wondered if Bing was tapping into microdata! And the irony of the whole situation was that Microsoft didn’t even really deny it!

    Thanks for the link!!! Let me know about the live stream!!

  8. Duccio says:

    Hi, can I use custom vocabulary? Like stuff that I have on my site and is different from : Person, Event, Organization, Product, Review, Review-aggregate, Breadcrumb, Offer and Offer-aggregate

  9. kaidez says:

    Hi Duccio. Thanks for stopping by!

    I think that you need to stick to the spec for custom vocabulary listed at Data-Vocabulary.org.
