Reading Microdata Elements in Chrome

Before going any further, please note this blog post definitely falls into the "questionable" category. Please read the following with a large grain of salt (and a cold beer at your side). I've read a few articles recently on microdata. Today I read another good one here: Make Your Page Consumable by Robots and Humans Alike With Microdata.

The concept is rather simple. By embedding a bit of metadata into your markup, you give your pages machine-readable context. This is a bit like data attributes, but in my mind a bit different. Data attributes are, in my opinion, useful for data in a self-contained manner. That is, you mark up your pages so your own code (JavaScript or CSS) can do something with it. Microdata is for external consumers. Mixed with external schemas this could be pretty powerful. Apparently Google is already using this, so it has some SEO value as well.

I got even more interested when I saw there was a DOM API for it: document.getItems(). This would, supposedly, return all the microdata items in your current document. Unfortunately, this failed in Chrome. Surprisingly, CanIUse.com failed to report on the API and I had to dig a bit more to find that - apparently - only Firefox and Opera support this API at the moment.

I wanted to build something that would a) notice if microdata was in use and b) report how it was used. I knew I could get, and iterate over, all the elements in the DOM, but I assumed that would be rather wasteful. Then I discovered the document.evaluate function. This allows you to use XPath to search the DOM. So with that at my disposal, I first created a function that would check for the existence of any microdata in use:

function hasItems() {
    // count() returns how many elements under <body> carry an itemscope attribute
    return document.evaluate("count(/html/body//*[@itemscope])", document, null, XPathResult.NUMBER_TYPE, null).numberValue > 0;
}

If you didn't read the article I linked to before, an itemscope attribute "wraps" the DOM elements that are considered one logical unit of microdata. My XPath simply looks for this attribute and runs a count() operation to get the number of elements that match.
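
As an aside, the same existence check can probably be done without XPath by using a CSS attribute selector. The following is only a minimal sketch under that assumption: the name hasItemsViaSelector and its root parameter are made up for illustration and are not part of the code in this post.

```javascript
// Hypothetical helper (not from the original post): same idea as hasItems(),
// but using a CSS attribute selector instead of an XPath count().
// "root" is assumed to be a Document or Element, i.e. anything that
// implements querySelector.
function hasItemsViaSelector(root) {
    // querySelector returns the first matching descendant, or null if none match
    return root.querySelector("[itemscope]") !== null;
}
```

In a browser you would call it as hasItemsViaSelector(document.body), which limits the search to descendants of the body just like the XPath expression does. Passing the root in also makes the function easy to try against a stub object outside the browser.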

I then wrote a function that would return these items. For the most part, this is a simple matter of iterating over XPath results and using DOM functions to get values, but you have to use a bit of logic based on what type of DOM node you're dealing with. For example, if an anchor tag is used for a property, then the microdata value comes from the href attribute. For most other things you simply use the inner text. Here's my getItems function (and yes, that name is too generic):

function getItems() {
    // Find every element that starts a microdata item
    var items = document.evaluate("/html/body//*[@itemscope]", document, null, XPathResult.ANY_TYPE, null);
    var results = [];
    var result = items.iterateNext();
    while(result) {
        // Find every property declared inside this item
        var kids = document.evaluate(".//*[@itemprop]", result, null, XPathResult.ANY_TYPE, null);
        var item = {};
        var kidprop = kids.iterateNext();
        while(kidprop) {
            var attr = kidprop.attributes.getNamedItem("itemprop");
            //To get the value, it depends on the type
            var value="";
            switch(kidprop.nodeName) {
                case "AREA":
                case "LINK":
                case "A":
                    value = kidprop.href;
                    break;

                case "AUDIO":
                case "EMBED":
                case "IFRAME":
                case "IMG":
                case "SOURCE":
                case "VIDEO":
                    value = kidprop.src;
                    break;

                default:
                    value = kidprop.innerText;
            }
            item[attr.nodeValue] = value;
            kidprop = kids.iterateNext();
        }
        // Store the finished item before moving on to the next itemscope
        results.push(item);
        result = items.iterateNext();
    }
    return results;
}
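
The "which attribute holds the value" rule is the only real logic in there, so it can be pulled out and looked at on its own. Here is a rough sketch of that idea, assuming the same tag lists as the switch statement above. The names HREF_TAGS, SRC_TAGS and propertyValue are invented for this example.

```javascript
// Hypothetical lookup tables; the tag lists mirror the switch in getItems().
var HREF_TAGS = ["A", "AREA", "LINK"];
var SRC_TAGS = ["AUDIO", "EMBED", "IFRAME", "IMG", "SOURCE", "VIDEO"];

// Returns the microdata value for one element that carries an itemprop.
// "el" is assumed to expose nodeName plus href, src or innerText,
// as DOM elements do in the browser.
function propertyValue(el) {
    if (HREF_TAGS.indexOf(el.nodeName) !== -1) {
        return el.href;      // link-like elements: the value is the URL
    }
    if (SRC_TAGS.indexOf(el.nodeName) !== -1) {
        return el.src;       // media-like elements: the value is the source URL
    }
    return el.innerText;     // everything else: the visible text
}
```

Because the function only reads a few properties, it can be exercised with plain objects, for example propertyValue({ nodeName: "SPAN", innerText: "Fred" }), which makes the rule easier to check than when it is buried inside the XPath loop.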

I used some source HTML based on the article I linked to earlier:

<ul>
<li itemscope>
    <ul>
        <li>Name: <span style="foo" itemprop="name2">Fred</span></li>
        <li>Name: <span itemprop="name">Fred</span></li>
        <li>Phone: <span itemprop="telephone">210-555-5555</span></li>
        <li>Email: <span itemprop="email">thebuffalo@rockandstone.com</span></li>
        <li>Site: <a href="foo.html" itemprop="url">My site</a></li>
    </ul>
</li>
<li itemscope>
    <ul>
        <li>Name: <span itemprop="name">Wilma</span></li>
        <li>Phone: <span itemprop="telephone">210-555-7777</span></li>
        <li>Email: <span itemprop="email">thewife@rockandstone.com</span></li>
    </ul>
</li>
<li itemscope>
    <ul>
        <li>Name: <span itemprop="name">Betty</span></li>
        <li>Phone: <span itemprop="telephone">210-555-8888</span></li>
        <li>Email: <span itemprop="email">theneighbour@rockandstone.com</span></li>
    </ul>
</li>
<li itemscope>
    <ul>
        <li>Name: <span itemprop="name">Barny</span></li>
        <li>Phone: <span itemprop="telephone">210-555-0000</span></li>
        <li>Email: <span itemprop="email">thebestfriend@rockandstone.com</span></li>
    </ul>
</li>
</ul>

When I execute my JavaScript against this, I get:

Useful? Not sure yet. I assume, eventually, Chrome will get the native API anyway. (Although in Firefox it returns the node items, not a nice array like I've got. Unless I'm using it wrong, it looks like there may still be a need for a utility function.)

Published at DZone with permission of Raymond Camden, DZone MVB. See the original article here.
