Performing SEO on AngularJS Web Apps - Developer’s Guide
There are several well-worn ways to add full SEO support to an AngularJS app, but a JavaScript-aware approach is often the friendliest: it uses special URL routing and a headless browser setup to automatically retrieve the rendered HTML. Read on to learn more.
Google and other search engines are getting smarter by the day and are understanding web pages far better than they used to. Crawling JavaScript is no longer a major hurdle, which is why more and more JavaScript-heavy web apps are being indexed by search engines.
This is great news for webmasters; however, Google still advises caution, explaining:
"It is always a good idea to degrade your site gracefully, which will help users to enjoy your content even if they do not have their compatible JavaScript implementations. This will also help visitors with JavaScript disabled or off, as well as the search engines that can’t execute JavaScript."
A - AngularJS Apps Indexing:
Google does index your content automatically, but you can control how that content is rendered so that the Googlebot indexes it exactly the way you want. One way to accomplish this is by serving pre-rendered AngularJS content through a custom backend server (Section D shows how to do this with ExpressJS).
B - Modern Search Engines and Client-side Apps URL:
Here comes the concept of the hashbang. To ease the job of indexing web-app content, Google and other search engines support the hashbang URL format. Whenever a search engine finds a hashbang URL, i.e., a URL containing #!, it automatically converts it into a ?_escaped_fragment_= URL, where it expects to find fully rendered HTML content ready to be indexed.
Let me give you an example for better understanding:
Google will turn the hashbang URL from:
http://www.example.com/#!/page/content
Into the URL:
http://www.example.com/?_escaped_fragment_=/page/content
At the second URL, which is never shown to your website's visitors, the search engine expects to find non-JavaScript content that is easy to index.
Now, let's make your application intelligent enough that when a search engine bot queries the second URL, the server returns the corresponding HTML snapshot of the page. This requires some URL rewriting in your application's web server.
# Serve pre-rendered snapshots to crawlers requesting _escaped_fragment_ URLs
RewriteEngine On
# Match requests for the site root that carry an _escaped_fragment_ query string
RewriteCond %{REQUEST_URI} ^/$
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=/?(.*)$
# Serve the matching file from the snapshots directory instead
RewriteRule ^(.*)$ /snapshots/%1? [NC,L]
The rules above rewrite crawler requests to a dedicated snapshots directory, which contains the HTML snapshots of your corresponding app pages; for example, a request for http://www.example.com/?_escaped_fragment_=/page/content would be served the file /snapshots/page/content, so the snapshot files should be named to match. You can, of course, use a different directory and adjust the rule accordingly.
You may face another problem: instructing AngularJS to use hashbangs. By default, AngularJS churns out URLs with only # instead of #!. To get the latter, you just have to add the module below.
angular.module('HashBangURLs', []).config(['$locationProvider', function($locationProvider) {
  $locationProvider.hashPrefix('!');
}]);
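For context, hashbang URLs like the ones above usually come from your route definitions. Below is a minimal sketch of such routes; it assumes the ngRoute module is loaded, and the module, template, and controller names are placeholders rather than anything from the original setup:
angular.module('myApp', ['ngRoute', 'HashBangURLs'])
  .config(['$routeProvider', function($routeProvider) {
    // Each route becomes reachable at a #!-prefixed URL,
    // e.g. http://www.example.com/#!/page/content
    $routeProvider
      .when('/page/:content', {
        templateUrl: 'partials/page.html', // placeholder template
        controller: 'PageCtrl'             // placeholder controller
      })
      .otherwise({ redirectTo: '/' });
  }]);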
C - Creating HTML5 Routing Modes Instead of Hashbangs:
How can I forget to tell you that HTML5 is awesome? Alongside the hashbang technique, the combination of HTML5 and AngularJS gives us one more way to get search engines to request ?_escaped_fragment_= URLs, this time without hashbangs appearing in your URLs at all.
Step 1:
First, you have to tell Google that the page serves AJAX content and that the bot should visit the same URL using the _escaped_fragment_ syntax. Do this by including the following meta tag in your HTML code.
<meta name="fragment" content="!">
Step 2:
Now you will need to configure AngularJS to use HTML5 URLs wherever it handles URLs and routing. Add the following AngularJS module to your code to achieve this.
angular.module('HTML5ModeURLs', []).config(['$locationProvider', function($locationProvider) {
  $locationProvider.html5Mode(true);
}]);
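One caveat, depending on your AngularJS version: HTML5 mode expects a <base> element in the page head so that relative URLs resolve correctly, and your server must be configured to return the app's index page for deep links, since there is no longer a # keeping routing purely on the client. A minimal sketch, assuming the app is served from the site root:
<!-- Needed (or strongly recommended) when html5Mode is on; the href
     value assumes the app lives at the site root. -->
<base href="/">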
D - Handling SEO From the Server Side Using ExpressJS:
We all know the awesomeness of ExpressJS. We can also use ExpressJS for the server-side rewriting instead of Apache.
In order to make your ExpressJS app deliver static HTML, you will first need to set up middleware that looks for _escaped_fragment_ in incoming URLs. When it finds one, it serves the matching HTML snapshot instead.
// In our app.js configuration
app.use(function (req, res, next) {
  var fragment = req.query._escaped_fragment_;

  // If there is no _escaped_fragment_ parameter at all,
  // we're not serving a crawler, so carry on as normal.
  if (fragment === undefined) return next();

  // An empty fragment (or "/") means the crawler wants the index page.
  if (fragment === "" || fragment === "/") fragment = "/index.html";

  // Make sure the fragment starts with '/' ...
  if (fragment.charAt(0) !== "/") fragment = "/" + fragment;

  // ... and ends with '.html'.
  if (fragment.indexOf(".html") === -1) fragment += ".html";

  // Serve the static HTML snapshot, or a 404 if it does not exist.
  var file = __dirname + "/snapshots" + fragment;
  res.sendFile(file, function (err) {
    if (err) res.sendStatus(404);
  });
});
Again, the snapshots live in a top-level directory, here named /snapshots. The middleware above also accounts for the fact that a search engine bot may request a fragment without a leading "/" or a ".html" extension, and normalizes it so that the correct file is served to the bot.
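To sanity-check the middleware, you can simulate a crawler request yourself; the host, port, and page name below are placeholders for your own setup:
curl "http://localhost:3000/?_escaped_fragment_=/page1"
This should return the contents of /snapshots/page1.html, which is exactly what a search engine bot would be served.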
E - Taking Snapshots Using Node.JS
The most widely used tools for taking HTML snapshots of your web app are ZombieJS and PhantomJS.
Both drive a headless browser that can load the regular URL of a page in your web app, wait until its JavaScript has fully executed, grab the rendered HTML, and return the final markup in a temporary file.
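As a rough illustration, here is a minimal PhantomJS sketch; the URL and the fixed timeout are placeholder assumptions, and a real setup would want smarter "page is ready" detection:
// snapshot.js -- run with: phantomjs snapshot.js
var page = require('webpage').create();
page.open('http://www.example.com/#!/page1', function (status) {
  if (status !== 'success') {
    phantom.exit(1);
  }
  // Give AngularJS a moment to finish rendering before grabbing the markup.
  setTimeout(function () {
    console.log(page.content); // the fully rendered HTML
    phantom.exit();
  }, 2000);
});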
There are various online resources you can turn to for reference, so let's not go into too much detail here. However, I would like to highlight Prerender.io, an open-source tool that can take HTML snapshots for you. An even easier option is a tool called grunt-html-snapshot, which you will find as an npm package for Node.js.
Grunt is a Node.js task runner that you can use to churn out your own snapshots hassle-free. Let's have a look at the few steps needed to set up the Grunt tool and start generating HTML:
First, install Node.js. You can download it from http://nodejs.org; npm (the Node package manager) comes bundled with it. Mac and Windows users get click-and-install packages, while Ubuntu users can extract the tar.gz file and install it from the command terminal. Recent Ubuntu releases can also install it with the sudo apt-get install nodejs nodejs-dev npm command.
Open your command console and navigate to your project folder.
Run the following command to install the Grunt command-line interface globally: npm install -g grunt-cli
You also need a local copy of Grunt and the HTML-snapshot plugin in your project; install them with the command npm install grunt grunt-html-snapshot --save-dev
The next step is to create your own Gruntfile, Gruntfile.js, in the project root. The file will contain the following code:
module.exports = function (grunt) {
  grunt.loadNpmTasks('grunt-html-snapshot');

  grunt.initConfig({
    htmlSnapshot: {
      all: {
        options: {
          snapshotPath: '/project/snapshots/',
          sitePath: 'http://example.com/my-website/',
          urls: ['#!/page1', '#!/page2', '#!/page3'],
          // Returns 'index.html' if the URL is '/', otherwise flattens
          // the path by replacing slashes with a prefix.
          sanitize: function (requestUri) {
            if (/\/$/.test(requestUri)) {
              return 'index.html';
            } else {
              return requestUri.replace(/\//g, 'prefix-');
            }
          },
          // If you would rather not keep the script tags in the HTML
          // snapshots, set `removeScripts` to true. It is false by default.
          removeScripts: true
        }
      }
    }
  });

  grunt.registerTask('default', ['htmlSnapshot']);
};
Run the task using the command grunt htmlSnapshot.
F - Importance of Site Maps
Fine-tuning your sitemap is also essential for finer control over how search engine bots access your site. Whenever a search engine bot finds example.com/sitemap.xml, it follows the links listed in the sitemap before blindly following every link on the website. This is also the best way to get a page indexed that is not linked from any other page.
For AJAX content, in order to get your pages indexed properly, it is advisable to list all the pages/URLs that your app generates, even if your app is a single-page app. Below is a sample sitemap:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
...
<url>
<loc>http://www.yourwebsite.com/#!/page1</loc>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.yourwebsite.com/#!/page2</loc>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.yourwebsite.com/#!/page3</loc>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
...
</urlset>
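Once your sitemap is in place, it also helps to point crawlers at it from your robots.txt file (the host name below is a placeholder) and to submit it through Google's webmaster tooling:
Sitemap: http://www.yourwebsite.com/sitemap.xml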
G - AngularJS Awesomeness
Now you should be all set to go and experience the awesomeness of JS and its amazing features. This trend can only go upward from here. With AJAX content being indexed, you can do nearly anything. Happy coding!