
HTML5/AngularJS/Nginx Crawlable Application



Full Ajax

A lot of Java web applications and Java web frameworks use an architecture that does not allow separate UI and back-end development. Thus, there is no way to split your highly specialized front-end and back-end developers into a UI team and a back-end team. Regardless of their preferences, developers have to understand how both the presentation and the business logic work. It would be fine if a UI developer only had to know the data models (which connect application templates and controllers) and how to run the server. In particularly bad cases, however, a UI developer has to rebuild the entire application after changing a few lines of JavaScript, or learn the language of the JSP files just to correct some CSS. In addition, HTML is rendered on the server and transferred over the network, which affects the performance of both.

Nowadays, with modern browsers (HTML5, WebSocket, etc.), there is no longer any need to burden the back-end server with anything other than business logic. UI development can now be carried out on a simple Nginx server with API stubs instead of a real back-end server. Frameworks for documentation auto-generation (like JSONDoc) also help UI and back-end developers reduce the cost of communication. Transferring pure JSON data also significantly reduces the load on back-end servers. Finally, the compressed JavaScript code of the UI client can be kept in the browser's cache (reducing the load on the network and Nginx).

But while modern browsers can easily handle this increased responsibility, search engines need a little help.

To properly index AngularJS applications we need the following things:

  • sitemap.xml
  • Angular's HTML5 Mode
  • Nginx
  • An old-fashioned back-end server


HTML5 mode turns AngularJS routes like example.com/#!/Home into routes like example.com/home (the href attributes of links must also be written without the hashbang).

Activate the html5Mode in AngularJS:
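
A minimal sketch of this step, assuming an application module named `app` (a hypothetical name). `$locationProvider.html5Mode` and `hashPrefix` are AngularJS's own configuration methods; the guard around `angular` is only there so that the file can be loaded outside a browser.

```javascript
// Config function: enable HTML5 URLs and keep "!" as the hash prefix,
// so that old-style #!/ links keep working as a fallback.
function configureLocation($locationProvider) {
  $locationProvider.html5Mode(true).hashPrefix('!');
}

// 'app' is a hypothetical module name; it must already be defined elsewhere.
if (typeof angular !== 'undefined') {
  angular.module('app').config(['$locationProvider', configureLocation]);
}
```

In AngularJS 1.3 and later, HTML5 mode also expects a `<base href="/">` tag in the page head unless `requireBase` is switched off.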


The hashbang prefix is kept for compatibility with browsers that do not support HTML5 URLs.

Now we need to make our Nginx server route requests for example.com/home to the application's main index.html file. To do this, we add the following directive to the config file:

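location / {
    # Disable caching of the responses served from this location
    expires -1;
    add_header Pragma "no-cache";
    add_header Cache-Control "no-store, no-cache, must-revalidate, post-check=0, pre-check=0";
    root /var/web;
    # Serve the file or directory if it exists, otherwise fall back to index.html
    try_files $uri $uri/ /index.html =404;
}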

The directive try_files $uri $uri/ /index.html =404; means that all non-existent URLs will now be forwarded to the index.html file, but without rewriting the URL in the browser address bar. This solution already works (and is also compatible with old-format hashbang links), so if your application does not need to be indexed by search engines, you can stop here.
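
To make the fallback order concrete, here is a small model of how that try_files line resolves a request. It only illustrates the lookup order and is not how Nginx is implemented; `resolveTryFiles` and the in-memory sets of files and directories are hypothetical.

```javascript
// Models: try_files $uri $uri/ /index.html =404;
// files: Set of existing file paths under the root
// dirs:  Set of existing directory paths (each ending in "/")
function resolveTryFiles(uri, files, dirs) {
  if (files.has(uri)) return { status: 200, serve: uri };       // $uri
  const dir = uri.endsWith('/') ? uri : uri + '/';
  if (dirs.has(dir)) return { status: 200, serve: dir };        // $uri/
  if (files.has('/index.html')) {
    return { status: 200, serve: '/index.html' };               // /index.html
  }
  return { status: 404, serve: null };                          // =404
}

const files = new Set(['/index.html', '/js/app.js']);
const dirs = new Set(['/', '/js/']);
console.log(resolveTryFiles('/js/app.js', files, dirs).serve); // /js/app.js
console.log(resolveTryFiles('/home', files, dirs).serve);      // /index.html
```

An AngularJS route such as /home has no file behind it, so it falls through to index.html, and the client-side router takes over from there.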


Now, we will help search engines process our application correctly. To do this, we will prepare hints for search bots and generate snapshot pages. To start, we tell the bot which pages we want indexed with a sitemap.xml file. The simplest version of the file is listed below (a link to the page and the date of the last update; the more detailed format is described at www.sitemaps.org).

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>
  <url>
    <loc>http://example.com/home</loc>
    <!-- placeholder: put the date of the last update here -->
    <lastmod>YYYY-MM-DD</lastmod>
  </url>
</urlset>

Great! The search engine will now request those links from our website and receive index.html as the content. But bots do not execute JavaScript, so we have to tell the bot that there is real content behind index.html. To do this, add the following to the <head> of the page:

<meta name="fragment" content="!" />

This gives the bot an opportunity to take the next step. Seeing fragment=!, the bot will request the page again, but with an ?_escaped_fragment_= parameter appended to the URL. Nginx will forward requests that carry this parameter to a different location:

# Requests from bots carry the _escaped_fragment_ parameter
if ($args ~ "_escaped_fragment_=(.*)") {
    rewrite ^ /snapshot${uri};
}

# "api" is expected to be an upstream (or resolvable host) defined elsewhere
location /snapshot {
    proxy_pass http://api;
    proxy_connect_timeout 60s;
}

That's it. Now all requests from bots will receive snapshot responses from the back-end server.

Real URL | Bot URL | Backend URL
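
The three URL forms named in the header above can be traced with a short sketch. It mirrors the rewrite rule and proxy_pass from the Nginx config, under the assumption that the back end is reachable as http://api and that query arguments are passed through unchanged; `botUrl` and `backendUrl` are hypothetical helper names.

```javascript
// Bot URL: the crawler appends the _escaped_fragment_ parameter to the real URL.
function botUrl(realUrl) {
  const sep = realUrl.includes('?') ? '&' : '?';
  return realUrl + sep + '_escaped_fragment_=';
}

// Backend URL: mirrors `rewrite ^ /snapshot${uri};` followed by `proxy_pass http://api;`.
function backendUrl(bot) {
  const u = new URL(bot);
  // Requests without the parameter are ordinary visitors and get index.html instead.
  if (!u.searchParams.has('_escaped_fragment_')) return null;
  return 'http://api/snapshot' + u.pathname + u.search;
}

const bot = botUrl('http://example.com/home');
console.log(bot);             // http://example.com/home?_escaped_fragment_=
console.log(backendUrl(bot)); // http://api/snapshot/home?_escaped_fragment_=
```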







To build a snapshot I use AngularJS views and the Thymeleaf view framework. Since both Thymeleaf and AngularJS work through HTML5 tag attributes, you can even use a single template file, although I prefer not to mix them. A line of HTML would look like this: <div ng-bind="text" th:utext="${text}"></div>. A single file, cool!

Done. Now the search bot will request the necessary pages and index them properly. For now, "Fetch as bot" in the Google and Bing webmaster tools does not support fragment=!, so you can't immediately check whether everything is configured correctly; you have to wait until a bot visits your app.



Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
