Over a million developers have joined DZone.

Capturing a Web Page Without Stylesheets

DZone's Guide to

Capturing a Web Page Without Stylesheets

· Web Dev Zone
Free Resource

Learn how to build modern digital experience apps with Crafter CMS. Download this eBook now. Brought to you in partnership with Crafter Software

It is amazing to live in an environment where the Internet connection is ubiquitous and fast. But in case the tube is having a problem and the bits from the web server are broken into random pieces, how does the web site look like? If the content degrades gracefully, the lack of style sheets may reduce the attractiveness of the page but it should not significantly hamper the experience. Fortunately, there is a way to automatically check the appearance of a web page under that circumstance.

Some time ago, I have demonstrated the use of PhantomJS, headless WebKit, to capture web pages programmatically. The example was also extended to capture just a particular portion of the page via clipping. For CSS-less capture, we just need to extend it with the new feature in PhantomJS 1.9 (as implemented by Vitaliy Slobodin): the ability to abort network requests.

There is a example loadurlwithoutcss.js which demonstrates this feature. In fact, combining this idea with the previous BBC News site capture, we can come up with the following screenshots. The left side shows the normal page (see my previous blog post on web clipping) while the right side demonstrates what happens when all the CSS files are not loaded at all.


The script which produces the above image is as follows:

var page = require('webpage').create();
page.settings.userAgent = 'WebKit/534.46 Mobile/9A405 Safari/7534.48.3';
page.settings.viewportSize = { width: 400, height: 600 };
page.onResourceRequested = function(requestData, request) {
    if ((/http:\/\/.+?\.css$/gi).test(requestData['url'])) {
        console.log('Skipping', requestData['url']);
page.open('http://m.bbc.co.uk/news/health', function (status) {
    if (status !== 'success') {
        console.log('Unable to load BBC!');
    } else {
        window.setTimeout(function () {
            page.clipRect = { left: 0, top: 0, width: 400, height: 600 };
        }, 1000);

It is pretty similar to its previous version. The new addition is a handler for onResourceRequested where we detect the URL for a style sheet and abort its loading. If the script is executed, it will display the message:

Skipping http://static.bbci.co.uk/frameworks/barlesque/2.45.9/mobile/3.5/style/main.css
Skipping http://static.bbci.co.uk/bbcdotcom/0.3.184/style/mobile/bbccom.css
Skipping http://static.bbci.co.uk/news/1.7.1-259/stylesheets/core.css
Skipping http://static.bbci.co.uk/news/1.7.1-259/stylesheets/compact.css

which indicates that these 4 (four) style sheets won’t be part of the rendered output.

The entire process is rather straightforward. Because PhantomJS is cloud-ready, you can even have it running on an instance of Amazon EC2. It should not be too difficult to include this type of spartan rendering of your web site as another layer in the defensive development workflow.

What do you plan to de-CSS-ify today?

Crafter is a modern CMS platform for building modern websites and content-rich digital experiences. Download this eBook now. Brought to you in partnership with Crafter Software.


Published at DZone with permission of Ariya Hidayat, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.


Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.


{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}