Over a million developers have joined DZone.

A Typical HTML Page

DZone 's Guide to

A Typical HTML Page

· Web Dev Zone ·
Free Resource

When doing DOM-based performance testing you frequently need to pick a sample HTML document to work against. This raises the question: What is a good, representative, HTML document?

For many people a good document seems to file into one of two categories:

  1. A large web page with a lot of content. When we did our initial performance testing with jQuery we used Shakespeare's As You Like It (lots of elements, but a very flat structure) - Mootools uses an old draft of the W3C CSS3 Selectors recommendation. This has a lot of content, as well - thousands of elements with a medium depth structure.
  2. A popular web page. Common recommendations include 'yahoo.com' and 'microsoft.com'.

What's troubling is that there doesn't really seem to be any way to determine what a representative web page actually is. There's a couple things that I'd like to propose as being good indicators:

  • Standards-based semantic markup (including strong use of attributes: id, class, etc.).
  • Non-trivial file size and element count (testing the scalability of the performance).
  • Some use of tables and form elements (frequent inclusions in most web pages).
  • Strong use of CSS (frequently implies a deep element structure, in order to accommodate complex layouts).
  • Pervasive use of JavaScript. If JavaScript performance analysis is being done it's probably good to start with a page that already has a desire to use it.

I think, out of all of these aspects, one page stands out: CNN.com. Here's why:

  • It uses semantic HTML 4 markup with a lot of classnames and ids.
  • It's about 92kb in size and has 1648 elements in it.
  • It has some tables (seemingly for legacy material) and some forms (search forms, drop-downs).
  • Lots of CSS and JavaScript. Makes good use of Prototype which, at least, should show an desire in having a page with performant JavaScript.
  • It's, also, imperfect. I would consider this to be desirable. Rarely are pages completely without fault - and the heavy use of embedded JavaScript, ads, and non-semantic tables helps to add a stark dash of reality.

Of course, analysis could always be done against multiple pages and be viewed in aggregate but we need a place to start. So, what do you think; is CNN a good, representative, page for doing performance analysis against?


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}