The Web's Cruft Problem
Follow the money and you'll understand why so many of today's websites are full of "cruft," a.k.a. crap. It's slowing our mobile experiences way down.
The other day I came across this tweet from Kyle Simpson:
Is there a term (other than "privilege") for hating the web more and more as a user but liking it more as a developer? asking for a friend.
— getify (@getify) June 17, 2015
I don’t have a term for Kyle, but I completely agree with the sentiment, and anecdotally I hear the same from the greater web community. Between modals, app-install prompts, mobile web fails, ads, mobile redirects, EU cookie prompts, and the like, web developers (the people who collectively create the web) increasingly hate actually using the web.
In this article I’m going to argue that the reason for this hatred, and the biggest problem the web faces today, is “cruft”: a term I use to refer collectively to all the crap the average web page includes that does not contribute to what the user is actually trying to accomplish (read an article, buy a product, and so forth). I’ll also argue that the cruft problem is largely caused by a greater monetization problem that the web faces today.
To put this all in context let’s look at an example.
Original author: TJ VanToll
Cruft in Action
The other day I saw this article on psychopaths appear on my Twitter stream. I opened up the article in Chrome for iOS and here’s what I saw:
This article makes for a good showcase of web cruft. All I wanted to do was read about psychopaths, as one does, but before reading I had to sift through a bunch of junk that I don’t care about (social buttons, the temperature, and a terms-of-service modal), all for an article that’s about 2,000 words. I can’t even see the start of the article on my oversized iPhone 6+.
Loading this article took 200+ HTTP requests and used ~2MB of data. The article took about 3 seconds to load on my WiFi, and WebPageTest says it would take about 13 seconds to load on an average mobile network.
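As a rough back-of-the-envelope check, the figures above can be turned into an effective transfer rate. This is only a sketch: the `effective_mbit_per_s` helper is a name made up for illustration, the inputs are the approximate numbers quoted above, and real load time also includes latency and rendering, so the result is an average rather than the true speed of either network.

```python
def effective_mbit_per_s(megabytes, seconds):
    """Average transfer rate in megabits per second (1 MB = 8 megabits)."""
    return megabytes * 8 / seconds


# Approximate figures quoted above: ~2 MB page, ~3 s on WiFi,
# ~13 s on an average mobile network.
page_mb = 2.0
for label, load_seconds in [("WiFi", 3.0), ("mobile", 13.0)]:
    rate = effective_mbit_per_s(page_mb, load_seconds)
    print(f"{label}: about {rate:.1f} Mbit/s effective")
```

By this estimate the same page moves at roughly a quarter of the effective rate on mobile, which is why every extra kilobyte of cruft hurts more there.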
I don’t bring up this example to single out CNN, because, as sad as this is to say, this article is now representative of the average web experience. According to the HTTP Archive, the average web page surpassed 2MB this May, and is now at 2.08MB. It’s not hard to find far worse examples out there.
Why the Cruft?
I’m not the first to talk about the web’s cruft problem. Peter-Paul Koch recently described it as such:
“[W]eb versions of the articles have an extra layer of cruft attached to them, and that’s what makes the web slow to load. The speed problem is not inherent to the web; it’s a consequence of what passes for modern web development. Remove the cruft and we can compete again.”
And later he poses the question I’d like to tackle:
“The interesting question is: why all the cruft?”
PPK goes on to argue that the cruft is caused by tools, or more specifically libraries, frameworks, and the like that are increasingly used on the web. I agree with this to a point, as unnecessary tool usage can absolutely have adverse effects on page weight and loading times, but I believe there’s a more systematic problem at play here.
Let’s dive deeper into the CNN article. Among the 200+ HTTP requests the page makes are calls to 25 different domains.
Yes, you read that correctly. TWENTY…FIVE. Among them are a few that are clearly ad related (e.g., ad.doubleclick.net, pixel.moatads.com), a few that serve some analytics function, and many whose names are intentionally obfuscated to confuse us.
To me, you could phrase the “why all the cruft?” question a different way: since minimizing HTTP requests is one of the best-known mobile web performance practices, why do so many mobile websites flagrantly violate this rule?
You can certainly argue that part of the reason is tools, as using drop-in ads, social media widgets, and such will generate more HTTP requests than something hand-crafted. However, I believe the real answer has to do with money.
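For anyone who wants to run the same tally on another page, here is a minimal sketch of counting requests and distinct domains from a list of request URLs (for example, copied out of a browser's network panel). The `summarize_requests` helper and the sample URLs are hypothetical and made up for illustration; only the two ad hostnames come from the article.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_requests(urls):
    """Return (number of requests with a hostname, Counter of requests per hostname)."""
    hosts = Counter()
    for url in urls:
        host = urlsplit(url).hostname
        if host:  # skip entries without a hostname, such as data: URIs
            hosts[host] += 1
    return sum(hosts.values()), hosts


# Hypothetical sample; a real page load would list every request it made.
sample_urls = [
    "https://www.example.com/article.html",
    "https://www.example.com/styles.css",
    "https://ad.doubleclick.net/ad.js",
    "https://pixel.moatads.com/pixel.gif",
    "https://cdn.example.net/widget.js",
]

total, per_host = summarize_requests(sample_urls)
print(f"{total} requests across {len(per_host)} domains")
for host, count in per_host.most_common():
    print(f"  {host}: {count}")
```

Grouping by hostname makes it easy to see how much of a page's traffic goes to third parties rather than to the site you actually asked for.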
Follow the Money
Why does CNN show ads? To make money. Why does CNN include tracking services? To learn more about the reader, to show more targeted ads, to make more money. Why does CNN use social media buttons? To get people to share the article, to get more page views, to get more ad views, to make more money. Why does CNN include a weather widget? OK, I don’t get that one; they should really get rid of that.
Again, I don’t mean to call out CNN as the “bad example,” but rather use them to show a specific example of a model that has become pervasive for content on the web.
My friend Brian Rinaldi recently wrote that the content model of the web is broken, arguing that we as web users thoroughly devalue content and writers. Because we refuse to pay for content, he says, content producers must either resort to increasingly drastic tactics to make money off the content they produce or have some ulterior motive that makes the content production possible.
Paywalls have failed (mostly), so we’re left with a bunch of sites that use an eclectic set of ads, tracking scripts, modals, and such, all in an attempt to scrape together enough revenue to fund the content that lives behind the cruft.
What’s Being Done?
Many people are attacking the cruft problem, but interestingly the innovation is mostly coming from outside of the browser world.
Flipboard was perhaps the first big successful attempt at fixing cruft on the web. Flipboard essentially takes content from around the web, provides excerpts, and links off to the corresponding articles for the full content. This provides a rather nice browsing experience, where you don’t have to load a full article, and all the cruft that comes with it, just to get a quick preview of what the article looks like. For instance here’s a preview of a Fenway Park article shown in the Flipboard iOS app:
But what’s more interesting is that Flipboard has gone beyond this content preview role, and now partners with certain content providers to display articles directly within the Flipboard app, forgoing the browser entirely. As an example, here’s the same psychopath article loaded in the Flipboard iOS app:
Here I see the exact same content, although unlike the browser version, the Flipboard version of this article is decidedly cruft-free. Also unlike the browser version, this article loads nearly instantaneously, and a quick look at the network traffic shows why:
Whereas the browser version used 200+ HTTP requests from 25 domains, the Flipboard version uses 4 HTTP requests from 2 domains: cdn.flipboard.com and ad.flipboard.com.
As a company Flipboard has been successful, boasting 2014 numbers of 100 million+ active readers and a near-billion-dollar valuation, which shows there’s clearly demand for a better means of reading content on mobile devices than what the browser currently provides. Flipboard’s success hasn’t gone unnoticed, and its business model has been more or less copied by a few others.
This May Facebook launched Instant Articles. Here’s how they introduce their service:
“As more people get their news on mobile devices, we want to make the experience faster and richer on Facebook. People share a lot of articles on Facebook, particularly on our mobile app. To date, however, these stories take an average of eight seconds to load, by far the slowest single content type on Facebook. Instant Articles makes the reading experience as much as ten times faster than standard mobile web articles.”
Sounds a lot like Flipboard, doesn’t it? If you’re a publisher and you opt in, you let Facebook control the distribution of your content, in return for a far more performant experience for your readers, and presumably shared ad revenue of some sort.
Let’s look at an example to see what this looks like in action. BuzzFeed participates in Instant Articles, and they recently published an article on 13 steps to instantly improve your day. As you might expect, the mobile web version of this article is laden with cruft, including two different plugs for their iOS app, social buttons, ads, and more:
Like the CNN article we looked at earlier, BuzzFeed’s site is littered with ad scripts, tracking scripts, and the like. The final total comes out to 200+ HTTP requests and about 4MB worth of data:
Let’s compare that to the same experience on Facebook’s Instant Articles, shown below:
As with the previous Flipboard example, although the content is the same, this BuzzFeed article is remarkably cruft-free. It also loads essentially instantaneously, and in my testing loading this article required just five network requests. (Facebook also appears to be employing some sort of prefetching algorithm to load parts or all of articles before you click on them, as the loading really does feel “instant.”)
There’s an argument to be had over whether this is a good business model for publishers, and it’s one that’s probably taking place in numerous conference rooms right now. But it’s hard to deny that Flipboard and Instant Articles provide a really elegant reading experience on mobile devices — something that mobile browsers have struggled with.
Flipboard and Facebook aren’t the only players in this game, as perhaps the biggest player in the tech world, Apple, announced that they’re entering this space with Apple News: a news app that uses essentially the same business model as Flipboard and Instant Articles.
Is this the End of the Web?
No. It’s important to remember that regardless of how good a user experience Flipboard, Facebook, and Apple provide, they’re still proprietary solutions, each controlled by a single company. These companies control how the content is used, and how people are able to access it. If Apple wants to lock down their content to Apple-created devices, they can do that.
Plus these companies all require some level of partnership with content providers to appear within their ecosystem, meaning these apps will only ever have a tiny fraction of the content the web provides. The ease of publishing and sharing on the web gives it a massive advantage over these proprietary solutions.
That being said, it’s hard to argue that the browser provides a better reading experience than what Flipboard and Facebook Instant Articles provide in their native apps. John Gruber might have said it best, in an article written in response to Facebook’s Instant Articles:
“I’m intrigued by the emphasis on speed. Not only is native mobile code winning for app development, but with things like Instant Articles, native is making the browser-based web look like a relic even just for publishing articles. If I’m right about that, it might pose a problem even for my overwhelmingly-text work at Daring Fireball. Daring Fireball pages load fast, but the pages I link to often don’t. I worry that the inherent slowness of the web and ill-considered trend toward over-produced web design is going to start hurting traffic to DF.”
What’s Being Done on the Web?
There are a few developments that should help the current situation for the web.
The recently published HTTP/2 specification promises to substantially decrease latency on the web by serving compressed HTTP headers and loading resources in parallel over a single TCP connection. Once implemented in browsers, HTTP/2 should substantially lower the loading times of sites that rely on a large number of HTTP requests, such as the ones shown in this article.
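To get a feel for why multiplexing helps request-heavy pages, here is a deliberately simplified sketch. It assumes (this is not from the article) that a browser opens at most 6 parallel HTTP/1.1 connections per host, a common default, and that each connection handles one request at a time. The `request_rounds` helper and the figure of 40 requests are hypothetical. The model ignores bandwidth, connection setup, and any server-side limit on concurrent streams, so treat it as an illustration rather than a benchmark.

```python
import math


def request_rounds(num_requests, parallel_streams):
    """Sequential "rounds" needed if only parallel_streams requests run at once."""
    if num_requests <= 0:
        return 0
    return math.ceil(num_requests / parallel_streams)


# Assumptions (not from the article): 40 requests to a single host, and a
# limit of 6 parallel HTTP/1.1 connections per host.
requests_to_host = 40
http1_rounds = request_rounds(requests_to_host, parallel_streams=6)

# Idealized HTTP/2: every request is multiplexed over one connection at once.
http2_rounds = request_rounds(requests_to_host, parallel_streams=requests_to_host)

print(f"HTTP/1.1: {http1_rounds} rounds, HTTP/2 (idealized): {http2_rounds} round")
```

Note that the saving applies per connection, and so per host: a page that spreads its requests across 25 domains still needs a separate connection to each one.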
Last year, Google announced that they’ll penalize sites that aren’t mobile-friendly in their search results, as well as display a little “mobile-friendly” text next to sites that meet their guidelines in their search results.
This is a small tweak, but one that incentivizes developers and publishers of content sites, who are often highly dependent on search-engine traffic, to keep their cruft to a minimum. And early research suggests it is having a noticeable impact on search results.
Opera Mini has long been successful acting as a proxy browser, caching resources on its servers to reduce the amount of data that needs to be sent to each individual device. Chrome for Android and iOS now includes a similar option, and although it’s off by default, it’s still an option users have to help speed up the web.
For many, ad blockers are the primary tool for attacking the web’s cruft on desktop devices, but they have yet to make their way into users’ mobile workflows, largely because mobile OS vendors have actively prevented them.
Google has a pretty good reason to actively discourage ad blockers, as they derive something like 80%–90% of their revenue from online advertising. Google made news back in 2013 for removing Adblock Plus from Google Play.
Historically Apple has also prevented ad blockers, but that’s about to change. Apple recently made news by announcing that they’ll be opening an ad-filtering API to be used in Safari as of iOS 9. Regardless of whether this is a potential attack on Google, or a goodwill gesture towards Safari users, the end effect is reduced cruft for iOS users who opt in to ad-blocking apps.
Despite these various cruft-reducing features, I still believe this is an area ripe for innovation in the browser space. Why is it that as a publisher of articles your only real monetization option is injecting bulky ads that produce a worse experience for everyone?
I don’t have an answer here, and far smarter people than me have spent years trying to solve this problem. But still, it seems crazy that this is the best we can do. I still believe in the open web, and it’s not going anywhere anytime soon, but we really need to start thinking of ways we can start to clean this mess up.