Richer presentation. Deeper data in search results.
Search UIs are getting richer and more interesting by the day. Google page snapshots, animated disclosures of richer data on mouse over, and endless pagination are great examples. Often this type of functionality has a dark side. As each of these features is layered on and integrated with our HTML, it can begin to feel like death by a thousand paper cuts.
Once we have separated presentation from data, we can begin to apply tools like design patterns (Visitor comes to mind) that make our code easier to read and modify.
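To make the Visitor idea concrete, here is a minimal sketch in Python. The class names (Result, Suggestion, HtmlRenderer, TextRenderer) are made up for illustration and are not part of Solr or any library. The point is only that the search data knows nothing about markup, and each renderer is swappable.

```python
from html import escape


class Result:
    """A plain search hit: data only, no markup. (Hypothetical class.)"""

    def __init__(self, title, url, snippet):
        self.title = title
        self.url = url
        self.snippet = snippet

    def accept(self, visitor):
        # Double dispatch: the hit hands itself to whichever renderer visits.
        return visitor.visit_result(self)


class Suggestion:
    """A 'did you mean' entry. (Hypothetical class.)"""

    def __init__(self, text):
        self.text = text

    def accept(self, visitor):
        return visitor.visit_suggestion(self)


class HtmlRenderer:
    """Visitor that turns each item into an escaped HTML fragment."""

    def visit_result(self, result):
        return '<li><a href="{}">{}</a><p>{}</p></li>'.format(
            escape(result.url), escape(result.title), escape(result.snippet)
        )

    def visit_suggestion(self, suggestion):
        return '<p class="suggestion">Did you mean <em>{}</em>?</p>'.format(
            escape(suggestion.text)
        )


class TextRenderer:
    """Visitor that renders the same items as plain text."""

    def visit_result(self, result):
        return "{} ({})\n  {}".format(result.title, result.url, result.snippet)

    def visit_suggestion(self, suggestion):
        return 'Did you mean "{}"?'.format(suggestion.text)


def render(items, visitor):
    """Render every item with the given visitor, preserving order."""
    return [item.accept(visitor) for item in items]
```

Adding a new presentation (say, a JSON renderer) means writing one more visitor class; Result and Suggestion stay untouched.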
Pass the buck on rendering
Cloud, meet Ocean.
On-demand pages are expensive to render, and search is solidly in this category. Popular searches can be cached at the HTML level, but this is only effective for very simple search behaviors and comes with its own set of problems. Cache expiry for content and what to do with Solr caching are two of these complications.
Even more server capacity can be reclaimed by letting the app server be a proxy for JSON results right out of Solr. This is an excellent way to get the most out of Solr’s own query caching.
Need more speed? Expiry headers and proxy caches like Varnish or Squid can still be used to further speed up the resulting JSON or XML requests.
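Here is a rough sketch of what that thin proxy could look like as a Python WSGI app. It is an illustration under assumptions, not a drop-in implementation: fetch is a hypothetical callable you supply that takes the raw query string and returns Solr's JSON bytes, and the five minute lifetime is an arbitrary number. The app does no rendering at all. It hands the JSON through and attaches expiry headers so a cache like Varnish or Squid can answer repeats.

```python
import time
from email.utils import formatdate


def cache_headers(max_age_seconds, now=None):
    """Build response headers that let browsers and proxy caches reuse the JSON."""
    now = time.time() if now is None else now
    return [
        ("Content-Type", "application/json; charset=utf-8"),
        ("Cache-Control", "public, max-age={}".format(max_age_seconds)),
        # HTTP dates are always expressed in GMT.
        ("Expires", formatdate(now + max_age_seconds, usegmt=True)),
    ]


def make_app(fetch, max_age_seconds=300):
    """Return a WSGI app that relays Solr's JSON untouched.

    `fetch` is an assumption of this sketch: any callable that accepts the
    query string and returns the response body as bytes.
    """

    def app(environ, start_response):
        body = fetch(environ.get("QUERY_STRING", ""))
        headers = cache_headers(max_age_seconds)
        headers.append(("Content-Length", str(len(body))))
        start_response("200 OK", headers)
        return [body]

    return app
```

In a real deployment fetch would make the HTTP call to Solr; keeping it injectable also means the proxy can be exercised without a running search server.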
Decoupling for Scaling
Better yet, what if we could scale search independently of content? Using techniques like Cross-Origin Resource Sharing (CORS) or JSONP, we can remove our app servers from the stack and go directly to the Solr servers.
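For the JSONP route, Solr's JSON response writer accepts a json.wrf parameter that wraps the response in a function call. Below is a small sketch of building such a URL; the host name and the callback name are placeholders I made up, and the page itself would load the resulting URL from a script tag.

```python
from urllib.parse import urlencode


def jsonp_url(base, query, callback, rows=10):
    """Build a Solr select URL whose JSON response is wrapped in `callback`."""
    params = [
        ("q", query),
        ("rows", rows),
        ("wt", "json"),          # ask for the JSON response writer
        ("json.wrf", callback),  # name of the wrapper (callback) function
    ]
    return base + "?" + urlencode(params)


# Placeholder host and callback name, for illustration only.
url = jsonp_url("http://search.example.com/solr/select", "ipod nano", "handleResults")
```

With CORS the URL is the same minus json.wrf; the difference is that the server in front of Solr must send the appropriate Access-Control-Allow-Origin header.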
Decoupling gives us much more freedom to make good scaling choices in server configuration. We can optimize search servers around serving small JSON documents and our app servers around supporting more traditional web content.
It’s also profoundly great for testing! Expectations of the search servers can be verified independently of the app server.
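As a sketch of what such a check might look like, the function below inspects a raw Solr JSON response without involving the app server at all. The responseHeader, response, numFound, and docs keys follow Solr's standard JSON output; the function name and the list of required fields are assumptions for this example.

```python
import json


def check_search_response(raw, required_fields):
    """Return a list of problems found in a raw Solr JSON response."""
    data = json.loads(raw)
    problems = []

    # Solr reports status 0 in the response header when the query succeeded.
    if data.get("responseHeader", {}).get("status") != 0:
        problems.append("non-zero status")

    response = data.get("response", {})
    docs = response.get("docs", [])
    if response.get("numFound", 0) < len(docs):
        problems.append("numFound is smaller than the number of docs returned")

    # Every document should carry the fields the UI expects to render.
    for index, doc in enumerate(docs):
        for field in required_fields:
            if field not in doc:
                problems.append("doc {} is missing '{}'".format(index, field))

    return problems
```

In a test suite you would fetch raw from the search server and assert that the returned list is empty.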
I’m assuming you don’t want marketers injecting their sales pitches into your search results.
Solr doesn’t really have a provision for users and privileges, which might have horrified you earlier when I implied you could make your Solr server public-facing. Fortunately, this is easily managed.
Configure access to Solr endpoints on the web server. Some things to keep in mind:
- Lock down any /update or /admin requests at this level
- If you don’t plan on blocking /update and /admin here, move them to a different endpoint than the traditional Solr URI scheme and restrict access to an IP range or require authentication
- Make sure to use invariants in the Solr server configuration for any filter rules that must always apply
- Worried about someone requesting 1 million rows of data? Look into Apache’s mod_headers or write a simple proxy app. This also works for preventing deep dives (start = 50000000)
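If you go the "simple proxy app" route, the core of it can be quite small. The sketch below only covers the sanitizing step: it rejects any path other than the select handler and clamps rows and start. The path, the limits, and the function names are assumptions to adapt to your own setup, not Solr defaults.

```python
from urllib.parse import parse_qsl, urlencode

# Assumed values for this sketch; tune them for your own index.
ALLOWED_PATHS = {"/solr/select"}
DEFAULT_ROWS = 10
MAX_ROWS = 100
MAX_START = 1000


def _clamp(value, default, upper):
    """Parse an integer parameter and force it into the range 0..upper."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return default
    return max(0, min(number, upper))


def sanitize(path, query_string):
    """Return a safe query string to forward to Solr, or None to refuse."""
    if path not in ALLOWED_PATHS:
        return None  # /update, /admin and anything else never get through

    pairs = parse_qsl(query_string, keep_blank_values=True)
    requested = dict(pairs)

    # Drop whatever rows/start the client sent and append clamped values.
    kept = [(key, value) for key, value in pairs if key not in ("rows", "start")]
    kept.append(("rows", _clamp(requested.get("rows"), DEFAULT_ROWS, MAX_ROWS)))
    kept.append(("start", _clamp(requested.get("start"), 0, MAX_START)))
    return urlencode(kept)
```

For example, sanitize("/solr/select", "q=ipod&rows=1000000&start=50000000") yields "q=ipod&rows=100&start=1000", while any request to /solr/update is refused outright.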
SEO and Accessibility
You don’t have to worry about SEO on search pages.
Don’t believe me? Good. Who knows how Google and friends ultimately decide which content matches a query? If your SEO-practor says results pages matter then you will need a “Plan B.”
And while you are at it, this page can be really helpful as a fallback for unconventional browsers and for visitors with disabilities.
Check out the ajax-solr project if you’d like to start working with some of these methods.