Over a million developers have joined DZone.

Visualizing Search Keyword Overlap

Visualizing how often search keywords overlap when looking for pest control experts.

· Big Data Zone

Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.

The other day someone asked me to look at the search data for Debug Pest Control, a pest management company based in Rhode Island. One of the things I wanted to visualize was how the search terms overlapped with each other.

To do this, I created a graph where the nodes are the keywords and edges join nodes that share a word. (A “keyword” is actually a set of words, whatever someone types into a search.) The result was a solid gray blob be cause the the nodes are so densely connected.

I thinned the graph by first only looking at keywords that have occurred at least twice. Then I made my connection criteria a little stronger. I take the union of the individual words in two keywords and create a connection if at least 1/3 of these words appear in both keywords.

(You can click on the image to see a larger version.)

The area of each red dot is proportional to the number of times someone has come to the site using that keyword. You can see that there’s long-tail behavior: while some dots are much larger than others, there are a lot of little dots.

In case you’re curious, these are the top keywords:

  1. pest control
  2. ri
  3. pest control ri
  4. pest control rhode island,
  5. plants that keep bees away
  6. poisonous snakes in ri
  7. plants that keep bees and wasps away
  8. plants to keep bees away
  9. plants that keep wasps away
  10. pest control companies 

The keywords were filtered a little before making the graph. Some of the top keywords contained “debug.” These may have been people using the term to refer to eliminating pests from their property, but more likely they were people who knew the company name. By removing “debug” from each of the keywords, the results focus on people searching for the company’s services rather than the company itself.

Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

Topics:
visualization ,search engine

Published at DZone with permission of John Cook, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}