Compete.com Webstats Scrape Groovy

This script collects webstats data from compete.com. It takes as input a list of domains to analyze and writes the compete.com webstats data for each domain to a CSV file.

import com.gargoylesoftware.htmlunit.WebClient
import com.gargoylesoftware.htmlunit.BrowserVersion

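// Input: one domain per line. Output: a CSV file that is recreated on every run.
def domainList = new File("/root/Desktop/Morningstar/AlexaTop3000.txt").readLines()
def outFile = new File("/root/Desktop/Morningstar/CompeteStats3000.csv")
outFile.delete()
// HtmlUnit client that identifies itself as Firefox 3.6
def wc = new WebClient( BrowserVersion.FIREFOX_3_6 )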

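// Fetch the CSV export for each domain and append its data rows to the output file
domainList.each {
  def domainName = it.trim()
  if ( !domainName ) return   // skip blank lines in the input list
  println domainName
  def url = "http://siteanalytics.compete.com/export_csv/${domainName}/"
  def page = wc.getPage( url )
  // Read the raw response body; this works whatever page type HtmlUnit returns
  def pageLines = page.getWebResponse().getContentAsString().split("\n")

  // Skip the first four lines of each export and prefix the rest with the domain name
  pageLines.eachWithIndex { line, lineCount ->
    if ( lineCount > 3 ) {
      outFile.append( "\"${domainName}\",${line}\n" )
    }
  }
  sleep( 400 )   // pause 400 ms between requests
}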
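
The row handling in the loop above can be tried without any network access. The following is a minimal sketch, not part of the original script: `prefixRows` is a hypothetical helper, and the sample text is made up to stand in for a downloaded export, so the real compete.com layout may differ.

```groovy
// Hypothetical helper: drop the leading lines of one export and
// prefix every remaining row with the quoted domain name.
def prefixRows(String domainName, String csvText, int headerLines = 4) {
    csvText.readLines()
           .drop(headerLines)                        // same effect as lineCount > 3
           .collect { "\"${domainName}\",${it}" }    // one output row per data line
}

// Made-up sample standing in for a downloaded export (assumed layout).
def sample = '''Title line
Generated line

Date,Unique Visitors
2010-01,1000
2010-02,1200
'''

prefixRows('example.com', sample).each { println it }
```

Keeping this step in its own function makes it easy to change the number of skipped lines if the export format turns out to have a different header length.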