Automatically Collect and Process Visitors’ IP Addresses
(NOTE: A version of this article posted previously contained incorrect information. The version below corrects those errors. The author apologises for any inconvenience.)
I’m not a programmer. I’m an art collection manager with a fierce DIY streak that has helped me to develop a database application, and build and manage a website that incorporates it. Ever since I accidentally lobotomised my first Windows 3.1 computer, I’ve taught myself how to seek out, find and apply the information I need, sometimes through long hours of trial and error, and I owe almost all of it to the Internet. If it weren’t for people’s willingness to share information for free on innumerable forums and websites like this one, I would not have been able even to scratch the surface of completing the sorts of tasks that are now behind me.
In that spirit of sharing, I thought I might humbly offer the solution I’ve cobbled together from various sources, and tweaked to automate tasks related to collecting the IP addresses of visitors to my website. I’m sure it’s nothing earth-shattering to an experienced coder, but it works well for me, and I’ve never seen a complete solution like it presented anywhere on the Internet before. It is a Windows-centric solution, since that’s the platform I’ve always used.
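The first step is to gather and record the IP addresses of website visitors. I’ve chosen to do this using PHP. Insert this code into the HTML of each page for which you’d like to capture IP addresses, just before the closing </body> tag. The page has to be processed by PHP for the code to run, which on most hosts means giving it a .php extension. Because the two HTTP_* headers are sent by the visitor or a proxy and can contain anything, the code checks that the value is a well-formed IP address before writing it, and otherwise falls back to REMOTE_ADDR:

    <?php
    if (!empty($_SERVER['HTTP_CLIENT_IP'])) {
        // IP from a shared internet connection
        $ipaddress = $_SERVER['HTTP_CLIENT_IP'];
    } elseif (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        // IP passed on by a proxy; the header may hold a comma-separated list
        $parts = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
        $ipaddress = trim($parts[0]);
    } else {
        $ipaddress = $_SERVER['REMOTE_ADDR'];
    }
    // Header values can be forged, so accept only a well-formed IP address
    if (filter_var($ipaddress, FILTER_VALIDATE_IP) === false) {
        $ipaddress = $_SERVER['REMOTE_ADDR'];
    }
    $file = 'filename.txt'; // the file to which the IP address will be written; name it your way
    $fp = fopen($file, 'a');
    if ($fp) {
        fwrite($fp, $ipaddress . "\r\n");
        fclose($fp);
    }
    ?>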
If you want to record the IP addresses for each webpage to a different file, use a different name each time for filename.txt in the above example.
Next, create the blank filename.txt file(s) in the same directory of your web server in which these html files reside. Now each time a visitor loads these pages, their IP address will be written to the text file(s) you’ve indicated.
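For readers who find the header logic easier to follow outside PHP, here is a minimal Python sketch of the same decision: look at the client-IP header, then the forwarded-for header, then fall back to the connection address. The function name pick_client_ip and the plain dictionary standing in for PHP's $_SERVER are my own assumptions for illustration. The sketch also rejects any header value that is not a well-formed IP address, since those headers can be forged.

```python
import ipaddress


def pick_client_ip(headers, remote_addr):
    """Return the first valid IP found in the forwarding headers, else remote_addr.

    `headers` is a plain dict standing in for PHP's $_SERVER (an assumption
    made for this sketch). Header values are untrusted input.
    """
    candidates = [
        headers.get("HTTP_CLIENT_IP", ""),
        # X-Forwarded-For may hold "client, proxy1, proxy2"; take the first entry
        headers.get("HTTP_X_FORWARDED_FOR", "").split(",")[0],
    ]
    for value in candidates:
        value = value.strip()
        if not value:
            continue
        try:
            # ip_address() raises ValueError for anything that is not an IP
            return str(ipaddress.ip_address(value))
        except ValueError:
            continue
    return remote_addr
```

Calling pick_client_ip({"HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.1"}, "198.51.100.2") picks the first forwarded address, while a header containing anything other than an address is ignored in favour of the connection address.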
Next, you’ll need a way to download the text file(s) from the server to your local machine. If you’re writing to multiple files on the server, I’ve found it’s helpful to download them separately, then combine them into one list. Also, I like to sort the list and remove duplicate entries (you’ll see why a little later). Following is a VBScript file that does all of that. Let’s designate it C:\Folder\Subfolder\DloadCmbnDdupe.vbs. In the script below, replace YourWebsite.com with the domain name of your website, C:\Folder\Subfolder with the actual location on your local computer and filename*.txt with the file name(s) on the web server to which you’ve chosen to write.

    Option Explicit
    On Error Resume Next

    Download "http://www.YourWebsite.com/filename.txt", _
             "C:\Folder\Subfolder\filename.txt"
    Download "http://www.YourWebsite.com/filename2.txt", _
             "C:\Folder\Subfolder\filename2.txt"
    Download "http://www.YourWebsite.com/filename3.txt", _
             "C:\Folder\Subfolder\filename3.txt"
    CmbnDdupe()

    If Err <> 0 Then
        Wscript.echo "Error Type = " & Err.Description
    End If
    WScript.Quit

    '-----------------------------------------------------------------------------------------
    Function Download(strURL, strPath)
        Dim i, objFile, objFSO, objHTTP, strFile, strMsg
        Const ForReading = 1, ForWriting = 2, ForAppending = 8
        Set objFSO = CreateObject("Scripting.FileSystemObject")
        ' If strPath is a folder, take the file name from the end of the URL
        If objFSO.FolderExists(strPath) Then
            strFile = objFSO.BuildPath(strPath, Mid(strURL, InStrRev(strURL, "/") + 1))
        ElseIf objFSO.FolderExists(Left(strPath, InStrRev(strPath, "\") - 1)) Then
            strFile = strPath
        Else
            WScript.Echo "ERROR: Target folder not found."
            Exit Function
        End If
        Set objFile = objFSO.OpenTextFile(strFile, ForWriting, True)
        Set objHTTP = CreateObject("WinHttp.WinHttpRequest.5.1")
        objHTTP.Open "GET", strURL, False
        objHTTP.Send
        ' Write the response to the local file one byte at a time
        For i = 1 To LenB(objHTTP.ResponseBody)
            objFile.Write Chr(AscB(MidB(objHTTP.ResponseBody, i, 1)))
        Next
        objFile.Close()
    End Function
    '-----------------------------------------------------------------------------------------
    Function CmbnDdupe()
        Dim shell
        Set shell = CreateObject("WScript.Shell")
        ' Third argument True: wait for the batch file to finish before returning
        shell.Run "CmbnDdupe.bat", , True
        Set shell = Nothing
    End Function
You’ll notice that Function CmbnDdupe calls a batch file, CmbnDdupe.bat, in the same directory (C:\Folder\Subfolder). Here it is, below. Again, substitute filename*.txt with the file name(s) you used at the beginning.
    @echo off
    for %%x in (filename.txt) do type %%x>>templist
    for %%x in (filename2.txt) do type %%x>>templist
    for %%x in (filename3.txt) do type %%x>>templist
    ren templist IPlist.txt
    setlocal disableDelayedExpansion
    set file=IPlist.txt
    set "sorted=%file%.sorted"
    set "deduped=%file%.deduped"
    ::Define a variable containing a linefeed character
    set LF=^


    ::The 2 blank lines above are critical, do not remove
    sort "%file%" >"%sorted%"
    >"%deduped%" (
      set "prev="
      for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%sorted%") do (
        set "ln=%%A"
        setlocal enableDelayedExpansion
        if /i "!ln!" neq "!prev!" (
          endlocal
          (echo %%A)
          set "prev=%%A"
        ) else endlocal
      )
    )
    >nul move /y "%deduped%" "%file%"
    del "%sorted%"
    exit
This routine combines the downloaded files into one (templist), then renames it to IPlist.txt. It then sorts the IP addresses into ascending order, saving to IPlist.txt.sorted, and then removes any duplicates, saving to IPlist.txt.deduped. Finally, it moves (overwrites and deletes) IPlist.txt.deduped to IPlist.txt and deletes IPlist.txt.sorted, leaving behind IPlist.txt (the sorted and de-dupe-ified list).
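The combine, sort and de-dupe steps can also be sketched in a few lines of Python, which may make the logic easier to see. This is only an illustration, not a replacement for the batch file: the function name combine_and_dedupe is hypothetical, it takes lists of lines rather than file names, and it sorts the addresses numerically, whereas the batch file's sort command orders them as text (so "10.0.0.2" would come before "9.0.0.1" there).

```python
import ipaddress


def combine_and_dedupe(*lists_of_lines):
    """Merge several lists of text lines into one sorted list of unique IPs."""
    unique = set()
    for lines in lists_of_lines:
        for line in lines:
            text = line.strip()  # drops the trailing "\r\n" written by the server
            if not text:
                continue
            try:
                unique.add(ipaddress.ip_address(text))
            except ValueError:
                continue  # skip anything that is not an IP address
    # IPv4 and IPv6 objects cannot be compared directly, so sort on
    # (version, numeric value) instead
    ordered = sorted(unique, key=lambda ip: (ip.version, int(ip)))
    return [str(ip) for ip in ordered]
```

For example, combine_and_dedupe(["10.0.0.2\r\n", "9.0.0.1\r\n"], ["10.0.0.2\r\n"]) gives a two-entry list with 9.0.0.1 first.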
Now, we have a list of one IP address per visitor to the pages from which we’re collecting. At this point I like to ping each IP address to collect whatever information is available about it. This is why I remove the duplicate entries, which are caused by a visitor viewing more than one of the collecting pages, or by a visitor returning to the pages. I don’t need to waste time and bandwidth pinging the same IP address more than once. If I want to see which IPs visited which pages multiple times, I can always just look at filename.txt, filename2.txt and filename3.txt. I’ve named the ping routine PingList.vbs, and put it in C:\Folder\Subfolder. Here it is:

    Option Explicit
    On Error Resume Next

    Dim srcFile
    srcFile = "IPlist.txt"
    PingList(srcFile)

    If Err <> 0 Then
        Wscript.echo "Error Type = " & Err.Description
    End If
    WScript.Quit

    '-----------------------------------------------------------------------------------------
    Function PingList(srcFile)
        Dim objFSO
        Dim objShell
        Dim strCommand
        Dim opnFile
        Dim strText
        Dim logFile
        Set objFSO = CreateObject("Scripting.FileSystemObject")
        Set objShell = Wscript.CreateObject("Wscript.Shell")
        logFile = "Log.txt"
        If objFSO.FileExists(srcFile) Then
            Set opnFile = objFSO.OpenTextFile(srcFile, 1)
            Do While Not opnFile.AtEndOfStream
                strText = opnFile.ReadLine
                If Trim(strText) <> "" Then
                    strCommand = strText
                    objShell.run "%comspec% /c ping -a -n 1 " & strText & " >> " & logFile, , True
                End If
            Loop
            opnFile.Close
        Else
            WScript.Echo "File '" & srcFile & "' was not found."
        End If
    End Function
This script pings each IP address in IPlist.txt, resolves the hostname if possible and writes the results to Log.txt in the same directory. Go ahead and create a blank Log.txt now in C:\Folder\Subfolder.
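The hostname part of that step (what ping's -a switch does) can be sketched in Python with a reverse DNS lookup. Note that this is a swapped-in technique, not what the script above does: it only looks up the name and sends no ping. The function name resolve_hostnames and its lookup parameter are my own inventions for illustration; the parameter exists so the lookup can be replaced when trying the code without a network connection.

```python
import ipaddress
import socket


def resolve_hostnames(ip_list, lookup=socket.gethostbyaddr):
    """Map each valid IP in ip_list to its hostname, or None if it has none.

    `lookup` defaults to socket.gethostbyaddr, which returns a tuple of
    (hostname, aliases, addresses) and raises an OSError subclass on failure.
    """
    results = {}
    for text in ip_list:
        try:
            ip = str(ipaddress.ip_address(text.strip()))
        except ValueError:
            continue  # never pass a line that is not an IP address any further
        try:
            results[ip] = lookup(ip)[0]
        except OSError:
            results[ip] = None  # no reverse DNS record for this address
    return results
```

Checking each line with ipaddress.ip_address before using it is a deliberate choice here: the list originates from visitors' request headers, so it is safer not to hand its contents to anything else unexamined.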
The next thing you’ll want to do is clear the contents of the text files on the server, so that they will retain only the IP addresses from new visits. Create the following file in C:\Folder\Subfolder. I’ve named it Upload.cmd. It's critical that it has the .cmd file type.

    @echo off
    echo user YourUsername> ftpcmd.dat
    echo YourPassword>> ftpcmd.dat
    echo bin>> ftpcmd.dat
    echo cd /YourWebDirectory/>> ftpcmd.dat
    echo prompt>> ftpcmd.dat
    echo mput %1 %2 %3>> ftpcmd.dat
    echo rename filename_.txt filename.txt>> ftpcmd.dat
    echo rename filename2_.txt filename2.txt>> ftpcmd.dat
    echo rename filename3_.txt filename3.txt>> ftpcmd.dat
    echo quit>> ftpcmd.dat
    ftp -n -s:ftpcmd.dat ftp.YourWebsite.com
    del ftpcmd.dat
    exit
Replace YourUsername and YourPassword with the username and password with which you access your website files, and YourWebDirectory with the location of your website files on the server. In C:\Folder\Subfolder, create the blank text file(s) that will overwrite the ones on the server. Give them a different name (for instance, add an underscore), as you’ll want to distinguish them from the files you downloaded at the beginning of this exercise. Hence, the blank filename_.txt will be copied to the server and renamed to filename.txt, overwriting the existing file. The number of percent-sign-plus-integer parameters needs to correspond with the number of files you upload and overwrite; in this case, three (%1 %2 %3 = filename_.txt, filename2_.txt and filename3_.txt). Replace YourWebsite.com with the domain name of your website.
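The same "blank out the server files" idea can be sketched with Python's standard ftplib module. This is an illustration under assumptions rather than a drop-in replacement: the function names blank_remote_files and clear_logs are hypothetical, and instead of uploading a blank file and renaming it, the sketch stores empty content directly under the final file name.

```python
import io
from ftplib import FTP


def blank_remote_files(ftp, names):
    """Overwrite each named file in the current server directory with nothing."""
    for name in names:
        # STOR replaces the remote file with the (empty) stream's contents
        ftp.storbinary("STOR " + name, io.BytesIO(b""))


def clear_logs(host, user, password, directory, names):
    """Log in, change to the web directory and blank the given files."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(directory)
        blank_remote_files(ftp, names)
```

A call such as clear_logs("ftp.YourWebsite.com", "YourUsername", "YourPassword", "/YourWebDirectory/", ["filename.txt", "filename2.txt", "filename3.txt"]) would mirror the placeholders used in Upload.cmd.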
Before moving to the final step, create a blank text file named ErrorLog.txt in C:\Folder\Subfolder. This is where we’ll record any errors encountered during the execution of the combined routines.
Now to put it all together and automate it. Create the following batch file (I’ve named it DLPingUL.bat) and put it in C:\.

    @echo off
    title Download Ping Upload - Scheduled task, please wait
    cd "C:\Folder\Subfolder"
    start "" /wait CScript DloadCmbnDdupe.vbs 2>> ErrorLog.txt
    start "" /wait Upload.cmd filename_.txt filename2_.txt filename3_.txt 2>> ErrorLog.txt
    copy /d /y /a IPlist.txt NewIPlist.txt /a 2>> ErrorLog.txt
    start "" /wait CScript PingList.vbs 2>> ErrorLog.txt
    del IPlist.txt
    exit
The reason we’ve put this in C:\ is so that Windows Task Scheduler will have no problem with permissions when running it. This routine moves to the directory in which you’ve stored all of the relevant files (C:\Folder\Subfolder; replace it with the actual location); executes the VBScript file that creates the list; executes the upload, passing the file names of the blank replacement files to the echo mput %1 %2 %3 command; copies IPlist.txt to NewIPlist.txt, overwriting the latter if it exists (this is so you have a list to which to refer if you want); and executes the VBScript file that pings the addresses and records the results. Finally, it deletes IPlist.txt, as it needs to be created programmatically each time. Any errors are recorded in C:\Folder\Subfolder\ErrorLog.txt.
The final step is to create a new task in Task Scheduler that runs C:\DLPingUL.bat at a time of your choosing. I run it every Saturday at 3:00 am, so that when I wake up I have C:\Folder\Subfolder\Log.txt waiting for me with all of its pinged IP address information. Having the hostname can be especially helpful; it can show you which bots and spiders crawled your site, or from which corporation the visit originated. The only manual task I do is any further research on those IP addresses, like running them through Whois or whatismyipaddress.com/ip-lookup . These sites help me to determine, for example, if an IP address is static or dynamic, or if it is associated with hackers or spammers, among other useful bits of information. When I’m finished, I clear the contents of Log.txt so that it is ready for next time.
If you’ve read all the way down to here then you’ve been very patient with me, and for that I thank you kindly.
Addendum: For best results, when copying the above code, click on "View Source" in the upper right corner of the code box, and copy and paste the source. This will ensure consistency.
_____________________________________________
Phillip Schubert is the founder of Schubert & Associates
www.schubertassociates.com.au