
Automatically Collect and Process Visitors’ IP Addresses



(NOTE: A version of this article posted previously contained incorrect information. The below version corrects those errors. The author apologises for any inconvenience.)

I’m not a programmer. I’m an art collection manager with a fierce DIY streak that has helped me to develop a database application, and build and manage a website that incorporates it. Ever since I accidentally lobotomised my first Windows 3.1 computer, I’ve taught myself how to seek out, find and apply the information I need, sometimes through long hours of trial and error, and I owe almost all of it to the Internet. If it weren’t for people’s willingness to share information for free on innumerable forums and websites like this one, I would not have been able even to scratch the surface of completing the sorts of tasks that are now behind me.

In that spirit of sharing, I thought I might humbly offer the solution I’ve cobbled together from various sources, and tweaked to automate tasks related to collecting the IP addresses of visitors to my website. I’m sure it’s nothing earth-shattering to an experienced coder, but it works well for me, and I’ve never seen a complete solution like it presented anywhere on the Internet before. It is a Windows-centric solution, since that’s the platform I’ve always used.

The first step is to gather and record the IP addresses of website visitors. I’ve chosen to do this using PHP. Insert this code into each page for which you’d like to capture IP addresses, just before the closing </body> tag. (The server must run the page through PHP for the code to execute, which usually means the file has a .php extension.)
<?php

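// HTTP_CLIENT_IP and HTTP_X_FORWARDED_FOR are request headers that a proxy (or the visitor) can set,
// so each candidate is validated before use and REMOTE_ADDR is the fallback.
$ipaddress = $_SERVER['REMOTE_ADDR'];

foreach (array('HTTP_CLIENT_IP', 'HTTP_X_FORWARDED_FOR') as $header)
    {
      if (!empty($_SERVER[$header]))
        {
          // X-Forwarded-For may hold a comma-separated chain; the first entry is the originating client
          $parts = explode(',', $_SERVER[$header]);
          $candidate = trim($parts[0]);
          if (filter_var($candidate, FILTER_VALIDATE_IP) !== false)
            {
              $ipaddress = $candidate;
              break;
            }
        }
    }

$file = 'filename.txt';  //this is the file to which the IP address will be written; name it your way.

// FILE_APPEND adds to the end of the file; LOCK_EX keeps simultaneous visits from interleaving lines
file_put_contents($file, $ipaddress."\r\n", FILE_APPEND | LOCK_EX);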

?>

If you want to record the IP addresses for each webpage to a different file, use a different name each time for filename.txt in the above example.

Next, create the blank filename.txt file(s) in the same directory of your web server in which these html files reside. Now each time a visitor loads these pages, their IP address will be written to the text file(s) you’ve indicated.

Next, you’ll need a way to download the text file(s) from the server to your local machine. If you’re writing to multiple files on the server, I’ve found it’s helpful to download them separately, then combine them into one list. I also like to sort the list and remove duplicate entries (you’ll see why a little later). The following VBScript does all of that. Let’s call it C:\Folder\Subfolder\DloadCmbnDdupe.vbs. In the script, replace YourWebsite.com with the domain name of your website, C:\Folder\Subfolder with the actual location on your local computer, and filename*.txt with the name(s) of the file(s) on the web server to which you’ve chosen to write.
Option Explicit
On Error Resume Next

    Download "http://www.YourWebsite.com/filename.txt", _
        "C:\Folder\Subfolder\filename.txt"
    Download "http://www.YourWebsite.com/filename2.txt", _
        "C:\Folder\Subfolder\filename2.txt"
    Download "http://www.YourWebsite.com/filename3.txt", _
        "C:\Folder\Subfolder\filename3.txt"

    CmbnDdupe()

    If Err <> 0 Then
        Wscript.echo "Error Type = " & Err.Description
    End If

    WScript.Quit
'-----------------------------------------------------------------------------------------
Function Download(strURL, strPath)

    Dim i, objFile, objFSO, objHTTP, strFile, strMsg
    Const ForReading = 1, ForWriting = 2, ForAppending = 8

    Set objFSO = CreateObject("Scripting.FileSystemObject")

    If objFSO.FolderExists(strPath) Then
        strFile = objFSO.BuildPath(strPath, Mid(strURL, InStrRev(strURL, "/") + 1))
    ElseIf objFSO.FolderExists(Left(strPath, InStrRev(strPath, "\") - 1)) Then
        strFile = strPath
    Else
        WScript.Echo "ERROR: Target folder not found."
        Exit Function
    End If

    Set objFile = objFSO.OpenTextFile(strFile, ForWriting, True)

    Set objHTTP = CreateObject("WinHttp.WinHttpRequest.5.1")

    objHTTP.Open "GET", strURL, False
    objHTTP.Send

    For i = 1 To LenB(objHTTP.ResponseBody)
        objFile.Write Chr(AscB(MidB(objHTTP.ResponseBody, i, 1)))
    Next

    objFile.Close()

End Function
'-----------------------------------------------------------------------------------------
Function CmbnDdupe()

    Dim shell

    Set shell=createobject("wscript.shell")

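    'Wait for the batch file to finish (third argument True) so IPlist.txt is complete before the script returns
    shell.run "CmbnDdupe.bat", 1, True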

    Set shell=nothing

End Function

You’ll notice that Function CmbnDdupe calls a batch file, CmbnDdupe.bat, in the same directory (C:\Folder\Subfolder). Here it is, below. Again, substitute filename*.txt with the file name(s) you used at the beginning.
@echo off

for %%x in (filename.txt) do type %%x>>templist
for %%x in (filename2.txt) do type %%x>>templist
for %%x in (filename3.txt) do type %%x>>templist

ren templist IPlist.txt

setlocal disableDelayedExpansion
set file=IPlist.txt
set "sorted=%file%.sorted"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
sort "%file%" >"%sorted%"
>"%deduped%" (
  set "prev="
  for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%sorted%") do (
    set "ln=%%A"
    setlocal enableDelayedExpansion
    if /i "!ln!" neq "!prev!" (
      endlocal
      (echo %%A)
      set "prev=%%A"
    ) else endlocal
  )
)
>nul move /y "%deduped%" "%file%"
del "%sorted%"

exit

This routine combines the downloaded files into one (templist), then renames it to IPlist.txt. It then sorts the IP addresses into ascending order, saving to IPlist.txt.sorted, and then removes any duplicates, saving to IPlist.txt.deduped. Finally, it moves (overwrites and deletes) IPlist.txt.deduped to IPlist.txt and deletes IPlist.txt.sorted, leaving behind IPlist.txt (the sorted and de-dupe-ified list).
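
One detail worth knowing is that sort compares lines as text, so 10.0.0.5 is listed before 9.0.0.1. That makes no difference to the de-duplication, but if a numerically ordered list is preferred, the combine, sort and de-dupe step could be sketched in Python roughly as follows. This is not part of the routine above; the function name combine_sort_dedupe is made up for illustration, and the sketch assumes one address per line.

```python
import ipaddress


def combine_sort_dedupe(lines):
    """Return the unique, valid IP addresses from an iterable of lines, in numeric order."""
    unique = set()
    for line in lines:
        text = line.strip()
        if not text:
            continue
        try:
            unique.add(ipaddress.ip_address(text))
        except ValueError:
            # Not a valid IPv4/IPv6 address; skip it.
            continue
    # Sort by (version, integer value) so IPv4 and IPv6 can share one list.
    ordered = sorted(unique, key=lambda ip: (ip.version, int(ip)))
    return [str(ip) for ip in ordered]


if __name__ == "__main__":
    # Stand-in for the combined contents of the downloaded files.
    sample = ["10.0.0.5\r\n", "9.0.0.1\r\n", "10.0.0.5\r\n", "not-an-ip\r\n"]
    print(combine_sort_dedupe(sample))
```

Lines that are not valid addresses are dropped rather than kept, which is a design choice of this sketch, not something the batch file does.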

Now, we have a list of one IP address per visitor to the pages from which we’re collecting. At this point I like to ping each IP address to collect whatever information is available about it. This is why I remove the duplicate entries, which are caused by a visitor viewing more than one of the collecting pages, or by a visitor returning to the pages. I don’t need to waste time and bandwidth pinging the same IP address more than once. If I want to see which IPs visited which pages multiple times, I can always just look at filename.txt, filename2.txt and filename3.txt. I’ve named the ping routine PingList.vbs, and put it in C:\Folder\Subfolder. Here it is:
Option Explicit
On Error Resume Next

    Dim srcFile

    srcFile = "IPlist.txt"

    PingList(srcFile)

    If Err <> 0 Then
        Wscript.echo "Error Type = " & Err.Description
    End If

    WScript.Quit
'-----------------------------------------------------------------------------------------
Function PingList(srcFile)

    Dim objFSO
    Dim objShell
    Dim objRegEx
    Dim opnFile
    Dim strText
    Dim logFile

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objShell = Wscript.CreateObject("Wscript.Shell")

    'Each line is placed on a command line, and the list originates from request
    'headers a visitor can set, so only lines made up of characters that can occur
    'in an IPv4 or IPv6 address (digits, hex letters, dots, colons) are pinged.
    Set objRegEx = New RegExp
    objRegEx.Pattern = "^[0-9A-Fa-f:.]+$"

    logFile = "Log.txt"

    If objFSO.FileExists(srcFile) Then
        Set opnFile = objFSO.OpenTextFile(srcFile, 1)
        Do While Not opnFile.AtEndOfStream
            strText = Trim(opnFile.ReadLine)
            If objRegEx.Test(strText) Then
                objShell.run "%comspec% /c ping -a -n 1 " & strText & " >> " & logFile, , True
            End If
        Loop
        opnFile.Close
    Else
        WScript.Echo "File '" & srcFile & "' was not found."
    End If

End Function

This script pings each IP address in IPlist.txt, resolves the hostname if possible and writes the results to Log.txt in the same directory. Go ahead and create a blank Log.txt now in C:\Folder\Subfolder.
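
The -a switch asks ping to do a reverse DNS lookup, which is where the hostname in Log.txt comes from. The same lookup on its own might look like this in Python. It is only a rough sketch: resolve_hostname and its lookup parameter are names invented here, and the parameter exists only so the function can be tried without a network connection.

```python
import socket


def resolve_hostname(ip, lookup=socket.gethostbyaddr):
    """Return the reverse-DNS hostname for ip, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = lookup(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


if __name__ == "__main__":
    # A fake lookup stands in for the real DNS query in this demonstration.
    fake = lambda ip: ("crawler.example.com", [], [ip])
    print(resolve_hostname("192.0.2.1", lookup=fake))
```

Many addresses have no reverse DNS record at all, which is why the function returns None instead of raising an error.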

The next thing you’ll want to do is clear the contents of the text files on the server, so that they will retain only the IP addresses from new visits. Create the following file in C:\Folder\Subfolder. I’ve named it Upload.cmd. It's critical that it has the .cmd file type.
@echo off
echo user YourUsername> ftpcmd.dat
echo YourPassword>> ftpcmd.dat
echo bin>> ftpcmd.dat
echo cd /YourWebDirectory/>> ftpcmd.dat
echo prompt>> ftpcmd.dat
echo mput %1 %2 %3>> ftpcmd.dat
echo rename filename_.txt filename.txt>> ftpcmd.dat
echo rename filename2_.txt filename2.txt>> ftpcmd.dat
echo rename filename3_.txt filename3.txt>> ftpcmd.dat
echo quit>> ftpcmd.dat
ftp -n -s:ftpcmd.dat ftp.YourWebsite.com
del ftpcmd.dat
exit

Replace YourUsername and YourPassword with the username and password with which you access your website files, and YourWebDirectory with the location of your website files on the server. In C:\Folder\Subfolder, create the blank text file(s) that will overwrite the ones on the server. Give them a different name (for instance, add an underscore), as you’ll want to distinguish them from the files you downloaded at the beginning of this exercise. Hence, the blank filename_.txt will be copied to the server and renamed to filename.txt, overwriting the existing file. The number of per-cent-sign-plus-integer combinations (variables) needs to correspond with the number of files you upload and overwrite; in this case, three (%1 %2 %3 = filename_.txt, filename2_.txt and filename3_.txt). Replace YourWebsite.com with the domain name of your website.
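
To make the relationship between the blank local files and their names on the server explicit, here is a Python sketch that builds the same list of FTP commands Upload.cmd writes to ftpcmd.dat. It assumes the underscore naming convention described above. build_ftp_commands is a hypothetical helper, and the username, password and directory are the same placeholders used in Upload.cmd.

```python
import os


def build_ftp_commands(username, password, web_dir, server_names):
    """Return the FTP command lines that replace each server file with a blank copy."""
    blank_names = []
    for name in server_names:
        # Local blank files carry an underscore before the extension:
        # filename.txt -> filename_.txt
        stem, ext = os.path.splitext(name)
        blank_names.append(f"{stem}_{ext}")

    commands = [
        f"user {username}",
        password,
        "bin",
        f"cd {web_dir}",
        "prompt",
        "mput " + " ".join(blank_names),
    ]
    # One rename per file, so the number of files can change freely.
    for blank, target in zip(blank_names, server_names):
        commands.append(f"rename {blank} {target}")
    commands.append("quit")
    return commands


if __name__ == "__main__":
    for line in build_ftp_commands(
        "YourUsername", "YourPassword", "/YourWebDirectory/",
        ["filename.txt", "filename2.txt", "filename3.txt"],
    ):
        print(line)
```

Generating the rename lines from the list of names means the upload and rename steps cannot drift out of step when a file is added or removed.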

Before moving to the final step, create a blank text file named ErrorLog.txt in C:\Folder\Subfolder. This is where we’ll record any errors encountered during the execution of the combined routines.

Now to put it all together and automate it. Create the following batch file (I’ve named it DLPingUL.bat) and put it in C:\.
@echo off
title Download Ping Upload - Scheduled task, please wait

cd /d "C:\Folder\Subfolder"
start "" /wait CScript DloadCmbnDdupe.vbs 2>> ErrorLog.txt
start "" /wait Upload.cmd filename_.txt filename2_.txt filename3_.txt 2>> ErrorLog.txt
copy /d /y /a IPlist.txt NewIPlist.txt /a 2>> ErrorLog.txt
start "" /wait CScript PingList.vbs 2>> ErrorLog.txt
del IPlist.txt
exit

The reason we’ve put this in C:\ is so that Windows Task Scheduler will have no problem with permissions when running it. This routine moves to the directory in which you’ve stored all of the relevant files (C:\Folder\Subfolder; substitute the actual location); executes the VBScript that creates the list; executes the upload, passing the file names of the blank replacement files to the mput %1 %2 %3 command; copies IPlist.txt to NewIPlist.txt, overwriting the latter if it exists (so that you have a list to refer to if you want); and executes the VBScript that pings the addresses and records the results. Finally, it deletes IPlist.txt, as it needs to be created programmatically each time. Any errors are recorded in C:\Folder\Subfolder\ErrorLog.txt.

The final step is to create a new task in Task Scheduler that runs C:\DLPingUL.bat at a time of your choosing. I run it every Saturday at 3:00 am, so that when I wake up I have C:\Folder\Subfolder\Log.txt waiting for me with all of its pinged IP address information. Having the hostname can be especially helpful; it can show you which bots and spiders crawled your site, or from which corporation the visit originated. The only manual task I do is any further research on those IP addresses, like running them through Whois or whatismyipaddress.com/ip-lookup. These sites help me to determine, for example, if an IP address is static or dynamic, or if it is associated with hackers or spammers, among other useful bits of information. When I’m finished, I clear the contents of Log.txt so that it is ready for next time.
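
If Log.txt grows long, the hostname lines can be pulled out of it with a few lines of Python. The sketch below assumes English-language Windows ping output, in which a resolved host appears as "Pinging host.example.com [203.0.113.7] with 32 bytes of data:" and an unresolved one as "Pinging 203.0.113.7 with 32 bytes of data:". Other locales or ping versions may word this differently. summarize_ping_log is a hypothetical name.

```python
import re

# Matches "Pinging <name> [<ip>] with ..." (hostname resolved)
# or "Pinging <ip> with ..." (no hostname available).
PING_HEADER = re.compile(r"^Pinging (\S+)(?: \[([^\]]+)\])? with ")


def summarize_ping_log(lines):
    """Return (ip, hostname_or_None) pairs found in ping output lines."""
    results = []
    for line in lines:
        match = PING_HEADER.match(line.strip())
        if not match:
            continue
        first, bracketed = match.groups()
        if bracketed:
            # Resolved: the name comes first and the address is in brackets.
            results.append((bracketed, first))
        else:
            # Unresolved: only the address is present.
            results.append((first, None))
    return results


if __name__ == "__main__":
    # Stand-in for a few lines of Log.txt.
    sample = [
        "Pinging crawl.example.com [203.0.113.7] with 32 bytes of data:",
        "Reply from 203.0.113.7: bytes=32 time=20ms TTL=54",
        "Pinging 198.51.100.2 with 32 bytes of data:",
        "Request timed out.",
    ]
    for ip, host in summarize_ping_log(sample):
        print(ip, host or "(no hostname)")
```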

If you’ve read all the way down to here then you’ve been very patient with me, and for that I thank you kindly.

Addendum: For best results, when copying the above code, click on "View Source" in the upper right corner of the code box, and copy and paste the source. This will ensure consistency.

_____________________________________________
Phillip Schubert is the founder of Schubert & Associates
www.schubertassociates.com.au
