Another Perl Crawler

Another one found in my old source folder; it may not fully work.

This Perl script reads the existing links from links.dat into the array @bigarray. It then loops over the array, connects to each link's host, fetches its front page, and appends the href values it finds back to links.dat. Run in a loop, the script would keep growing links.dat with every address it encounters.


#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket;
use URI;

# Read the existing links from links.dat into memory.
open(LINKS, "<", "links.dat") or die "Couldn't open links.dat: $!";
my @bigarray = ();
while (<LINKS>) {
        chomp;
        push(@bigarray, $_);
}
close(LINKS);

foreach my $uri (@bigarray) {
        # Extract the host from the URI and strip any leading "www.".
        (my $domain = URI->new($uri)->authority) =~ s/^www\.//i;
        my $socket = IO::Socket::INET->new(PeerAddr => $domain,
                                           PeerPort => 80,
                                           Proto    => 'tcp',
                                           Type     => SOCK_STREAM)
                or die "Couldn't connect to $domain";
        # Fetch the front page over plain HTTP/1.0.
        print $socket "GET / HTTP/1.0\r\nHost: $domain\r\n\r\n";
        # Append every href value found in the response to links.dat.
        open(LINKS, ">>", "links.dat") or die "Couldn't append to links.dat: $!";
        while (defined(my $line = <$socket>)) {
                while ($line =~ m{href="(.*?)"}ig) {
                        print LINKS "$1\n";
                }
        }
        close(LINKS);
        close($socket);
}
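
One gap worth noting: the regex stores href values verbatim, so relative links like /about land in links.dat and break the next pass, since URI->new->authority has no host to return for them. A minimal sketch of how the inner loop could absolutize each link with URI's new_abs before saving it (untested, and it assumes every fetched page lives at http://$domain/):

        while (defined(my $line = <$socket>)) {
                while ($line =~ m{href="(.*?)"}ig) {
                        # Resolve relative hrefs against the page we fetched,
                        # so every stored link is an absolute http URL.
                        my $abs = URI->new_abs($1, "http://$domain/");
                        print LINKS $abs->as_string, "\n" if $abs->scheme eq 'http';
                }
        }

To try either version, seed links.dat with one absolute URL such as http://example.com/ and run the script repeatedly; each pass appends whatever links the fetched pages contain.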