Extract The Body Of An HTML Document
For example, this Perl script prints just the contents of the body element of Google's home page:
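use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

my $ua  = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => 'http://www.google.com/');
my $res = $ua->request($req);
if ($res->is_success) {
    # Parse the decoded response body into an element tree.
    my $tree = HTML::TreeBuilder->new_from_content($res->decoded_content);
    $tree->elementify();
    my $body = $tree->find('body');
    # content_list() returns child elements and plain text strings,
    # so as_HTML() is only called on the elements.
    foreach my $e ($body->content_list()) {
        print ref($e) ? $e->as_HTML() : encode_entities($e);
    }
    $tree->delete;    # release the tree's memory
}
else {
    die 'Request failed: ' . $res->status_line . "\n";
}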
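The same extraction can be tried without a network request. The following is a minimal sketch, assuming a hypothetical helper named body_html and a made-up sample document, neither of which comes from the original snippet:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

# body_html() is a hypothetical helper, not part of any module.
# It returns the markup found inside <body> of the given HTML string.
sub body_html {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $body = $tree->find('body');
    # Children are either HTML::Element objects or plain text strings.
    my $out = join '',
        map { ref($_) ? $_->as_HTML() : encode_entities($_) }
        $body->content_list();
    $tree->delete;    # release the tree's memory
    return $out;
}

# Sample input made up for illustration.
my $sample = '<html><head><title>Demo</title></head>'
           . '<body><h1>Hello</h1>plain text</body></html>';
print body_html($sample), "\n";
```

Wrapping the logic in a function that takes a string keeps the parsing step separate from the HTTP fetch, so it can be exercised against saved pages or test fixtures.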