
Extract The Body Of An HTML Document

For example, print out just the body of Google's home page:


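use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTML::TreeBuilder;
use HTML::Entities;

binmode STDOUT, ':encoding(UTF-8)';

my $ua  = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => 'http://www.google.com/');
my $res = $ua->request($req);

if ($res->is_success) {
  # decoded_content applies the response's charset and content encoding
  my $tree = HTML::TreeBuilder->new_from_content($res->decoded_content);
  $tree->elementify();
  my $body = $tree->find('body');
  # content_list returns child elements as objects and text as plain strings
  foreach my $e ($body->content_list())
  {
    print ref($e) ? $e->as_HTML() : encode_entities($e);
  }
  $tree->delete;  # free the parse tree
}
else {
  die "Request failed: ", $res->status_line, "\n";
}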
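
The same extraction can be tried without a network connection by parsing a literal HTML string. The following is a minimal sketch, not part of the original snippet: the helper name `body_html` and the sample markup are made up for illustration, and passing an empty hash as the third argument to `as_HTML` is an assumption chosen here so that optional end tags such as `</p>` are kept in the output.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Entities;

# Hypothetical helper: return the inner HTML of <body> for an HTML string.
sub body_html {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $body = $tree->find('body');
    my $out  = '';
    if ($body) {
        for my $node ($body->content_list) {
            # Elements are objects; text nodes come back as plain strings.
            # The empty hash asks as_HTML not to omit any end tags.
            $out .= ref($node)
                ? $node->as_HTML(undef, undef, {})
                : encode_entities($node);
        }
    }
    $tree->delete;    # free the parse tree
    return $out;
}

# Made-up sample document with one element and one bare text node in the body.
my $sample = '<html><head><title>t</title></head>'
           . '<body><p>Hello</p>world</body></html>';
print body_html($sample), "\n";
```

Wrapping the loop in a function makes it easy to reuse on a fetched page: pass `$res->decoded_content` instead of the sample string.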
