Extract The Body Of An HTML Document

by Snippets Manager · Jan. 04, 07 · Code Snippet

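This snippet fetches a page with LWP::UserAgent, parses it with HTML::TreeBuilder, and prints only the contents of its body element. For example, to print just the body of Google's home page: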


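use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

# Fetch the page.
my $ua  = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => 'http://www.google.com/');
my $res = $ua->request($req);

if ($res->is_success) {
  # Parse the response into a tree of HTML::Element nodes.
  my $tree = HTML::TreeBuilder->new_from_content($res->decoded_content);
  $tree->elementify();
  my $body = $tree->find('body');
  # Children of body are either HTML::Element objects or plain text strings;
  # calling as_HTML() on a plain string would die, so handle both cases.
  foreach my $e ($body->content_list()) {
    print ref $e ? $e->as_HTML() : encode_entities($e);
  }
  $tree->delete;  # release the parse tree
}
else {
  die $res->status_line, "\n";
}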
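
The same extraction can be tried without a network request by parsing a string directly. The following is a minimal sketch, not part of the original snippet: the helper name body_html and the sample markup are made up for illustration, and it assumes HTML::TreeBuilder and HTML::Entities are installed.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

# body_html is a hypothetical helper: it returns the serialized
# children of the body element of the given HTML string.
sub body_html {
  my ($html) = @_;
  my $tree = HTML::TreeBuilder->new_from_content($html);
  my $body = $tree->find_by_tag_name('body');
  my $out  = '';
  if ($body) {
    foreach my $e ($body->content_list()) {
      # Element children are objects; text children are plain strings.
      $out .= ref $e ? $e->as_HTML() : encode_entities($e);
    }
  }
  $tree->delete;  # release the parse tree
  return $out;
}

# Sample markup invented for this example.
my $sample = '<html><head><title>T</title></head>'
           . '<body><h1>Title</h1>a &amp; b<div>x</div></body></html>';
print body_html($sample), "\n";
```

Wrapping the logic in a function makes it easy to reuse on content from any source, such as the decoded body of an LWP response.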
