in reply to Split file, first 30 lines only
You already got answers that solve your problem but, hoping not to confuse you, I propose another solution (in Perl there is always one!).
You are not forced to read the entire web page, which can be an expensive task when you have a large number of pages.
In fact, get from LWP::UserAgent fetches the whole content unless you instruct it to behave differently. You can specify :content_cb, i.e. a callback that is invoked for every chunk the agent retrieves from the remote server.
This removes the need to download each page in full and only then apply the 30-line logic to it.
Look at the docs of LWP::UserAgent, at this post by master zentara, and at the following working example to get an idea of what I mean:
    use strict;
    use warnings;
    use LWP::UserAgent;

    my @pages = ('http://www.perlmonks.org','http://perldoc.org');
    my $ua = LWP::UserAgent->new;
    # the line count is global
    my $read_lines = 1;

    foreach my $url (@pages){
        # start counting again for each page, even if the previous one was short
        $read_lines = 1;
        my $response = $ua->get($url, ':content_cb'=>\&head_only);
    }

    sub head_only{
        my ($data,$resp,$protocol) = @_;
        my @lines = split "\n", $data;
        foreach my $line (@lines){
            if ($read_lines == 31){
                # reset the line count
                $read_lines = 1;
                print +("=" x 70),"\n";
                # die inside this callback interrupts the request, not the program!!
                # see LWP::UserAgent docs
                die;
            }
            else{
                print "line $read_lines: $line\n"
            }
            $read_lines++;
        }
    }
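One thing to keep in mind with the example above: chunks do not necessarily end on a line boundary, so a line cut in half between two chunks gets counted twice. A minimal sketch of one way around that follows, assuming a hypothetical helper make_head_cb (my own name, not part of LWP) that buffers the unfinished line and keeps its counter in a closure instead of a global. It is fed by hand here so it runs without any network access.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): builds a callback suitable for
# ':content_cb' that collects only the first $limit lines of one response.
sub make_head_cb {
    my ($limit, $sink) = @_;
    my $buffer = '';    # holds an unfinished line between chunks
    my $seen   = 0;     # complete lines collected so far

    return sub {
        my ($chunk) = @_;    # LWP also passes the response and protocol objects
        $buffer .= $chunk;
        # consume complete lines only; a trailing partial line stays buffered
        while ($buffer =~ s/\A([^\n]*)\n//) {
            push @$sink, $1;
            # dying here aborts the current request, not the program
            die "line limit reached\n" if ++$seen >= $limit;
        }
        return;
    };
}

# Offline demonstration: hand over chunks the way the agent would.
my @lines;
my $cb = make_head_cb(3, \@lines);
for my $chunk ("one\ntw", "o\nthree\nfo", "ur\nfive\n") {
    # stop feeding once the callback dies, as the agent would stop reading
    last unless eval { $cb->($chunk); 1 };
}
print "$_\n" for @lines;    # prints one, two, three on separate lines
```

With a real agent this could be hooked up as $ua->get($url, ':content_cb' => make_head_cb(30, \@lines)), building a fresh callback for every URL so the count starts over each time. Note that this sketch never emits a final line that lacks a trailing newline, and that LWP::UserAgent records the die message in the X-Died header of the returned response, so you can check there why a request stopped early.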
HtH
L*