$ perl -e '$/=undef;$t=<>;foreach(3815,3975,3871){$i=index($t,"ID=$_")
+;printf"ID=$_:%d(chunk=%d,offset=%d)\n",$i,int($i/1024),$i%1024}' Alp
+habeticalListing.asp
ID=3815:103836(chunk=101,offset=412)
ID=3975:104688(chunk=102,offset=240)
ID=3871:105271(chunk=102,offset=823)
Ok, perhaps I could have used some spaces on that one-liner, but I was having too much fun this way. Oddly, it seems that 3975 is the first match in its chunk, so I would have expected it to be 3871 that got missed.
You should try this as your loop:
while (1) {
my $buf;
my $n = $http->read_entity_body($buf, 1024);
die "read failed: $!" unless defined $n;
last unless $n;
push @listing, $buf =~ /ID=(\d+)/g;
}
Note that the problem still could exist where the literal string "ID=xxxx" crosses over the boundary - say "ID=3" at the end of one 1024-byte chunk, and "975" at the beginning of the next. It's probably easiest to slurp the whole thing in, and then do a single global match.
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.