$ perl -e '$/=undef;$t=<>;foreach(3815,3975,3871){$i=index($t,"ID=$_");printf"ID=$_:%d(chunk=%d,offset=%d)\n",$i,int($i/1024),$i%1024}' AlphabeticalListing.asp
ID=3815:103836(chunk=101,offset=412)
ID=3975:104688(chunk=102,offset=240)
ID=3871:105271(chunk=102,offset=823)
Ok, perhaps I could have used some spaces on that one-liner, but I was having too much fun this way. Oddly, it seems that 3975 is the first match in its chunk, so I would have expected it to be 3871 that got missed.
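For anyone who would rather read it with the spaces put back, here is a rough equivalent as a small sub. The name locate_ids and the chunk-size parameter are my own inventions for illustration. Unlike the one-liner, it skips any ID that index() cannot find instead of printing a position of -1.

```perl
use strict;
use warnings;

# Hypothetical helper: report where each "ID=<n>" first occurs in $text,
# and which fixed-size chunk (and offset within it) that position falls in.
sub locate_ids {
    my ($text, $chunk_size, @ids) = @_;
    my @report;
    for my $id (@ids) {
        my $pos = index($text, "ID=$id");    # -1 when not found
        next if $pos < 0;                    # the one-liner does not skip these
        push @report, sprintf "ID=%s:%d(chunk=%d,offset=%d)",
            $id, $pos, int($pos / $chunk_size), $pos % $chunk_size;
    }
    return @report;
}

# Possible usage against the same file as the one-liner:
#   open my $fh, '<', 'AlphabeticalListing.asp' or die "open: $!";
#   my $text = do { local $/; <$fh> };       # slurp the whole file
#   print "$_\n" for locate_ids($text, 1024, 3815, 3975, 3871);
```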
You should try this as your loop:
while (1) {
    my $buf;
    my $n = $http->read_entity_body($buf, 1024);
    die "read failed: $!" unless defined $n;
    last unless $n;
    push @listing, $buf =~ /ID=(\d+)/g;
}
Note that this loop can still go wrong when the literal string "ID=xxxx" straddles a chunk boundary: say "ID=3" at the end of one 1024-byte chunk and "975" at the beginning of the next. In that case the per-chunk match sees only "ID=3" and records the wrong number. It's probably easiest to slurp the whole thing in and then do a single global match.
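A minimal sketch of the slurp-then-match approach follows. The names collect_ids and make_reader are hypothetical. The reader is assumed to behave like $http->read_entity_body($buf, $size): it fills the buffer and returns the byte count, 0 at end of body, or undef on error. The in-memory reader exists only so the example runs without a network connection.

```perl
use strict;
use warnings;

# Hypothetical helper: read the entire body through $read_chunk, then
# run one global match so no "ID=nnnn" can be split across reads.
sub collect_ids {
    my ($read_chunk) = @_;
    my $body = '';
    while (1) {
        my $buf;
        my $n = $read_chunk->($buf, 1024);
        die "read failed: $!" unless defined $n;
        last unless $n;
        $body .= $buf;               # accumulate instead of matching per chunk
    }
    return $body =~ /ID=(\d+)/g;     # single global match over the whole body
}

# Stand-in reader over an in-memory string, for demonstration only.
sub make_reader {
    my ($data) = @_;
    my $pos = 0;
    return sub {
        my $size = $_[1];
        $_[0] = substr($data, $pos, $size);   # fills the caller's buffer via alias
        $pos += length $_[0];
        return length $_[0];                  # 0 once the data is exhausted
    };
}

# "ID=3" ends the first 1024-byte chunk and "975" begins the next.
my $data = ('x' x 1020) . 'ID=3975 and later ID=3871';
my @listing = collect_ids(make_reader($data));
print "@listing\n";    # prints "3975 3871"
```

With the real connection, something like collect_ids(sub { $http->read_entity_body($_[0], $_[1]) }) should slot in, assuming $http is the same object as in the loop above.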