Hmmm.
Looking more closely at your regex, it seems you are replacing a sequence of 3072 characters (8 + 8 + 1520 + 8 + 8 + 464 + 1056) with a sequence of 1984 characters (1520 + 464). Thus, if there is one replacement in $block, the statement
$final = pack("B*", substr($block,0,$blocksz));
will include 1088 unchecked characters from $block02. That would explain why it appears to be adding what's left at the end of the boundary. You may have to keep track of the number of substitutions performed and then calculate how many characters you need to include in the pack statement. Maybe something like ...
my $nrrepl = $block =~ s/11110100.{8}(.{1520})11110100.{8}(.{464}).{1056}/$1$2/g;
my $outblocksz = $blocksz - ($nrrepl * 1088);
$final = pack("B*", substr($block,0,$outblocksz)); # this should work
You might then have to be sure that $outblocksz is a multiple of 8. It will be whenever $blocksz is, since each replacement removes 1088 = 8 x 136 characters.
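To check the arithmetic, here is a minimal sketch that runs the same regex over synthetic data. The block contents and the 4096-bit $blocksz are made up purely for illustration; only the regex and the 1088 figure come from your code.

```perl
use strict;
use warnings;

# Hypothetical block size in bits (an assumption, not from the original code).
my $blocksz = 4096;

# One synthetic 3072-character record laid out the way the regex expects.
my $record = "11110100" . ("0" x 8) . ("0" x 1520)
           . "11110100" . ("0" x 8) . ("0" x 464)
           . ("0" x 1056);

my $block01 = $record . ("0" x 1024);   # exactly $blocksz characters
my $block02 = "1" x $blocksz;           # stand-in for the next block
my $block   = $block01 . $block02;

# s///g returns the number of substitutions (false if there were none).
my $nrrepl = $block =~ s/11110100.{8}(.{1520})11110100.{8}(.{464}).{1056}/$1$2/g;

# Each substitution shortens the string by 3072 - 1984 = 1088 characters.
my $outblocksz = $blocksz - ($nrrepl * 1088);
my $final      = pack("B*", substr($block, 0, $outblocksz));

printf "replacements=%d outblocksz=%d bytes=%d\n",
    $nrrepl, $outblocksz, length($final);
```

With this made-up input there is exactly one replacement, so $outblocksz comes out as 3008, which is still a multiple of 8.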
There are a couple of implicit assumptions in the code that we might examine. Is the data you are working with byte-aligned and of even size? That is, does the data consist of 32-bit integers, or does it vary, say a 4-byte integer followed by a 7-byte string, etc.? Since you are packing with 'B*', you could be introducing additional bits at the byte (8-bit) boundary: pack pads the last byte with zero bits whenever the bit string's length is not a multiple of 8. If the data is evenly spaced, you could set BLOCKSZ to the size of your regex; that might keep everything aligned properly.
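A small sketch of the padding behaviour, using a made-up 3-character bit string: pack with 'B*' fills the rest of the last byte with zero bits, so the round trip through unpack comes back longer than the input.

```perl
use strict;
use warnings;

my $bits      = "101";                  # 3 bits, not a multiple of 8
my $packed    = pack("B*", $bits);      # padded out to one whole byte
my $roundtrip = unpack("B*", $packed);  # "10100000"

print length($packed), " byte, bits after round trip: $roundtrip\n";

# A cheap guard before packing (what to do on failure is up to you):
warn "bit string length is not a multiple of 8\n" if length($bits) % 8;
```

Those five extra zero bits are the sort of thing that would show up as stray data at a block boundary.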
Another possibility is that when you change a sequence across the boundary between blocks 01 and 02, the substitution itself creates a new matching sequence in block02. Your sequence is rather long and involved, so I assumed that wouldn't happen, but I guess you should consider it as a fringe case.
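To illustrate that fringe case on a toy scale, here is a sketch with a made-up short pattern (marker 'M', one skipped character, two kept characters) standing in for your long one. A single s///g pass does not rescan the text it has just written, so a match formed by the replacement survives the pass.

```perl
use strict;
use warnings;

# Toy stand-in for the real pattern: marker, one skipped char, two kept chars.
my $re  = qr/M.(..)/;
my $str = "M0M0M0ab";    # the kept payload itself contains a marker

my $pass1 = $str =~ s/$re/$1/g;    # 2 substitutions, leaving "M0ab"
print "after pass 1: $str ($pass1 substitutions)\n";

# The result still matches, so a single /g pass was not enough here.
my $still_matches = $str =~ $re ? 1 : 0;
print "still matches: $still_matches\n";

# One way to handle it: repeat until nothing matches, summing the counts.
my $total = $pass1;
while (my $n = $str =~ s/$re/$1/g) {
    $total += $n;
}
print "final: $str ($total substitutions in total)\n";
```

If you do loop like this, the running total is what you would feed into the $outblocksz calculation.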
PJ
use strict; use warnings; use diagnostics;