Hello haukex,
Sorry for the late reply, I wasn't able to work on this for a few days.
I've followed your advice and opened the file in raw mode. Before, I was using Path::Tiny (use Path::Tiny;) and then path('<file.pdf>')->slurp_raw; to read the file in raw mode. I assume that gives the same behaviour?
Anyway, I came up with the following test program:
use v5.010;
use strict;
use warnings;
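# Read the whole file in raw (binary) mode.
local $/;    # undef the input record separator so <$fh> slurps everything
open my $fh, "<:raw", "input.pdf" or die "Cannot open input.pdf: $!";
my $raw_content = <$fh>;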
my $nr_of_cos_objects = 10;
my @counter = (1..$nr_of_cos_objects);
my $position = 0;
for my $number (@counter) {
    my $result = $raw_content=~qr/^${number} 0 obj/aa;
    if ($result) {
        say "Object item [$number] ('${number} 0 obj') starts at position [$-[0]]";
    } else {
        say "Object item [$number] ('${number} 0 obj') start position not found";
    }
    if ($result) {
        say "Object item [$number] ('${number} 0 obj') ends at position [$+[0]]\n";
    } else {
        say "Object item [$number] ('${number} 0 obj') end position not found\n";
    }
}
say "End test program. Bye...";
Result: nothing is found at all when using /aa (or /a). This is the output:
Object item [1] ('1 0 obj') start position not found
Object item [1] ('1 0 obj') end position not found
Object item [2] ('2 0 obj') start position not found
Object item [2] ('2 0 obj') end position not found
Object item [3] ('3 0 obj') start position not found
Object item [3] ('3 0 obj') end position not found
Object item [4] ('4 0 obj') start position not found
Object item [4] ('4 0 obj') end position not found
Object item [5] ('5 0 obj') start position not found
Object item [5] ('5 0 obj') end position not found
Object item [6] ('6 0 obj') start position not found
Object item [6] ('6 0 obj') end position not found
Object item [7] ('7 0 obj') start position not found
Object item [7] ('7 0 obj') end position not found
Object item [8] ('8 0 obj') start position not found
Object item [8] ('8 0 obj') end position not found
Object item [9] ('9 0 obj') start position not found
Object item [9] ('9 0 obj') end position not found
Object item [10] ('10 0 obj') start position not found
Object item [10] ('10 0 obj') end position not found
The /m modifier is apparently indispensable in this regex, since without it ^ only matches at the very start of the string, so I also tried the /ma and /maa combinations. With those I do get results, but they are incorrect (the same results as in my very first attempts):
Object item [1] ('1 0 obj') starts at position [19]
Object item [1] ('1 0 obj') ends at position [26]
Object item [2] ('2 0 obj') starts at position [235]
Object item [2] ('2 0 obj') ends at position [242]
Object item [3] ('3 0 obj') starts at position [344]
Object item [3] ('3 0 obj') ends at position [351]
Object item [4] ('4 0 obj') starts at position [667]
Object item [4] ('4 0 obj') ends at position [674]
Object item [5] ('5 0 obj') starts at position [2663]
Object item [5] ('5 0 obj') ends at position [2670]
Object item [6] ('6 0 obj') starts at position [3139]
Object item [6] ('6 0 obj') ends at position [3146]
Object item [7] ('7 0 obj') starts at position [3514]
Object item [7] ('7 0 obj') ends at position [3521]
Object item [8] ('8 0 obj') starts at position [3839]
Object item [8] ('8 0 obj') ends at position [3846]
Object item [9] ('9 0 obj') starts at position [5063]
Object item [9] ('9 0 obj') ends at position [5070]
Object item [10] ('10 0 obj') starts at position [5501]
Object item [10] ('10 0 obj') ends at position [5509]
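To make the difference between the two runs above easier to see, here is a minimal sketch using a small made-up string instead of the real PDF. The names $data and first_offset are hypothetical and only there for illustration. It shows that ^ without /m can only match at offset 0, while with /m it also matches right after each "\n":

```perl
use strict;
use warnings;

# Hypothetical stand-in for the PDF bytes; this is not the real input.pdf.
my $data = "%PDF-1.4\n1 0 obj\n<< >>\nendobj\n2 0 obj\n<< >>\nendobj\n";

# Return the offset of the first match of $re in $bytes, or undef if none.
sub first_offset {
    my ($bytes, $re) = @_;
    return $bytes =~ $re ? $-[0] : undef;
}

for my $number (1 .. 2) {
    # Without /m, ^ anchors only at the start of the whole string.
    my $plain = first_offset($data, qr/^$number 0 obj/a);
    # With /m, ^ also matches immediately after every "\n".
    my $multi = first_offset($data, qr/^$number 0 obj/ma);
    printf "object %d: without /m = %s, with /m = %s\n",
        $number, $plain // 'no match', $multi // 'no match';
}
```

On this sample string it should print "no match" for both objects without /m, and offsets 9 and 30 with /m. The offsets in @- count from 0 and are positions within the string that was matched, so for a file read with :raw they should correspond to byte offsets.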