james28909,

Well, this is a completely different spec from the one given previously (as I understood it, anyway)! If this is really all you need, it’s as simple as:

#! perl
use strict;
use warnings;

while (<DATA>)
{
    if (/^([0-9a-fA-F]{2})\1/)
    {
        print "Found 4 repeating characters: $1$1\n";
    }
    elsif (/^([0-9a-fA-F]{4})\1/)
    {
        print "Found 8 repeating characters: $1$1\n";
    }
    elsif (/^([0-9a-fA-F]{8})\1/)
    {
        print "Found 16 repeating characters: $1$1\n";
    }
    else
    {
        print "Found 0 repeating characters\n";
    }
}

__DATA__
1234FBABCBED062405E56F853AAE238C4428FBABCBED0624
0A0AFBABCBED062405E56F853AAE238C4428FBABCBED0624
0A1B0A1BCBED062405E56F853AAE238C4428FBABCBED0624
0A1B2C3D0A1B2C3DCBED062405E56F853AAE238C4428FBAB
01230A0AFBABCBED062405E56F853AAE238C4428FBABCBED

Output:

13:12 >perl 914_SoPW.pl
Found 0 repeating characters
Found 4 repeating characters: 0A0A
Found 8 repeating characters: 0A1B0A1B
Found 16 repeating characters: 0A1B2C3D0A1B2C3D
Found 0 repeating characters

13:12 >

(Note that the final string tested here contains the repeated characters 0A0A, but these are not at the beginning of the string.)
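To illustrate the effect of the ^ anchor, here is a small sketch that runs the 2-character pattern against that final string, once anchored and once unanchored. The unanchored variant is shown only for comparison and is not part of the script above; note that without the anchor a repeat could also be found at an odd offset, i.e. straddling byte boundaries.

```perl
use strict;
use warnings;

my $string = '01230A0AFBABCBED062405E56F853AAE238C4428FBABCBED';

# Anchored: the repeated pair must begin at the first character.
my $anchored   = $string =~ /^([0-9a-fA-F]{2})\1/ ? "match ($1$1)" : 'no match';

# Unanchored (comparison only): the repeated pair may begin anywhere.
my $unanchored = $string =~ /([0-9a-fA-F]{2})\1/  ? "match ($1$1)" : 'no match';

print "anchored:   $anchored\n";
print "unanchored: $unanchored\n";
```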

Two obvious questions:

  1. Why shouldn’t a legitimate (i.e., non-corrupt) file begin with repeated characters?
  2. If a file is “corrupted,” will this always manifest as repeated characters at the start of the file? If not, how will you test for other forms of file corruption?

I’ve got a sneaking suspicion that this thread is dealing with an XY Problem. If the answers don’t solve your real problem, you will need to explain the nature of the files and the process(es) by which the corruption may occur.

Update: More compact version:

while (my $string = <DATA>)
{
    for my $chars (2, 4, 8)
    {
        printf "Found %2d repeating characters: %s\n", $chars * 2, $1 . $1
            if $string =~ /^([0-9a-fA-F]{$chars})\1/;
    }
}

(In the actual script, the printf would be replaced by a die statement.)
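A minimal sketch of what that replacement might look like, assuming the check is wrapped in a helper sub. The name check_header, the sample strings, and the use of eval to trap the exception are illustrative assumptions, not taken from your script. It dies on the first repeat length that matches:

```perl
use strict;
use warnings;

# Hypothetical helper: dies if $string begins with a chunk of 2, 4, or 8
# hex characters that is immediately repeated; returns 1 otherwise.
sub check_header
{
    my ($string) = @_;

    for my $chars (2, 4, 8)
    {
        die sprintf("Found %2d repeating characters: %s\n", $chars * 2, $1 . $1)
            if $string =~ /^([0-9a-fA-F]{$chars})\1/;
    }

    return 1;    # no repeat found at the start of the string
}

# Usage: trap the exception with eval so that every sample gets checked.
for my $sample ('0A0AFBABCBED0624', '1234FBABCBED0624')
{
    my $ok = eval { check_header($sample) };
    print $ok ? "OK: $sample\n" : "Rejected: $@";
}
```

Without the eval, the first offending string would simply terminate the script with that message, which is presumably what you want when a corrupt file should stop processing.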

Hope that helps,

Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,