in reply to matching characters and numbers with regex
Because instead of searching in hexadecimal bytes, it's searching characters in a string. In hexadecimal, 0a is 1 character (one byte), but in a string 0a is 2 characters, correct?

$string =~/\w{4,8,16}/\/[0-9a-fA-F]/;
Re^2: matching characters and numbers with regex
by Athanasius (Archbishop) on May 31, 2014 at 07:45 UTC
$string =~/\w{4,8,16}/\/[0-9a-fA-F]/;

There are two errors here:

Now for the bigger picture. You can probably do what you want with regexes, but it quickly becomes complicated. Here is some code I came up with to identify repeated 4-character sequences:
Output:
What concerns me here is the alignment problem: you presumably do not want to match a non-aligned sequence like the following:
See, for example, the discussion of the \G anchor in the “Global matching” section of perlretut#Using-regular-expressions-in-Perl.

I’m not sure that regexes are the best tool for this job. I would look at converting your string into an array of integers, then building a hash of integer sequences (of the desired lengths) mapped to their number of occurrences in the original string.

Hope that helps,

Update (June 1): Corrected alignment example.
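As a rough illustration of the array-and-hash approach suggested above (a sketch only, not code from the original post): the hex string is packed into bytes, unpacked into integers, and every run of N consecutive bytes is counted in a hash. The names `repeated_sequences` and `$n_bytes` are hypothetical, and the sketch assumes the input is an even-length string of hex digits. Because each window starts on a byte boundary, a sequence that straddles two half-bytes is never counted, which sidesteps the alignment problem. Note that overlapping windows are counted separately.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count every run of $n_bytes
# consecutive bytes in a hex string and return the runs seen more than once.
# Assumes $hex holds an even number of hex digits.
sub repeated_sequences {
    my ($hex, $n_bytes) = @_;

    # Two hex digits make one byte; unpack 'C*' yields integers 0..255.
    my @bytes = unpack 'C*', pack 'H*', $hex;

    # Slide a window of $n_bytes over the array, one byte at a time.
    my %count;
    for my $i (0 .. @bytes - $n_bytes) {
        my $key = join ',', @bytes[$i .. $i + $n_bytes - 1];
        $count{$key}++;
    }

    # Keep only the sequences that occur at least twice.
    my %repeated;
    for my $key (keys %count) {
        $repeated{$key} = $count{$key} if $count{$key} > 1;
    }
    return \%repeated;
}

my $repeats = repeated_sequences('0A0AFBAB0A0A', 2);
printf "%s occurs %d times\n", $_, $repeats->{$_} for sort keys %$repeats;
# prints: 10,10 occurs 2 times
```

Calling it once per desired length (for example 2, 4 and 8 bytes) would cover sequences of 4, 8 and 16 hex characters.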
by james28909 (Deacon) on May 31, 2014 at 22:52 UTC

"4428FBABCBED062405E56F853AAE238C4428FBABCBED062405E56F853AAE238CCC9AA594B5B35063A28224E2FE347EE349E9FFEDB897E32725F42C0D9FA2400D56C78EC7E711F47AA032CB76E11996D4"

Then I want to make sure it doesn't have any repeating characters that are 4, 8, and 16 characters long. So if the above string was:

"0A0AFBABCBED062405E56F853AAE238C4428FBABCBED062405E56F853AAE238CCC9AA594B5B35063A28224E2FE347EE349E9FFEDB897E32725F42C0D9FA2400D56C78EC7E711F47AA032CB76E11996D4"

The difference between these two strings is the repeating characters 0A0A at the beginning of the second string. If it finds repeating characters, then it will terminate the program and not continue, because it's checking for corruption.
by Athanasius (Archbishop) on Jun 01, 2014 at 03:26 UTC
Well, this is a completely different spec from the one given previously (as I understood it, anyway)! If this is really all you need, it’s as simple as:
Output:
(Note that the final string tested here contains the repeated characters 0A0A, but these are not at the beginning of the string.) Two obvious questions:
I’ve got a sneaking suspicion that this thread is dealing with an XY Problem. If the answers don’t solve your real problem, you will need to explain the nature of the files and the process(es) by which the corruption may occur.

Update: More compact version:
(In the actual script, the printf would be replaced by a die statement.)

Hope that helps,
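For illustration, here is a minimal sketch of a start-of-string check. It is not the code originally posted in this reply, and it rests on one loudly stated assumption about the spec: "repeating characters 4, 8, or 16 characters long" is taken to mean a unit of 2, 4, or 8 hex digits immediately followed by a copy of itself at the very beginning of the string (so 0A0A is the unit 0A repeated once). The name `starts_with_repeat` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption about the spec, not the thread's code):
# returns the total length (4, 8 or 16) of a repeated unit found at the very
# start of the string, or 0 if the string does not begin with one.
sub starts_with_repeat {
    my ($hex) = @_;
    for my $unit (2, 4, 8) {
        # \A anchors at the start; \1 requires the captured unit to follow itself.
        return $unit * 2 if $hex =~ /\A([0-9a-f]{$unit})\1/i;
    }
    return 0;
}

for my $s ('4428FBABCBED0624', '0A0AFBABCBED0624', '44280A0ACBED0624') {
    my $len = starts_with_repeat($s);
    printf "%s: %s\n", $s, $len ? "repeat of $len characters at start" : 'ok';
}
# prints:
# 4428FBABCBED0624: ok
# 0A0AFBABCBED0624: repeat of 4 characters at start
# 44280A0ACBED0624: ok
```

The third test string contains 0A0A, but not at the beginning, so it passes. In a real script the printf could be replaced by a die when the returned length is non-zero.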