$string =~/\w{4,8,16}/\/[0-9a-fA-F]/;

There are two errors here:

  1. The quantifier syntax X{y,z} means: at least y and no more than z occurrences of X. You want to say: either exactly 4 occurrences, or exactly 8 occurrences, or exactly 16 occurrences; but you can’t do that with this quantifier. See “Quantifiers” in perlre#Regular-Expressions.
  2. The construct /.../\/.../ is a syntax error: the regex ends at the second /.
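
For the first point, one possible workaround is an alternation of exact counts. The following is only a sketch, assuming the goal is to recognise a run of exactly 16, 8, or 4 hex digits at the start of a string; the helper name hex_run_length and the sample strings are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: length of the hex run matched at the start of $s
# (16, 8, or 4), or 0 if there is no such run.
sub hex_run_length {
    my ($s) = @_;
    # The longest alternative comes first, so a 16-digit run is not
    # reported as an 8- or 4-digit one.
    return $s =~ /\A((?:[0-9a-fA-F]{16}|[0-9a-fA-F]{8}|[0-9a-fA-F]{4}))/
        ? length $1
        : 0;
}

printf "%-18s %d\n", $_, hex_run_length($_)
    for '0a0a', '0a0a0b0b', '0a0a0b0b0c0c0d0d', '0a0';
```

Note that only the start of the string is anchored here, so a 12-digit run would be reported as 8; add a trailing anchor or lookahead if that matters for your data.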

Now for the bigger picture.

You can probably do what you want with regexes, but it quickly becomes complicated. Here is some code I came up with to identify 4-character sequences made of a repeated pair of hex digits (such as 0a0a) and to count how often each one occurs in the string:

    #! perl
    use strict;
    use warnings;
    use List::MoreUtils 'uniq';

    my $string = '0a0a0a0a0b0b0a0a0c0c0c0c0c0c0c0c' .
                 '1f1f2b2b2b2b3e3e7b7b7b7b7b7b7b7b' .
                 '8f8f8f8f8f8f8f8f6c6c4b4b4b4b3f3f' .
                 '9d9d0f0f0f0f0f0f0f0f3a3a2e2e2e2e';

    my @seqs = $string =~ /(([0-9a-fA-F]{2})\2)/g;
       @seqs = uniq grep { length == 4 } @seqs;

    for my $seq (@seqs)
    {
        my $matches = () = $string =~ /$seq/g;
        printf "%s: %d\n", $seq, $matches;
    }

Output:

    17:30 >perl 914_SoPW.pl
    0a0a: 3
    0b0b: 1
    0c0c: 4
    1f1f: 1
    2b2b: 2
    3e3e: 1
    7b7b: 4
    8f8f: 4
    6c6c: 1
    4b4b: 2
    3f3f: 1
    9d9d: 1
    0f0f: 4
    3a3a: 1
    2e2e: 2

    17:30 >

What concerns me here is the alignment problem: you presumably do not want to match a non-aligned sequence like the following:

    0a0axxx0a0a0yyyy
    ^^^^   ^^^^

See, for example, the discussion of the \G anchor in the “Global matching” section of perlretut#Using-regular-expressions-in-Perl.
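
As a rough illustration of how \G can keep matches aligned, here is a sketch that walks the string in strict 4-character steps and counts only chunks that are a repeated pair. It assumes that "aligned" means starting at an offset that is a multiple of 4; the helper name aligned_counts is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: count repeated-pair sequences (e.g. "0a0a") that
# start on a 4-character boundary of $string.
sub aligned_counts {
    my ($string) = @_;
    my %count;
    # \G anchors each match where the previous one ended, so the scan
    # advances in 4-character steps and stops at the first chunk that
    # is not 4 hex digits.
    while ($string =~ /\G([0-9a-fA-F]{4})/gc) {
        my $chunk = $1;    # copy before the next match overwrites $1
        $count{$chunk}++ if $chunk =~ /\A(..)\1\z/;
    }
    return \%count;
}

my $counts = aligned_counts('0a0a0a0a0b0b0a0a0c0c');
printf "%s: %d\n", $_, $counts->{$_} for sort keys %$counts;
```

With this approach a string such as 000a0a00 yields no hits, because the 0a0a in the middle straddles two chunks. The scan also gives up at the first non-hex character, which may or may not be what you want.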

I’m not sure that regexes are the best tool for this job. I would look at converting your string into an array of integers, then building a hash of integer sequences (of the desired lengths) mapped to their number of occurrences in the original string.
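
A minimal sketch of that idea follows. It assumes that each pair of hex digits represents one byte, and that the sequences of interest are taken at non-overlapping offsets (0, $len, 2*$len, ...); both are guesses about your data, and the helper name sequence_counts is made up:

```perl
use strict;
use warnings;

# Hypothetical helper: count each run of $len consecutive byte values,
# taking windows at offsets 0, $len, 2*$len, ... (an assumption).
sub sequence_counts {
    my ($hex, $len) = @_;
    my @bytes = unpack 'C*', pack 'H*', $hex;    # '0a1f' -> (10, 31)
    my %count;
    for (my $i = 0; $i + $len <= @bytes; $i += $len) {
        my $key = join ',', @bytes[$i .. $i + $len - 1];
        $count{$key}++;
    }
    return \%count;
}

my $counts = sequence_counts('0a0a0a0a0b0b0a0a', 2);
printf "%-8s %d\n", $_, $counts->{$_} for sort keys %$counts;
```

Calling the helper again with a different $len (4 or 8 bytes, say) covers the other sequence lengths without changing any regex.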

Hope that helps,

Update (June 1): Corrected alignment example.

Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,


In reply to Re^2: matching characters and numbers with regex by Athanasius
in thread matching characters and numbers with regex by james28909
