Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need to write a regular expression that will find a string of length X at the beginning and a string of the same length at the end of a pattern (with similar circumstances in the middle). I know the length of the patterns at the ends but not in the middle, and I know how they relate to each other. I also know that there can be one miss (i.e. if there are 4 characters, only 3 have to follow the pattern). I was thinking of something like /(.{4})(\1 =~ tr/something/somethingelse/)/, but that doesn't work, or of running a subroutine on \1 or $1, like /(.{4})(trans (\1))/. Please help. Also, there can be one miss in each group and it still has to find it. Is it even possible to do this?

Replies are listed 'Best First'.
Re: subroutine within a regular expression and allowing for a miss
by moodster (Hermit) on May 06, 2002 at 08:15 UTC
    I'm not entirely sure what you are aiming at here. If you want to match a string that has the same 4-char substring X at both the beginning and the end, this should do the trick:
    /^(.{4}).*\1$/ # ^ and $ match beginning and end of string respectively
    However, it's not clear what you mean by saying you want to match "similar circumstances in the middle". If that means that the string is repeated somewhere in the middle you could use:
    /^(.{4}).*\1.*\1$/
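
A quick way to try the two patterns above, as a minimal sketch. The sub names and the sample strings are made up for illustration and are not from the original question.

```perl
use strict;
use warnings;

# True if the first four characters reappear at the very end of the string.
sub same_ends {
    my $s = shift;
    return $s =~ /^(.{4}).*\1$/ ? 1 : 0;
}

# True if the first four characters reappear in the middle and at the end.
sub same_ends_and_middle {
    my $s = shift;
    return $s =~ /^(.{4}).*\1.*\1$/ ? 1 : 0;
}

# Hypothetical sample strings
for my $s ( 'ABCDxxABCD', 'ABCDxxABCDyyABCD', 'ABCDxxABCE' ) {
    printf "%-18s ends=%d middle=%d\n", $s, same_ends($s), same_ends_and_middle($s);
}
```

Note that \1 only matches an exact copy of whatever the first group captured, so neither pattern tolerates a miss.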
    As for the rest, it would have been much easier if you had posted sample input data so we'd know what you meant by allowing for a miss. I'll assume that you have a string of four characters and want three or more of the characters to match, and that length and order matter. Like this:
    ABCD # pattern
    ABCD # match
    ACDB # no match
    ABED # match
    BBCD # match
    ABC  # no match
    The most obvious way to do it I could think of is to use a regexp like this: /(.BCD|A.CD|AB.D|ABC.)/. The following sub compiles this pattern for you:
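    sub makepattern {
        my $str = shift;
        my @res;
        for my $i ( 0 .. length( $str ) - 1 ) {
            # replace the character at position $i with a wildcard;
            # quotemeta keeps any regex metacharacters in $str literal
            push @res, quotemeta( substr( $str, 0, $i ) ) . "." . quotemeta( substr( $str, $i + 1 ) );
        }
        return join( "|", @res );
    }

    # example
    my $imprecisematch = "ABCD";
    my $pattern = makepattern( $imprecisematch );
    # match $_ against three occurrences of this pattern; a \1 backreference
    # would only re-match the exact text captured, so the pattern is repeated
    /^(?:$pattern).*(?:$pattern).*(?:$pattern)$/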
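
If the candidate substrings have already been extracted, the "one miss" test can also be done without an alternation, by counting the positions at which two strings differ. This is a different technique from the regexp above, shown only as a minimal sketch: the names mismatches and near_match are hypothetical, and it assumes plain ASCII strings of equal length.

```perl
use strict;
use warnings;

# Count the positions at which two equal-length strings differ.
# Assumes plain byte/ASCII strings; returns undef if the lengths differ.
sub mismatches {
    my ( $x, $y ) = @_;
    return undef if length($x) != length($y);
    my $diff = $x ^ $y;             # string XOR: "\0" wherever the characters agree
    return ( $diff =~ tr/\0//c );   # count the characters that are not "\0"
}

# True if $candidate matches $wanted with at most $max misses (default 1).
sub near_match {
    my ( $wanted, $candidate, $max ) = @_;
    $max = 1 unless defined $max;
    my $n = mismatches( $wanted, $candidate );
    return defined $n && $n <= $max;
}

# The sample table from above
for my $candidate (qw(ABCD ACDB ABED BBCD ABC)) {
    printf "%-4s # %s\n", $candidate,
        near_match( 'ABCD', $candidate ) ? 'match' : 'no match';
}
```

The same check could be applied to a substring after running tr/// on it, if the groups are related by a character translation rather than being identical.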

    Cheers,
    --Moodster

Re: subroutine within a regular expression and allowing for a miss
by Sidhekin (Priest) on May 06, 2002 at 09:32 UTC

    I am also quite confused by your problem description and would like some example data, but since I need to kill a little time, I will try another reading. This may or may not be what you want:

    /((.)(.)(.)(.)).*(\2\3\4.|\2\3.\5|\2.\4\5|.\3\4\5)/;
    # You want either $& or $1/$6, I guess:
    # print "($&)($1)($6)";
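
To see what that pattern captures, here is a minimal sketch that wraps it in a sub. The name find_pair and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Four characters, captured one by one, followed later by a group that
# repeats them with any single position replaced by a wildcard.
my $re = qr/((.)(.)(.)(.)).*(\2\3\4.|\2\3.\5|\2.\4\5|.\3\4\5)/;

# Returns the original group and its near-copy, or an empty list on no match.
sub find_pair {
    my $s = shift;
    return $s =~ $re ? ( $1, $6 ) : ();
}

# Hypothetical sample strings
for my $s ( 'ABCDxxxABED', 'ABCDxxxACDB' ) {
    my @pair = find_pair($s);
    print "$s: ", ( @pair ? "($pair[0])($pair[1])" : 'no match' ), "\n";
}
```

The pattern is not anchored, so it will also find a pair that starts somewhere inside the string; add ^ and $ if the groups must sit at the ends.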

    The Sidhekin
    print "Just another Perl ${\(trickster and hacker)},"