
A fly in the ointment was in the code for our $s (@stops). The code wouldn't work here with a 'my' declaration. 'our' was necessary.

This is a bug that was corrected in Perl version 5.18 IIRC. With this correction, lexical variables always work as expected in  "(?{ code })" and  "(??{ code })" regex constructs.
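For illustration, here is a minimal sketch of the dynamic-regex form using plain 'my' variables. It assumes Perl 5.18 or later. The sub name tag_dynamic and the exact shape of the pattern are my own guesses for the sake of the example, not the OP's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): insert $tag after the
# $s-th base for each $s in @stops, skipping over non-base characters.
sub tag_dynamic {
    my ($seq, $tag, @stops) = @_;
    for my $s (@stops) {
        my $q = $s - 1;    # plain lexical; assumes Perl 5.18 or later
        # (??{ ... }) returns a pattern string that is compiled and
        # matched at this point, so the quantifier can differ per pass.
        $seq =~ s/ ( (??{ "(?:[TAGC][^TAGC]*){$q}" }) [TAGC] ) /$1$tag/x;
    }
    return $seq;
}

print tag_dynamic('ATCGGATCTGGC', '___', 2, 6), "\n";
```

On older Perls, the same sketch would need package variables ('our') inside the code block, as in the original post.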

The dynamic regex form was necessary because the count of the quantifier changed for each iteration of the 'for' loop ($s-1).

I don't see the necessity here. Except for the fact that aliasing into the @stops array makes calculating the quantifier a bit tedious, it can all be written normally, given that an s/// match regex containing interpolated variables is, by default, re-interpolated (and recompiled if it has changed) on each s/// execution:

c:\@Work\Perl\monks>perl -wMstrict -le
"my @stops = (2,6);
 ;;
 my $tag = '___';
 ;;
 for ('ATCGGATCTGGC', 'A-C-G--CTGGC') {
     my $seq = $_;
     printf qq{'$seq' -> };
     ;;
     for our $s (@stops) {
         local our $q = $s - 1;
         $seq =~ s/ ((?:[TAGC][^TAGC]*){$q} [TAGC]) /$1$tag/x;
         }
     print qq{'$seq'};
     }
"
'ATCGGATCTGGC' -> 'AT___CGGA___TCTGGC'
'A-C-G--CTGGC' -> 'A-C___-G--CTG___GC'

And except for say, it works under Perl version 5.8.9. See also Re: Regex to match range of characters broken by dashes Update 2 for another for-loop example.
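For anyone who prefers a script file to a Windows one-liner, the same logic can be sketched as below. The sub name insert_tags is hypothetical. Because the quantifier is supplied by ordinary variable interpolation rather than by a regex code block, plain 'my' variables should be enough here.

```perl
use strict;
use warnings;

# Hypothetical helper (name is mine): same logic as the one-liner above,
# but with ordinary 'my' variables instead of 'our'/'local our'.
sub insert_tags {
    my ($seq, $tag, @stops) = @_;
    for my $s (@stops) {
        my $q = $s - 1;
        # $q is interpolated before the pattern is compiled, so the
        # quantifier is rebuilt whenever its value changes.
        $seq =~ s/ ((?:[TAGC][^TAGC]*){$q} [TAGC]) /$1$tag/x;
    }
    return $seq;
}

for my $in ('ATCGGATCTGGC', 'A-C-G--CTGGC') {
    printf "'%s' -> '%s'\n", $in, insert_tags($in, '___', 2, 6);
}
```

The tag itself contains no base characters, so a tag inserted on an earlier pass is skipped by [^TAGC]* and does not disturb the count on later passes.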

Update: I've based my code example on your original code, prior to adding the second  s/// fixup.


Give a man a fish:  <%-{-{-{-<

Re^3: Regex to match range of characters broken by dashes
by Cristoforo (Curate) on Jul 18, 2016 at 01:26 UTC
    Thanks for pointing out the parts of my solution that can be stated more cleanly.

    This is a bug that was corrected in Perl version 5.18 IIRC. With this correction, lexical variables always work as expected in "(?{ code })" and "(??{ code })" regex constructs.

    I wasn't aware of that bug. And your local our $q = $s - 1; fixes that.

    it can all be written normally, given that the s/// match regex is, by default, re-compiled on each s/// execution

    That is a nice solution! The (??{. . .}) construct wasn't necessary.

      ... local our $q = $s - 1; fixes [the lexical bug].

      It can even be fixed a bit more cleanly, and also fold in the added  s/// fixup at the end (still runs under 5.8.9):

      c:\@Work\Perl\monks\Q.and>perl -wMstrict -le
      "my @stops = (2,6);
       ;;
       my $tag = '___';
       ;;
       for (qw(ATCGGATCTGGC A-C-G--CTGGC)) {
           my $seq = $_;
           printf qq{'$seq' -> };
           $seq =~ s{ ((?: [TAGC] [^TAGC]*){$_} [TAGC]) [^TACG]* }{$1$tag}xms
               for map $_-1, @stops;
           print qq{'$seq'};
           }
      "
      'ATCGGATCTGGC' -> 'AT___CGGA___TCTGGC'
      'A-C-G--CTGGC' -> 'A-C___G--CTG___GC'
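As a script-file sketch of the same idea (the sub name insert_tags_trim is hypothetical), the trailing [^TAGC]* outside the capture group is what folds in the fixup: any non-base characters directly after the stop position are matched but not captured, so the tag replaces them.

```perl
use strict;
use warnings;

# Hypothetical helper (name is mine): insert $tag after each stop and
# drop any non-base characters that immediately follow the stop.
sub insert_tags_trim {
    my ($seq, $tag, @stops) = @_;
    # Each $n is a stop position minus one, interpolated as the quantifier.
    for my $n (map { $_ - 1 } @stops) {
        $seq =~ s{ ((?: [TAGC] [^TAGC]*){$n} [TAGC]) [^TAGC]* }{$1$tag}x;
    }
    return $seq;
}

for my $in (qw(ATCGGATCTGGC A-C-G--CTGGC)) {
    printf "'%s' -> '%s'\n", $in, insert_tags_trim($in, '___', 2, 6);
}
```

I have left off the /m and /s modifiers in this sketch, since the pattern uses neither ^, $ nor the dot.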


      Give a man a fish:  <%-{-{-{-<