in reply to Re^2: Pattern matching simultaneous substitution
in thread Pattern matching simultaneous substitution

I still don't understand the rules. If the input is
DDssDD
what should it become? Are the "s" prior to "D" or following a "D"?

In the new/old example, why are the final "s" not replaced? Don't they follow a "D"?

Please, try to be more precise.

Also, you can easily shorten the data: two consecutive characters of each type would do.

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Re^4: Pattern matching simultaneous substitution
by Anonymous Monk on Jan 05, 2022 at 21:04 UTC
    Ok, let me try again with some examples of how data look now (old) and should look (new):
    -case 1: only 's'
    old:sssss new:iiiii

    -case 2: only 's' and 'U'
    old:sssUUss new:iiiMMoo

    -case 3: only 's' and 'D'
    old:sssDDss new:oooMMii

    -case 4: 's', 'D' and 'U' (all possible characters)
    old:sssssDDDDDDDssssUUUss new:oooooMMMMMMMiiiiMMMoo

    OR
    old:sssssUUUUUUUssssDDDss new:iiiiiMMMMMMMooooMMMii

    For some reason, the code I wrote changes many but not all cases... Not that you cannot see s+D+s+D+s+ sequence, or s+U+s+U+s+. D and U alternate.
      > Not that you cannot see s+D+s+D+s+ sequence, or s+U+s+U+s+. D and U alternate.

      If you mean "Note" by "Not", then your sample input is invalid, as both the sequences contain sDsDs.

      Also, I think the following works correctly even for the non-alternating sequences in the way you originally showed:

      my %before = (U => 'I', D => 'O');
      my %after;
      @after{ keys %before } = reverse values %before;

      sub subst {
          local ($_) = @_;
          1 while s/(s+)(?=([DU]))/$before{$2} x length $1/e
                | s/(?<=([DU]))(s+)/$after{$1} x length $2/e;
          tr/DU/MM/ or tr/s/I/;
          return $_
      }

      It uses the bitwise or (|), which doesn't short-circuit, so both substitutions are tried on every pass for as long as there's anything to replace.
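
To illustrate the point about short-circuiting, here is a minimal sketch that is separate from the solution above. The helper name one_pass and the sample string are made up for the illustration; the only Perl behavior relied on is that || skips its right operand when the left one is true, while | evaluates both.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply two substitutions joined
# by the given operator exactly once and return the resulting string.
sub one_pass {
    my ($str, $op) = @_;
    local $_ = $str;
    my $changed;
    if ($op eq '||') {
        # Logical or: the second s/// is skipped when the first succeeds.
        $changed = s/a/A/ || s/b/B/;
    }
    else {
        # Bitwise or: both s/// are always evaluated.
        $changed = s/a/A/ | s/b/B/;
    }
    return $_;
}

print one_pass('ab', '||'), "\n";   # Ab
print one_pass('ab', '|'),  "\n";   # AB
```

With || the loop would need an extra pass for every run handled by the second substitution; with | each pass gives both a chance.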

      Tested against:

      Might need some tweaking if the specification changes.
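
A sketch of how the function could be checked, assuming the old/new pairs given earlier in the thread are the specification and that the lowercase i/o there stand for the I/O the code produces (hence the uc). The @cases list is an assumption built from those examples, not the original test data; Test::More ships with Perl.

```perl
use strict;
use warnings;
use Test::More;

my %before = (U => 'I', D => 'O');
my %after;
@after{ keys %before } = reverse values %before;

sub subst {
    local ($_) = @_;
    1 while s/(s+)(?=([DU]))/$before{$2} x length $1/e
          | s/(?<=([DU]))(s+)/$after{$1} x length $2/e;
    tr/DU/MM/ or tr/s/I/;
    return $_;
}

# old => new pairs copied from the examples earlier in the thread.
my @cases = (
    [ 'sssss'                 => 'iiiii' ],
    [ 'sssUUss'               => 'iiiMMoo' ],
    [ 'sssDDss'               => 'oooMMii' ],
    [ 'sssssDDDDDDDssssUUUss' => 'oooooMMMMMMMiiiiMMMoo' ],
    [ 'sssssUUUUUUUssssDDDss' => 'iiiiiMMMMMMMooooMMMii' ],
);

for my $case (@cases) {
    my ($old, $new) = @$case;
    # Assumption: the expected output uses uppercase I and O.
    is(subst($old), uc $new, $old);
}

done_testing;
```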

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Fixed it after all:
      while(<>) {
          if($_=~/^>/) {
              $id=$_;
              $seq=<>;
              print $id.$seq;
              if($seq=~/^s+$/) {
                  $seq=~s/s/I/g;
                  print $seq;
              }
              else {
                  while($seq=~/(s+)([U|D]+)(s+)/g) {
                      $part_before=$1;
                      $len_bef=length($part_before);
                      $part_TM=$2;
                      $part_after=$3;
                      $len_after=length($part_after);
                      if($part_TM=~/U/) {
                          $part_bef_new='I' x $len_bef;
                          $part_after_new = 'O' x $len_after;
                      }
                      elsif($part_TM=~/D/) {
                          $part_bef_new='O' x $len_bef;
                          $part_after_new = 'I' x $len_after;
                      }
                      $seq=~s/$part_before/$part_bef_new/;
                      $seq=~s/$part_after/$part_after_new/;
                  }
                  $seq=~s/U/M/g;
                  $seq=~s/D/M/g;
                  print $seq;
              }
          }
      }

      Thank you for your time!
        I don't think your code works. When I run it for input
        ssDDssUUss

        it returns

        OOMMIIMMss

        while you previously claimed the correct output is

        OOMMIIMMOO

        Similarly,

        ssUUssDDss

        is transformed into

        IIMMOOMMss

        while the correct output should be

        IIMMOOMMII
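
A possible explanation, sketched under the assumption that only the per-sequence logic matters: each pass of the while loop also rewrites the s run that follows the matched D/U block, so the next D/U block no longer has an s+ in front of it, the pattern stops matching, and the last run is never touched. The wrapper posted_subst below is hypothetical; it condenses the posted loop so that it works on one string instead of reading a file.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the logic of the posted loop.
sub posted_subst {
    my ($seq) = @_;
    if ($seq =~ /^s+$/) {
        $seq =~ s/s/I/g;
        return $seq;
    }
    while ($seq =~ /(s+)([U|D]+)(s+)/g) {
        my ($part_before, $part_TM, $part_after) = ($1, $2, $3);
        # Simplification of the posted if/elsif: no U is treated as D.
        my ($bef_char, $aft_char) = $part_TM =~ /U/ ? ('I', 'O') : ('O', 'I');
        my $part_bef_new   = $bef_char x length $part_before;
        my $part_after_new = $aft_char x length $part_after;
        # Each s/// replaces the first run of that length, wherever it is.
        $seq =~ s/$part_before/$part_bef_new/;
        $seq =~ s/$part_after/$part_after_new/;
    }
    $seq =~ s/U/M/g;
    $seq =~ s/D/M/g;
    return $seq;
}

print posted_subst('ssDDssUUss'), "\n";   # OOMMIIMMss, as reported above
print posted_subst('ssUUssDDss'), "\n";   # IIMMOOMMss, as reported above
```

After the first pass the string is OODDIIUUss, and the only s left is the trailing run, which has no D/U block after it to complete the three-part pattern.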

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]