in reply to substring selection from a string on certain qualifying conditions

What should the following return?
AXXAXXAXXXXXXXXXXXXXXAXXA

"AXXAXXA" and "AXXA", or just "AXXAXXA"?

Re^2: substring selection from a string on certain qualifying conditions
by BhariD (Sexton) on Dec 08, 2010 at 20:39 UTC

    If this is the input string: AXXAXXAXXXXXXXXXXXXXXAXXA

    Output should be:

    AXXAXXA and

    AXXA (AXXA from the end of the string)

      Any more unstated rules? :)

      C:\test>876075
      AGRTGAXWXX : [ AGRTGA ]
      ACRMGAHKMAHGTXX : [ ACRMGAHKMA, GAHKMAHGT ]
      AXXAXXAXXXXXXXXXXXXXXAXXA : [ AXXAXXA, AXXA ]

      #! perl -slw
      use strict;
      use Data::Dump qw[ pp ];

      sub maxMatches {
          my $s = shift;
          my @matches;
          my $vec = '';    # bit vector: string positions covered by accepted matches
          for my $o ( 0 .. length( $s ) - 10 ) {
              # Longest span (at most 10 chars) that starts and ends with
              # [ACGT], beginning at or after offset $o
              my( $match ) = $s =~ m[.{$o}([ACGT].{0,8}[ACGT])] or next;
              my $mask = '';
              vec( $mask , $_, 1 ) = 1 for $-[1] .. $+[1]-1;
              # Skip a match that covers no position not already covered
              next if ( $vec | $mask ) eq $vec;
              $vec |= $mask;
              push @matches, $match;
          }
          return @matches;
      }

      while( <DATA> ) {
          chomp;
          printf "$_ : [ %s ]\n", join ', ', maxMatches( $_ );
      }

      __DATA__
      AGRTGAXWXX
      ACRMGAHKMAHGTXX
      AXXAXXAXXXXXXXXXXXXXXAXXA

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Your addition of unrequested functionality (removing sequences that are found inside other sequences in the string) is what prompted my question. There was no unstated rule. Your implication that the OP had left rules unstated was uncalled for and wrong.