
Any more unstated rules? :)

C:\test>876075
AGRTGAXWXX : [ AGRTGA ]
ACRMGAHKMAHGTXX : [ ACRMGAHKMA, GAHKMAHGT ]
AXXAXXAXXXXXXXXXXXXXXAXXA : [ AXXAXXA, AXXA ]
#! perl -slw
use strict;
use Data::Dump qw[ pp ];

sub maxMatches {
    my $s = shift;
    my @matches;
    my $vec = '';
    for my $o ( 0 .. length( $s ) - 10 ) {
        my( $match ) = $s =~ m[.{$o}([ACGT].{0,8}[ACGT])] or next;
        my $mask = '';
        vec( $mask , $_, 1 ) = 1 for $-[1] .. $+[1]-1;
        next if ( $vec | $mask ) eq $vec;
        $vec |= $mask;
        push @matches, $match;
    }
    return @matches;
}

while( <DATA> ) {
    chomp;
    printf "$_ : [ %s ]\n", join ', ', maxMatches( $_ );
}

__DATA__
AGRTGAXWXX
ACRMGAHKMAHGTXX
AXXAXXAXXXXXXXXXXXXXXAXXA
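The filtering step in maxMatches can be read in isolation. The sketch below is one reading of it, not a replacement: the helper name span_is_covered and its half-open (start, end) arguments are hypothetical, introduced only for illustration. It builds a bit mask for a candidate match and asks whether OR-ing it into the "already covered" vector changes anything; if not, every position of the candidate lies inside earlier matches.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the posted code): true when every
# string position in [$start, $end) already has its bit set in $covered.
sub span_is_covered {
    my( $covered, $start, $end ) = @_;
    my $mask = '';
    vec( $mask, $_, 1 ) = 1 for $start .. $end - 1;
    # String-wise OR: if the mask adds no new bits, the result is unchanged.
    return ( $covered | $mask ) eq $covered;
}

# Mark positions 0..6 as taken, as a match such as AXXAXXA at offset 0 would.
my $covered = '';
vec( $covered, $_, 1 ) = 1 for 0 .. 6;

print span_is_covered( $covered, 3, 7 )   ? "covered\n" : "new\n";  # inside 0..6
print span_is_covered( $covered, 21, 25 ) ? "covered\n" : "new\n";  # outside 0..6
```

Under this reading, a match is skipped because of where it sits in the string, not because of its text, which is why the sample run above still lists AXXA once: the occurrence at the end of the third string covers positions no earlier match touched.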

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^4: substring selection from a string on certain qualifying conditions
by ikegami (Patriarch) on Dec 09, 2010 at 04:45 UTC

    Your addition of unrequested functionality — removal of sequences found in other sequences in the string — is what prompted my question. There was no unstated rule. Your implication that the OP did something was uncalled for and wrong.

      Update: The post above, in its entirety, originally read:

      Your addition of unrequested functionality is what prompted my question. There was no unstated rule.

      And was silently modified after the fact, without notification, in typically underhanded, duplicitous, and utterly dishonourable fashion. Presumably an attempt to try and save face.


      Sorry, but the OP's own code would remove all duplicate sequences found, regardless of where they were found.

      my %uniq = ();
      my $string = 'ACRMGAHKMAHGTXX';
      substr( $string, $_, 10 ) =~ m[([AGTC].{0,8}[AGTC])] and ++$uniq{ $1 }
          for 0 .. length( $string ) - 1;
      for my $key ( keys %uniq ) {
          print $key, "\n";
      }

      In the absence of any specific discussion, the OP's code is the spec. You opened that discussion, and I up-voted you for doing so, but there is no mention of that requirement in the OP's post, neither in the stated "conditions" nor in the worked examples.

      A requirement that is not discussed is "unstated".



        Sorry, but the OP's own code would remove all duplicate sequences found, regardless of where they were found.

        No, it doesn't. Output:

        AXXA
        AXXAXXA
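The reply does not say which input produced that output. The sketch below assumes it was the third test string from the top post, AXXAXXAXXXXXXXXXXXXXXAXXA. It wraps the hash-based snippet quoted earlier in a sub; the name uniq_windows is hypothetical, and the keys are sorted only to make the output order repeatable. Keying a hash on the matched text collapses identical strings wherever they occur, but a shorter sequence that lies inside a longer one survives because its text differs.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the hash-based snippet from the thread.
sub uniq_windows {
    my( $string ) = @_;
    my %uniq;
    for my $o ( 0 .. length( $string ) - 1 ) {
        # Within the 10-character window starting at offset $o, capture a
        # run that begins and ends with one of A, C, G, T.
        if( substr( $string, $o, 10 ) =~ m[([AGTC].{0,8}[AGTC])] ) {
            ++$uniq{ $1 };
        }
    }
    return sort keys %uniq;    # hash order is arbitrary, so sort for display
}

# Assumed input: the third test string from the top post.
print "$_\n" for uniq_windows( 'AXXAXXAXXXXXXXXXXXXXXAXXA' );
```

On that assumed input the sub prints AXXA and AXXAXXA, so the hash removes repeated text but not a sequence contained in another.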