
There's something I don't understand here. The "uniquifying" action of the %seen hash prevents un-split, whole, original proteins (like AAAAAA in the example below) from passing through the dataflow "pipe" into the output, but it also prevents duplicated split pieces (e.g., the second AAAAK and AAAA below) from being output: each distinct piece is printed only once, however many times it occurs in the input. Is this bioinformatically useful?

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my @proteins = qw(AAAAKAAAA AAAAKAAAA AAAAAA);
     ;;
     my %seen = map {$_ => 1} @proteins;
     ;;
     print qq{Peptide '$_'}
       for grep !$seen{$_}++, map {split /[KR]\K(?!P)/} @proteins;
     ;;
     dd \%seen;
    "
    Peptide 'AAAAK'
    Peptide 'AAAA'
    { AAAA => 2, AAAAAA => 2, AAAAK => 2, AAAAKAAAA => 1 }
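
For contrast, here is a minimal sketch of what the pipeline might look like if the repeated pieces should be kept and only the un-split originals filtered out. The sub name digest_keep_duplicates and the %is_whole hash are my own inventions for illustration, not anything from the thread. The lookup hash is only read, never incremented, so it no longer acts as a "seen" counter.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split each protein after K or R
# unless the next residue is P, drop any sequence identical to an original
# input protein, and keep repeated pieces.
sub digest_keep_duplicates {
    my @proteins = @_;
    my %is_whole = map { $_ => 1 } @proteins;   # original, un-split sequences
    return grep { !$is_whole{$_} }              # read-only lookup, no ++
           map  { split /[KR]\K(?!P)/ } @proteins;
}

print "Peptide '$_'\n"
    for digest_keep_duplicates(qw(AAAAKAAAA AAAAKAAAA AAAAAA));
```

With the same three input proteins this should print AAAAK and AAAA twice each and still suppress AAAAAA. One caveat: a piece that happens to be identical to some whole input protein would also be dropped, so whether this is the right filter depends on what the OP actually wants.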


Give a man a fish:  <%-{-{-{-<