The posting appears to have been updated considerably... and the specification has shifted. Plus ça change :-(

OK. I give up. I really hope this is something useful and not just homework. The program below gives:

  >s1:  A  G  C  T  T  T  T  C  G  G  G  C  A  A  T
       23 43 45 65 76 54  3 34 54 65  7 45 56 87 56
    ->  A  G  C  T  T  T  C  G  G  C  A  A  T -- dropped 6 10
       23 43 45 65 76 54 34 54 65 45 56 87 56
  >s2:  G  C  T  G  C  C  C  C  C  C  A  T  C  T  T
       23 43 23 45 65 45 76 78 34  8 12 32 65 23 25
    ->  G  C  T  G  C  C  C  C  C  A  T  C  T  T -- dropped 9
       23 43 23 45 65 45 76 78 34 12 32 65 23 25
  >s3:  T  C  G  T  A  G  C  T  G  A  A  A  A  T  C
       12 23 34 45 56 54 43 32 65 43 12 34 75 76 45
    ->  T  C  G  T  A  G  C  T  G  A  A  A  A  T  C -- all kept
       12 23 34 45 56 54 43 32 65 43 12 34 75 76 45

#!/usr/bin/perl use strict; use warnings; my $seq_txt = <<FILE1; >s1 AGCTTTTCGGGCAAT >s2 GCTGCCCCCCATCTT >s3 TCGTAGCTGAAAATC FILE1 my $num_txt = <<FILE2; >s1 23 43 45 65 76 54 3 34 54 65 7 45 56 87 56 >s2 23 43 23 45 65 45 76 78 34 8 12 32 65 23 25 >s3 12 23 34 45 56 54 43 32 65 43 12 34 75 76 45 FILE2 open(my $FH1, '<', \$seq_txt) or die "$!" ; open(my $FH2, '<', \$num_txt) or die "$!" ; while (!eof($FH1)) { # Fetch sequence numbers and make sure we are in sync # # For tidyness, trim all leading and trailing whitespace -- which al +so takes care of # the newline. # # Skip blank lines in both files -- mostly so we're not foxed by tra +iling blankness. my $sid_1 = <$FH1> ; $sid_1 =~ s/^\s*// ; $sid_1 =~ s/\s*$// ; next if ($sid_1 eq '') ; # Comfortable with blank lines in file 1 my $sid_2 = '' ; while ($sid_2 eq '') { # Allow blank lines in file 2 also defined($sid_2 = <$FH2>) or die "file 2 -- hit eof expecting '$si +d_1'" ; $sid_2 =~ s/^\s*// ; $sid_2 =~ s/\s*$// ; } ; if ($sid_1 ne $sid_2) { die "File 1 sequence '$sid_1' ne file 2 '$sid_2'" ; } ; # Fetch the sequence -- discard all whitespace and check validity (m +ay not be empty) defined(my $seq = <$FH1>) or die "file 1 -- unexpected eof" ; $seq =~ s/\s+//g ; $seq =~ m/^[ACGT]+$/ or die "file 1 -- invalid sequence" ; my @seq = split(//, $seq) ; # Fetch the scores -- check valid form and length defined(my $num = <$FH2>) or die "file 2 -- unexpected eof" ; $num =~ s/^\s*// ; $num =~ s/\s*$// ; $num =~ m/^(?:\d+(?:\s+|\z))*$/ or die "file 2 -- invalid scores" ; my @num = split(/\s+/, $num) ; if (@num != @seq) { die "sequence $sid_1: sequence and scores don't +match" ; } ; # We pass everything except letters (and the related score) where th +e letter is: # # a. last of repeated sequence of 3 or more. # # b. its score is < 10 my @exam = map { ($num[$_] > 10) # must be OK if scor +e > 10 || ($_ < 2) # too early for repe +at of 3 or more ! || ($seq[$_] ne $seq[$_-1]) # not a repeat || ($seq[$_] ne $seq[$_-2]) # not a long enough +repeat || ( ($_ != $#seq) && ($seq[$_] eq $seq[$_+1]) ) # not the end of a r +epeat } (0..$#seq) ; my @keep = grep { $exam[$_] } (0..$#seq) ; my @drop = grep { !$exam[$_] } (0..$#seq) ; # Crunch printf "%6s%s\n", "$sid_1:", show_seq(@seq) ; printf "%6s%s\n", "" , show_num(@num) ; printf "%6s%s", "->", show_seq(@seq[@keep]) ; if (@drop) { print " -- dropped @drop\n" ; } else { print " -- all kept\n" ; } ; printf "%6s%s\n", "", show_num(@num[@keep]) ; $seq = join(' ', @seq[@keep]) ; $num = join(' ', @num[@keep]) ; # Now do whatever's required with the results .... } ; # Make sure that the two files have finished together. while (<$FH2>) { m/^\s*$/ or die "File 2 contains more than file 1" ; } ; #_____________________________________________________________________ +_ sub show_seq { return ' '.join(' ', @_) ; } ; sub show_num { return join('', map sprintf('%3d', $_), @_) ; } ;

In reply to Re: deleting a particular character and its coresponding score in 2 different files!!! by gone2015
in thread deleting a particular character and its coresponding score in 2 different files!!! by heidi

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.