Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi! I have written a few lines of code that basically wrap two sequences and their mismatches so they print 60 chars per line. However, this only works if the two sequences are of identical length. How can I alter my code to wrap two sequences of different lengths and just ignore the difference between the two? Thanks ;-)
CATGACTTCTAATCGC.....
**** ****** **
CCATTCTGACTGACTT

ACTG
* *
TCAG

my @strings = ($genome1, $mismatches, $genome2);
for (my $x = 0; $x < length ($strings [0]); $x+= 60) {
    join, map (substr ($strings[$_], $x, 60), 0..2);
}

Title edit by tye (please don't use one-word titles)

Re: Wrapping Strings
by tadman (Prior) on Jan 17, 2003 at 10:37 UTC
    You'll note that you're only testing against the length of your 0th string, and not the longest one. What you need to do is figure out which is the longest, and test against that.
    my $line_length = 60;
    my @strings = ($genome1, $mismatches, $genome2);

    # Determine the longest string of the group
    my $length = (reverse sort { $a <=> $b } map { length } @strings)[0];

    # Then iterate through
    for (my $x = 0; $x < $length; $x += $line_length) {
        print substr ($strings[$_], $x, $line_length),$/ for (0..2);
    }
    It's not clear what your join/map combination was doing.
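The "find the longest string, then step through in fixed-width chunks" idea above can also be written as a small reusable routine. What follows is only a sketch: `wrap_rows` is a made-up name, the sample strings are invented, and it swaps the sort for `max` from the core List::Util module. It also guards the `substr` call, since asking for a slice that starts past the end of a shorter string returns undef and triggers a "substr outside of string" warning when warnings are on. It assumes at least one row is passed in.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: returns the wrapped output as a list of lines
# instead of printing, so the result is easy to inspect.
sub wrap_rows {
    my ($width, @rows) = @_;
    my $longest = max(map { length } @rows);    # length of the longest row
    my @out;
    for (my $pos = 0; $pos < $longest; $pos += $width) {
        for my $row (@rows) {
            # Only slice when this row still has characters left;
            # rows that have run out contribute an empty line.
            push @out, $pos < length($row) ? substr($row, $pos, $width) : '';
        }
    }
    return @out;
}

# Invented sample data; the second and third rows are deliberately shorter
my @lines = wrap_rows(8, 'CATGACTTCTAATCGCACTG', '**** ******', 'CCATTCTGACTG');
print "$_\n" for @lines;
```

Returning the lines rather than printing them is just a design choice for the sketch; printing directly inside the loop works equally well.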
      thanks for the help! I am writing this to do a cgi and am having problems embedding the html into your code.
      print STDOUT join ("font face=\"courier\"><p>" substr ($strings[$_], $x, $line_length), $/ for (0..2)) "</font><p>";
      Can you see where my syntax is bad?? thanks again
        hi

        If you are doing cgi then I suggest that you check out Ovid's "Web Programming Using Perl" Course which is a pretty good intro to CGI programming.

        Might I also point you to Bioperl which is a bioinformatics perl site where lots of common sequence manipulations etc. have already been solved.

        A.A.

        You might not have noticed, but the print command can take a list of things to print, which means that join() is really not required. The syntax error is that after your first string where you establish the font, you don't use the period to concatenate, or a comma to make a proper list.
        foreach (0..2) {
            print '<font face="courier"><p>',
                  substr($strings[$_], $x, $line_length),
                  '</p></font>', $/;
        }
        I moved the for-loop into a more obvious location; before, it was getting lost in the print statement. I've also taken the liberty of opening your <FONT> tag properly, and of using single quotes to avoid having to escape the quotation marks in your HTML.

        As another note, print sends to STDOUT by default, so there's no real need to specify that.
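To make the comma-versus-period point concrete, here is a minimal sketch showing both forms side by side. The sample strings, the values of `$x` and `$line_length`, and the helper name `html_row` are all invented for illustration; the markup is the same `<font>`/`<p>` wrapper used above.

```perl
use strict;
use warnings;

# Invented sample data standing in for the real sequences
my @strings = ('CATGACTTCTAATCGC', '**** ******', 'CCATTCTGACTGACTT');
my ($x, $line_length) = (0, 8);

# Hypothetical helper: glue the pieces into ONE string with the period
sub html_row {
    my ($text) = @_;
    return '<font face="courier"><p>' . $text . '</p></font>' . "\n";
}

for my $i (0 .. $#strings) {
    my $piece = substr($strings[$i], $x, $line_length);

    # Form 1: hand print a comma-separated LIST of pieces
    print '<font face="courier"><p>', $piece, '</p></font>', "\n";

    # Form 2: print a single concatenated string
    print html_row($piece);
}
```

Both forms send the same text to STDOUT. What does not work is placing two expressions next to each other with neither a comma nor a period between them.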
Re: wrapping
by demerphq (Chancellor) on Jan 17, 2003 at 15:38 UTC
    Your approach is pretty C like. :-)

    Here are two routines to do what you want. The first is task specific (it'll only work on three lines). The second is generic and will work on any number of lines. They both work on the same principle: chop at most $max characters off the front of each string (actually a copy, so the originals don't get toasted) and print them.

    I got a touch carried away :-) and wrote the header printer as well. Also, my Perl is fairly idiomatic; I hope it's not too confusing. If it is, then ask and I'll explain in detail.

    Thanks by the way, this was fun.

    use strict;
    use warnings;

    sub print_header {
        my ( $max, $indent ) = @_;
        $indent = " " x $indent;
        my $maxlen = length $max;

        # Create a header so they can see what column things are on
        my @top;
        foreach my $num ( 1 .. $max ) {
            my $r = 0;
            foreach my $char ( split //, sprintf( "% ${maxlen}s", $num ) ) {
                $top[ $r++ ] .= $char;
            }
        }

        # print the header
        print $indent . join( "\n" . $indent, @top ), "\n";

        # print a divider line
        print $indent . ( "-" x $max ) . "\n";
    }

    sub split_lines {
        my ( $max, $x, $y, $z ) = @_;
        # $max is the maximum length of a line
        # $x,$y,$z are the lines to be split and displayed
        print_header( $max, 2 );    # two because of the "> " from below
        while ( length($x) || length($y) || length($z) ) {
            # while any of our strings have content left within
            # cut out the first $max chars and print them
            # with a blank line separating the groups
            s/^(.{0,$max})//s and print "> $1\n" for $x, $y, $z;
            print "\n";
        }
    }

    sub generic {
        my $max   = shift;
        my @lines = @_;
        # width of the largest line number (number of digits in the count)
        my $indent = length scalar @lines;
        print_header( $max, $indent + 2 );    # +2 because of the "> " below
        PRINT: {
            # while any of our strings have content left within
            # cut out the first $max chars and print them
            # with a blank line separating the groups
            $lines[$_] =~ s/^(.{0,$max})//s
                and printf "% ${indent}s> %s\n", $_ + 1, $1
                for 0 .. $#lines;
            print "\n";
            length($_) and redo PRINT for @lines;
        }
    }

    my $x = "CATGACTTCTAATCGCACTG";
    my $y = " **** ****** *** *";
    my $z = "CCATTCTGACTGACTTTCAG";
    print "The pretty task specific version:\n";
    split_lines( 15, $x, $y, $z );
    print "The generic version:\n";
    generic( 15, $x, $y, $z, $x, $y, $z, $x, $y, $z, $x, $y, $z );
    __END__
    The pretty task specific version:
               111111
      123456789012345
      ---------------
    > CATGACTTCTAATCG
    > **** ****** *
    > CCATTCTGACTGACT

    > CACTG
    > ** *
    > TTCAG

    The generic version:
                 111111
        123456789012345
        ---------------
     1> CATGACTTCTAATCG
     2> **** ****** *
     3> CCATTCTGACTGACT
     4> CATGACTTCTAATCG
     5> **** ****** *
     6> CCATTCTGACTGACT
     7> CATGACTTCTAATCG
     8> **** ****** *
     9> CCATTCTGACTGACT
    10> CATGACTTCTAATCG
    11> **** ****** *
    12> CCATTCTGACTGACT

     1> CACTG
     2> ** *
     3> TTCAG
     4> CACTG
     5> ** *
     6> TTCAG
     7> CACTG
     8> ** *
     9> TTCAG
    10> CACTG
    11> ** *
    12> TTCAG
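For anyone who finds the substitution-based loops dense, the underlying idea (repeatedly remove at most $max characters from the front of a copy until nothing is left) can be shown on a single string. This is only a sketch and not the routine above: `chop_front` is an invented name, and it uses the four-argument form of `substr` in place of the regex substitution. It assumes `$max` is a positive number, otherwise the loop would never finish.

```perl
use strict;
use warnings;

# Hypothetical helper: split one string into pieces of at most $max
# characters by repeatedly removing the front of a copy of it.
sub chop_front {
    my ($max, $copy) = @_;    # $copy is a copy, so the caller's string is untouched
    my @pieces;
    while (length $copy) {
        # Four-argument substr replaces the slice with '' in place
        # and returns the characters that were removed.
        push @pieces, substr($copy, 0, $max, '');
    }
    return @pieces;
}

my $seq = 'CATGACTTCTAATCGCACTG';
print "$_\n" for chop_front(15, $seq);    # two pieces: 15 characters, then 5
print "$seq\n";                           # the original string is unchanged
```

Running the same loop over several strings in lockstep, and stopping only once all of them are empty, gives the multi-line behaviour of the routines above.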
    HTH.

    --- demerphq
    my friends call me, usually because I'm late....

Re: wrapping
by scain (Curate) on Jan 17, 2003 at 16:11 UTC