in reply to Perl Multiline Regex

You need to keep track of previously viewed rows. Assuming the data is sorted by sequence id,

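my $last_seq;
while (<>) {
    # Fields 0, 2 and 3 hold the sequence id and the two positions.
    my ($seq, $p1, $p2) = (split)[0, 2, 3];

    # Normalize the range so it always runs low..high.
    ($p2, $p1) = ($p1, $p2) if $p2 < $p1;

    if (defined($last_seq) && $seq eq $last_seq) {
        # Same sequence as the previous row: append to the current line.
        print(",");
    } else {
        # New sequence: end the previous line (if any) and print the id.
        print("\n") if defined($last_seq);
        print("$seq ");
    }

    print("$p1..$p2");
    $last_seq = $seq;
}

print("\n") if defined($last_seq);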

Update: More like your code and less like your explanation:

sub extractseq {
    my ($seq, $ranges) = @_;

    # system returns 0 on success, hence "and die" rather than "or die".
    system(extractseq =>
        "-sequence=$seq.seq",
        "-auto",
        "-stdout",
        "-separate",
        "-reg=" . join(',', map { "$_->[0]..$_->[1]" } @$ranges),
    ) and die("system: $?/$!\n");
}

my $last_seq;
my @ranges;
while (<>) {
    my ($seq, $p1, $p2) = (split)[0, 2, 3];
    ($p2, $p1) = ($p1, $p2) if $p2 < $p1;

    # The sequence id changed: flush the ranges collected so far.
    if (defined($last_seq) && $seq ne $last_seq) {
        extractseq($last_seq, \@ranges);
        @ranges = ();
    }

    $last_seq = $seq;
    push @ranges, [ $p1, $p2 ];
}

# Flush the final group.
extractseq($last_seq, \@ranges) if defined($last_seq);

Update: Finally, if the input isn't sorted or if you prefer something simpler (at the cost of using more memory),

# Collect every range up front, keyed by sequence id.
my %ranges_by_seq;
while (<>) {
    my ($seq, $p1, $p2) = (split)[0, 2, 3];
    ($p2, $p1) = ($p1, $p2) if $p2 < $p1;
    push @{ $ranges_by_seq{$seq} }, [ $p1, $p2 ];
}

# One extractseq call per sequence (hash order, so not the input order).
for my $seq (keys(%ranges_by_seq)) {
    my $ranges = $ranges_by_seq{$seq};
    system(extractseq =>
        "-sequence=$seq.seq",
        "-auto",
        "-stdout",
        "-separate",
        "-reg=" . join(',', map { "$_->[0]..$_->[1]" } @$ranges),
    ) and die("system: $?/$!\n");
}

Re^2: Perl Multiline Regex
by joomanji (Acolyte) on May 29, 2009 at 18:24 UTC
    Dear Ikegami,

    Thank you very much for your effort! I really did not expect someone to reply in such a short time, and the solution actually worked! I just hit the F5 button and somebody had already replied to my question, and it was you! Before I could reply, you came up with another update!

    I will definitely study the example you gave me until I understand it thoroughly, and apply it to other scripts as well. I've already modified the script and applied it to other scripts! Thank you! But the script you gave me hung after the first input, giving me the error message "system: 0/Bad file descriptor". When I commented out the line or die("system: $?/$!\n"); it worked just fine!

    Would you mind explaining this a bit more?
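      Oops, that should be "and die" instead of "or die". system is unusual in its return value: it returns the command's exit status, so it is 0 (false) on success, and "or die" fired even though the command had succeeded. Fixed.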
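      To make the return-value convention concrete, here is a minimal sketch of how a system result can be interpreted. The helper name run_status is made up for this illustration and is not part of the scripts above, and $^X (the currently running perl) stands in for extractseq so the example can run anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and describe how it ended.
sub run_status {
    my @cmd = @_;

    # system returns the wait status (also left in $?); 0 means success.
    my $rv = system(@cmd);

    return "ok" if $rv == 0;

    # -1 means the command could not be started at all; $! says why.
    return "failed to start: $!" if $rv == -1;

    # The low 7 bits hold the signal number if the child was killed.
    return sprintf("killed by signal %d", $rv & 127) if $rv & 127;

    # Otherwise the exit code is in the high byte.
    return sprintf("exited with status %d", $rv >> 8);
}

print run_status($^X, '-e', 'exit 0'), "\n";   # prints "ok"
print run_status($^X, '-e', 'exit 3'), "\n";   # prints "exited with status 3"
```

      With "or die", the first call above would have died despite succeeding, and $! would have shown whatever stale error happened to be left over, which is where a message like "0/Bad file descriptor" comes from.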
        Cool! Thanks! It works perfectly now! Thank you!