Re: RegEx misbehaving?

You're reading the file twice in your loop: once in the loop condition and again just before the end of the loop body. Is this intentional? A few other suggestions: your match regex ^R.* can drop the .* because it achieves nothing. The pattern says "match 'R' at the beginning of the string, followed by 0 or more of anything", so it matches exactly the same lines as ^R does. In fact the test would be better suited to a simple index. You could also drop the indexing using $i and replace it with the push function, e.g.

while ( $line = <TR_INFILE> ) {
    if ( index( $line, 'R' ) == 0 ) {
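        ## /g removed: with the ^ anchor the pattern can match at most once
        $line =~ s/^Results://;

        print OUTFILE3 $line;

        ## a ' ' pattern is special: it skips leading whitespace and splits
        ## on runs of whitespace, see perldoc -f split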
        my @chunks = split ' ', $line;
        push @trstart, shift @chunks;
        push @trend,   shift @chunks;
        push @period,  shift @chunks;
        push @copy,    shift @chunks;
        push @consize, shift @chunks;
        push @matches, shift @chunks;
        push @indels,  shift @chunks;
        push @score,   shift @chunks;
        push @numa,    shift @chunks;
        push @numc,    shift @chunks;
        push @numg,    shift @chunks;
        push @numt,    shift @chunks;
        push @entropy, shift @chunks;
        $TRID++;
    } elsif ( index($line, 'S') == 0 ) {
        $line =~ s/^Sequence:\s*//;
        push @TR_Accession, $line;
        $SEQID++;
    }   
}

Some of that code massaging is style, but it should also be faster than your current code, since most of the fiddly work is now done by perl, and it saves a lot of hard-coding. See push, index and shift for more info on the functions used above.
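To make the two claims above concrete, here is a minimal sketch (the sample lines are invented for illustration and are not taken from your file). It checks that /^R.*/, /^R/ and index(...) == 0 agree on every line, and shows how split ' ' differs from splitting on a literal single space.

```perl
use strict;
use warnings;

# Hypothetical sample lines, invented for illustration only.
my @lines = (
    "Results: 10 25 3\n",
    "Sequence: ABC123\n",
    "  Results: indented\n",
    "\n",
);

for my $line (@lines) {
    my $with_star = $line =~ /^R.*/          ? 1 : 0;
    my $anchored  = $line =~ /^R/            ? 1 : 0;
    my $by_index  = index( $line, 'R' ) == 0 ? 1 : 0;

    # All three tests answer the same question: does the line start with 'R'?
    die "tests disagree on: $line"
        unless $with_star == $anchored && $anchored == $by_index;
}

# split ' ' skips leading whitespace and splits on runs of whitespace.
my @special = split ' ', "  10   25\t3\n";

# split / / splits on every single space, so leading empty fields appear.
my @literal = split / /, "  10 25";

print scalar(@special), " fields: @special\n";    # 3 fields: 10 25 3
print scalar(@literal), " fields\n";              # 4 fields
```

If the speed difference matters to you, the core Benchmark module is the way to measure it on your own data rather than taking my word for it.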
HTH

_________ broquaint


Replies are listed 'Best First'.
Re: Re: RegEx misbehaving? by Bilbo (Pilgrim) on Jul 18, 2003 at 10:45 UTC
Dropping the indexing using $i and replacing it with push wouldn't be exactly the same, because at the moment $i is incremented on every line of the file, rather than just on those which begin with 'R'. In the original version the ith element of each array contained information about the ith line of the file, whereas using push would mean that the ith element of each array contained information about the ith line beginning with 'R'.
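A small sketch of the difference, assuming a loop that increments $i once per input line (the original script isn't shown here, so the input lines and the names @by_index and @by_push are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical input, invented for illustration only.
my @lines = (
    "Sequence: s1\n",
    "Results: 10 25\n",
    "Parameters: x\n",
    "Results: 40 55\n",
);

my ( @by_index, @by_push );
my $i = 0;
for my $line (@lines) {
    if ( index( $line, 'R' ) == 0 ) {
        # First whitespace-separated field after the "Results:" prefix.
        my ($start) = split ' ', substr( $line, length('Results:') );

        $by_index[$i] = $start;    # slot follows the line number in the file
        push @by_push, $start;     # slot follows the count of 'R' lines so far
    }
    $i++;                          # incremented on every line, as in the original
}

print "by index: ", join( ', ', map { defined($_) ? $_ : 'undef' } @by_index ), "\n";
print "by push:  ", join( ', ', @by_push ), "\n";
# by index: undef, 10, undef, 40
# by push:  10, 40
```

Whether the gaps matter depends on how the arrays are used later: if nothing looks elements up by file line number, the compact push version is probably what you want.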
