Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, I wondered if someone could tell me why my very simple program takes so long to exit after it has printed the results to the screen. Many thanks.
#! /usr/bin/perl -w
use strict;

my $num_of_params;
my $dna1;
my $base1;

$num_of_params = @ARGV;
if ($num_of_params < 2) {
    die ("\n you havent entered enough parameters! \n");
}

open (INFILE, $ARGV[0]) or die "unable to open file";
my @lines = <INFILE>;
chomp @lines;
$dna1 = join ('', @lines);
$dna1 =~ s/\s//g;
$dna1 =~ s/^>genome//;
@lines = split ('', $dna1);

my $count_of_A = 0;
my $count_of_C = 0;
my $count_of_G = 0;
my $count_of_T = 0;
my $errors = 0;

foreach $base1 (@lines) {
    if (($base1 eq 'A') || ($base1 eq 'a')) {
        ++$count_of_A;
    }
    elsif (($base1 eq 'G') || ($base1 eq 'g')) {
        ++$count_of_G;
    }
    elsif (($base1 eq 'C') || ($base1 eq 'c')) {
        ++$count_of_C;
    }
    elsif (($base1 eq 'T') || ($base1 eq 't')) {
        ++$count_of_T;
    }
    else {
        print "There are errors in this DNA sequence. Please check that your sequence contains only A C T and G\n";
        ++$errors;
    }
}

print "\nThere are $count_of_A A's in this sequence\n\n";
print "There are $count_of_G G's in this sequence\n\n";
print "There are $count_of_T T's in this sequence\n\n";
print "There are $count_of_C C's in this sequence\n\n";
print "There are $errors errors in this sequence\n\n";
exit;

Replies are listed 'Best First'.
Re: programs taking ages to exit
by gjb (Vicar) on Sep 26, 2003 at 13:13 UTC

    I suppose the number of lines in the file and the number of characters per line are rather high? If so, you're building a huge in-memory structure in @lines (a very long list with one element per character), and it takes a while for that memory to be deallocated before the program can exit.

    Since this seems to be a very simple counting operation, you could consider reading one line at a time from the file and, if necessary, iterating over the individual characters in each line.

    Hope this helps, -gjb-
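
The line-at-a-time approach suggested above could be sketched roughly as follows. This is only an illustration, not gjb's own code: the helper name count_bases is hypothetical, and stripping the ">genome" prefix on every line is an assumption carried over from the original script. Only one line is held in memory at a time, so there is no huge list to tear down at exit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): tally bases from an open
# filehandle, reading one line at a time instead of slurping the file.
sub count_bases {
    my ($fh) = @_;
    my %count  = map { $_ => 0 } qw(A C G T);
    my $errors = 0;

    while (my $line = <$fh>) {
        $line =~ s/^>genome//;   # assumption: same header prefix as the original script
        $line =~ s/\s+//g;       # drop the newline and any other whitespace

        # Walk the characters of this one line only
        for my $base (split //, uc $line) {
            if (exists $count{$base}) { $count{$base}++ }
            else                      { $errors++ }
        }
    }
    return (\%count, $errors);
}

# Run against a file only when one is named on the command line
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "unable to open $ARGV[0]: $!";
    my ($count, $errors) = count_bases($fh);
    close $fh;

    print "$_: $count->{$_}\n" for qw(A C G T);
    print "Errors: $errors\n";
}
```

The per-line list built by split is small and is freed on each iteration, which is the point of the change. Whether this alone removes the exit delay depends on the size of the input.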

Re: programs taking ages to exit
by Not_a_Number (Prior) on Sep 26, 2003 at 18:16 UTC

    Try this, it should be faster. But I'm sure it could be improved further...

    open (INFILE, $ARGV[0]) or die "unable to open file: $!";
    # It's a good idea to check the error message in Perl's special variable $!

    # Initialise all five counters; a bare "= 0" would only set the first one
    my ($a_count, $c_count, $g_count, $t_count, $err_count) = (0) x 5;

    while ( <INFILE> ) {
        s/^>genome//;
        s/\s//g;
        # tr takes a plain character list, not a regex class, so no brackets
        $a_count   += tr/Aa//;
        $c_count   += tr/Cc//;
        $g_count   += tr/Gg//;
        $t_count   += tr/Tt//;
        $err_count += tr/ACGTacgt//c;   # /c counts everything NOT in the list
    }

    print "As: $a_count\n",
          "Cs: $c_count\n",
          "Gs: $g_count\n",
          "Ts: $t_count\n",
          "Errors: $err_count\n";

    hth

    dave