in reply to Better solution to the code

Hi,

The poor performance comes from the fact that you are opening and parsing the same (big) file many times.

You would be better off reversing your strategy: open the file once, read it line by line, and compare each line with the contents of your array @tag.
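A minimal sketch of that reversed strategy, assuming the tags are plain strings to be found anywhere in a line. The sample tags and the in-memory input are made up for illustration; the real code would open Input_file.dat instead.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real @tag and input file are much larger.
my @tag  = ('foo', 'bar');
my $data = "a foo line\nnothing here\nbar at start\n";

# Read from an in-memory filehandle so the sketch runs without external files.
open(my $in, '<', \$data) or die "Cannot read: $!\n";

my @matched;
while (my $line = <$in>) {
    # The input is read exactly once; each line is checked against every tag.
    for my $t (@tag) {
        if ($line =~ /\Q$t\E/) {    # \Q...\E treats the tag as a literal string
            push @matched, $line;
            last;                   # one matching tag is enough for this line
        }
    }
}
close $in;

print @matched;
```

The inner loop still costs one test per tag, but the expensive part (opening and reading the big file) now happens only once.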

You should also, if possible, consider loading your data into a hash instead of an array. If you do that, you can use exists to test each line with a single lookup instead of looping over the whole array.
# your data is in %tag
open (IN, "<Input_file.dat") or die "Cannot read $!\n";
open (OUT, "+>Result_file.txt") or die "Cannot create file $!\n";
while (<IN>) {
    print OUT $_ if exists $tag{$_};
}

Lu.

Re^2: Better solution to the code
by moritz (Cardinal) on Jan 25, 2008 at 10:35 UTC
    The idea with the hash won't work, because the regex match searches for a matching substring, whereas the hash lookup compares the whole string.

    But that reminds me of another possible optimization: if @tag doesn't contain regexes but only constant substrings, index might speed things up.

    So instead of if ($_ =~ m/$something/){ ... }, you can write if (0 <= index $_, $something).
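A short sketch of the index variant, assuming every entry in @tag is a constant substring rather than a pattern. The tags and lines are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical constant substrings and input lines.
my @tag   = ('a.c', 'xyz');
my @lines = ("has a.c inside\n", "has abc inside\n", "ends with xyz\n");

my @hits;
for my $line (@lines) {
    for my $t (@tag) {
        # index returns -1 when the substring is absent, so >= 0 means "found".
        if (index($line, $t) >= 0) {
            push @hits, $line;
            last;
        }
    }
}

print @hits;
```

Note that index treats 'a.c' literally, so "has abc inside" is not a hit here, whereas an unescaped m/a.c/ would match it.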

Re^2: Better solution to the code
by cdarke (Prior) on Jan 25, 2008 at 12:56 UTC
    BTW, to put @tags into %tags use:
    my %tags; @tags{@tags} = undef;
    Yes, it's confusing calling a hash and an array the same thing.
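To illustrate the hash slice together with exists, here is a small sketch. It assumes each input line consists of exactly one tag (the whole-string comparison mentioned above), and the sample data is hypothetical. It assigns an empty list to the slice, which, like assigning undef, leaves every value undefined.

```perl
use strict;
use warnings;

my @tags = ('foo', 'bar');    # hypothetical tags
my %tags;
@tags{@tags} = ();            # hash slice: every tag becomes a key with an undef value

my @lines = ("foo\n", "a foo line\n", "bar\n");

my @whole_line_hits;
for my $line (@lines) {
    # Strip the newline first, otherwise "foo\n" never equals the key "foo".
    chomp(my $key = $line);
    push @whole_line_hits, $key if exists $tags{$key};
}

print "$_\n" for @whole_line_hits;
```

The line "a foo line" is skipped because the hash lookup compares whole strings; for substring matches, a regex or index is still needed.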