Thank you for this code, which I don't understand :) I don't understand the utility of sub buildTestFile; can you please explain? This code is really hard for me to understand. What is the utility of %words, which we don't use in any other part of the code, especially when we use the grep? Thank you

Re^3: compare two text file line by line, how to optimise
by hippo (Archbishop) on Feb 28, 2016 at 10:19 UTC
    I don't understand the utility of sub buildTestFile; can you please explain?

    GrandFather has posted an SSCCE (Short, Self Contained, Correct Example), which is the best way to illustrate a situation in code. Rather than distribute countless MB of data as the input (which would have been rather impolite), the SSCCE builds its input files on the fly. This is what buildTestFile does.
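To make the idea concrete, here is a minimal sketch of a test-file generator. It is not GrandFather's code: the name build_test_file, the made-up vocabulary and the file name sample_test.txt are all assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a test-file generator (not the code from the thread).
# Writes $lines lines of $words_per_line random words each to $path.
sub build_test_file {
    my ($path, $lines, $words_per_line) = @_;

    # Made-up vocabulary: word1 .. word1000
    my @vocab = map { "word$_" } 1 .. 1000;

    open my $out, '>', $path or die "$path : $!";
    for (1 .. $lines) {
        # rand @vocab yields a fractional index that Perl truncates to an integer
        my @picked = map { $vocab[rand @vocab] } 1 .. $words_per_line;
        print {$out} join(' ', @picked), "\n";
    }
    close $out or die "$path : $!";
    return;
}

# Build a tiny sample: 5 lines of 4 words each
build_test_file('sample_test.txt', 5, 4);
```

With real data you would skip this step entirely and point the comparison code at your own files.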

    What is the utility of %words, which we don't use in any other part of the code

    Using the hash forces uniqueness as this is a property of hash keys.
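A small sketch of that property, using a made-up word list (the contents of @input are an assumption, not data from the thread):

```perl
use strict;
use warnings;

# Hypothetical input containing duplicates
my @input = qw(apple pear apple fig pear apple);

# Storing under a key that already exists just overwrites the old value,
# so each distinct word ends up as exactly one key.
my %words;
$words{$_} = 1 for @input;

# The keys are the de-duplicated words; sort them for a stable order
my @unique = sort keys %words;
print "@unique\n";    # prints: apple fig pear
```

This is why a hash can be filled while generating words and then only its keys used afterwards: the values do not matter, only the fact that a key cannot occur twice.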

      Hi. But I don't want to test! I have data, and I am looking for real results, not random output! When I remove buildTestFile and just use the rest of the code, it takes days to process 50 MB of data, not the 3 minutes that was stated. This code is also slow, like all the others, on my computer with 2 GB of RAM :( Regards

        Run this simple program, which does only minimal processing, against your data and post the results. This will help eliminate one potential source of your problem (I/O) and give a better indication of your data than just a size of 50 MB.

        #!/usr/bin/perl
        use strict;

        my $t0 = time;    # start time, to report elapsed seconds

        # File names come from the command line, with defaults
        my $file1 = $ARGV[0] || 'ficc.txt';
        my $file2 = $ARGV[1] || 'fic.txt';

        # Count lines and whitespace-separated words in the first file
        my $count1 = 0;
        my $words1 = 0;
        open FICC, '<', $file1 or die "$file1 : $!";
        while (<FICC>) {
            my @words = split /\s+/, lc $_;
            $words1 += @words;
            ++$count1;
        }
        close FICC;

        # Same counts for the second file
        my $count2 = 0;
        my $words2 = 0;
        open FIC, '<', $file2 or die "$file2 : $!";
        while (<FIC>) {
            my @words = split /\s+/, lc $_;
            $words2 += @words;
            ++$count2;
        }
        close FIC;    # was "close FICC", which left FIC open

        my $dur = int time - $t0;
        print "File1 : $count1 lines $words1 words in $file1\n",
              "File2 : $count2 lines $words2 words in $file2\n",
              "Time  : $dur seconds\n";
        poj