in reply to Re^9: compare two text file line by line, how to optimise
in thread compare two text file line by line, how to optimise

Thank you very much, this is really fast.

Can you please explain the major changes you made, and the mistakes I should avoid, so that I can speed up my own code the way you did?

I hope I can modify it later to produce the output I specified at the beginning of the post, especially the line numbers that correspond to the intersection.

If I have problems I will let you know.

You were very kind, thank you very much.

Regards

Re^11: compare two text file line by line, how to optimise
by poj (Abbot) on Feb 28, 2016 at 16:55 UTC

    Looks like your code had 2 nested loops, each counting to over 6 million:

    foreach my $che (@b) {
        @aa = split(/\s/, $che);
        foreach my $kh (@a) {
            @bb = split(/\s/, $kh);
            for ($l = 0; $l <= $#bb; $l++) {
                for ($m = 0; $m <= $#aa; $m++) {
                    ## this code executes 6 million x 6 million times
                    if (($bb[$l] eq $aa[$m])) { .. }
                }
            }
        }
    }

    But among those 6 million words there are only a few thousand different ones, so your loops were checking the same word thousands of times more than necessary. By holding the unique words from file1 in a hash, you don't have to loop through 6 million words every time to find a match with a word from file2; a single hash lookup per word is enough.
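
    As a rough illustration of the hash approach described above, here is a minimal sketch that also records line numbers, since those were asked for earlier in the thread. It is not the code posted in this thread: the sample strings, the %lines1 hash, and the report format are all assumptions made for the example, and in-memory filehandles stand in for the real files.

```perl
#!perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two input files.
my $file1 = "the cat sat\non the mat\n";
my $file2 = "a dog sat\nby the door\nalone\n";

# Pass 1: remember each distinct word of file1 and the lines it appears on.
my %lines1;
open my $fh1, '<', \$file1 or die "cannot open file1: $!";
while (my $line = <$fh1>) {
    my $n = $.;    # current line number of file1
    push @{ $lines1{$_} }, $n for split ' ', lc $line;
}
close $fh1;

# Pass 2: one hash lookup per word of file2, no inner loop over file1.
my @report;
open my $fh2, '<', \$file2 or die "cannot open file2: $!";
while (my $line = <$fh2>) {
    my $n = $.;    # current line number of file2
    for my $word (split ' ', lc $line) {
        next unless $lines1{$word};    # word never seen in file1
        push @report, sprintf "file2 line %d: %s (file1 line %s)",
            $n, $word, join ',', @{ $lines1{$word} };
    }
}
close $fh2;

print "$_\n" for @report;
```

    Each file is read once, so the work grows with the total number of words rather than with the product of the two file sizes.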

    poj

      Hi, I don't understand this part of the code:

       my @match = grep $uniq1{$_}, @words;

      What I understand: here we search for $uniq1{$_} in @words.

      We know that @words contains the current line, but what does $uniq1{$_} contain? And if $uniq1{$_} contains the lines of file 1, how do I iterate over it, and how does it move from one value to the next?

        while (<FICC>) {
            my @words = split /\s+/, lc $_;
            ++$uniq1{$_} for @words;
        }

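        %uniq1 contains all the words from file1 as keys, like a dictionary; the value stored for each word is the number of times it was seen. $uniq1{'anyword'} will be undef (which counts as false) if 'anyword' was not in file1. Inside grep, $_ is set to each element of @words in turn, so only the words that exist in %uniq1 are kept.
        Try a simple example: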

        #!perl
        my %uniq1 = (
            cow => 1,
            dog => 1,
            fox => 1,
        );
        my @words = ('ant', 'bat', 'cat', 'dog', 'eel', 'fox');
        my @match = grep $uniq1{$_}, @words;
        print "@match\n";
        poj
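
        To make the question about iterating more concrete, the sketch below writes the same grep filter as an explicit loop and then walks the hash itself with keys. It reuses the sample data from the example above; the @match_loop array is a name made up for this illustration.

```perl
#!perl
use strict;
use warnings;

my %uniq1 = (cow => 1, dog => 1, fox => 1);
my @words = ('ant', 'bat', 'cat', 'dog', 'eel', 'fox');

# grep sets $_ to each element of @words in turn and keeps
# the elements for which $uniq1{$_} is true.
my @match = grep { $uniq1{$_} } @words;

# The same filter written as an explicit loop.
my @match_loop;
for my $word (@words) {
    push @match_loop, $word if $uniq1{$word};
}

# Walking the hash itself: keys returns the words in no particular
# order, so they are sorted here for stable output.
for my $word (sort keys %uniq1) {
    print "$word was seen $uniq1{$word} time(s)\n";
}

print "@match\n";
print "@match_loop\n";
```

        The point is that nothing is searched for inside @words; each word is simply used as a key to ask the hash whether file1 contained it.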