
the output that i wrote in the post is an error

But you haven't shown what the correct output should be, so we can only guess what you are trying to do. Here's my guess: match each line in FICC against the word combinations from FIC, and report a line when it contains every word of one combination.

#!/usr/bin/perl
use strict;

my @FIC = ();
#open FIC,'<','fic.txt' or die "$!";
#while (my $line = <FIC>){
#  next unless $line =~ /\S/;
#  my @words = split /\s+/,$line;
#  push @FIC,[ @words ];
#}
#close FIC;

@FIC = (
  [ qw(chirac prime paris)],
  [ qw(chirac prime jacques) ],
  [ qw(chirac prime president) ],
  [ qw(chirac paris france) ],
  [ qw(chirac paris french) ],
);

my $u=0;
open FICC,'<','ficc.txt' or die "$!";
#open OUT, '>','output.txt' or die "$!";
while (my $line = <FICC>){
  ++$u;
  next unless $line =~ /\S/; # skip blank lines
  for my $ar (@FIC){
    my @matched = grep $line=~/$_/,@$ar;
    if (@matched == @$ar){
      print "$u: $line matched all words : @matched\n\n";
      #print OUT "$u: $line matched all words : @matched\n\n";
      last;
    }
  }
}
close FICC;
#close OUT
__DATA__
chirac presidential migration
chirac presidential paris
jacques chirac has been the prime minster and the president
chirac presidential 007
chirac paris migration
chirac aaa french bbb paris ccc
poj

Re^4: compare two text file line by line, how to optimise
by thespirit (Novice) on Feb 26, 2016 at 14:23 UTC
    Thank you for the reply. I edited the post with the correct output.

      So, taking the first line of file 2

      chirac presidential migration

      compare this with each line of file 1 in turn

      chirac prime paris
      chirac prime jacques
      chirac prime president 
      chirac paris france
      chirac paris french
      

      and calculate how many words match. Output the file 1 line if the count is greater than a minimum value. Repeat for each line in file 2.

      For this example, the number of words matching is only 1 ("chirac") in each case, so if the minimum is 2 then none of the above lines would be output. Is that the logic?
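
      If that is the logic, it could be sketched roughly as below. This is only an illustration under assumptions, not a confirmed reading of the requirement: the sub names count_matches and lines_matching are made up for the example, the threshold test is taken to be "at least $min" (the wording above leaves open whether it should be >= or >), and words are compared whole, so "president" does not count as a match for "presidential" the way the regex test in the earlier script would.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how many words of one file 1 line (array ref) occur
# as whole words in a file 2 line (string).
sub count_matches {
    my ($words, $line) = @_;
    my %seen = map { $_ => 1 } split ' ', $line;
    return scalar grep { $seen{$_} } @$words;
}

# Return the file 1 lines that share at least $min words with $line.
# ASSUMPTION: ">=" is a guess at what "minimum" means here.
sub lines_matching {
    my ($fic, $line, $min) = @_;
    return grep { count_matches($_, $line) >= $min } @$fic;
}

my @FIC = (
    [ qw(chirac prime paris) ],
    [ qw(chirac prime jacques) ],
    [ qw(chirac prime president) ],
    [ qw(chirac paris france) ],
    [ qw(chirac paris french) ],
);

my $min = 2;    # assumed threshold, taken from the example above

for my $line ('chirac presidential migration', 'chirac paris migration') {
    my @hits = lines_matching(\@FIC, $line, $min);
    print "$line => ", scalar(@hits), " line(s) from file 1\n";
    print "    @$_\n" for @hits;
}
```

      With a minimum of 2, the first test line shares only "chirac" with each file 1 line and so selects nothing, while the second also shares "paris" with three of them. Building the %seen hash once per file 2 line avoids running a separate regex for every word.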

      poj