Run this simple program, which does only minimal processing, against your data and post the results. This will help eliminate one potential source of your problem (I/O) and give a better indication of your data than just a size of 50M.
poj

#!/usr/bin/perl
use strict;

my $t0 = time;
my $file1 = $ARGV[0] || 'ficc.txt';
my $file2 = $ARGV[1] || 'fic.txt';

# count lines and words in the first file
my $count1 = 0;
my $words1 = 0;
open FICC,'<',$file1 or die "$file1 : $!";
while (<FICC>) {
  my @words = split /\s+/,lc $_;
  $words1 += @words;
  ++$count1;
}
close FICC;

# count lines and words in the second file
my $count2 = 0;
my $words2 = 0;
open FIC,'<',$file2 or die "$file2 : $!";
while (<FIC>) {
  my @words = split /\s+/,lc $_;
  $words2 += @words;
  ++$count2;
}
close FIC;

# elapsed wall-clock time in whole seconds
my $dur = int(time - $t0);
print "
 File1 : $count1 lines $words1 words in $file1
 File2 : $count2 lines $words2 words in $file2
 Time  : $dur seconds\n";
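If the run finishes in a second or two, the whole-second resolution of the built-in time will not tell you much. Below is a minimal sketch of the same measurement with sub-second timing per file, assuming the core Time::HiRes module is available. The sub name count_file and the output format are my own choices for illustration, not part of the script above. Note that it uses split ' ' rather than split /\s+/, so a line with leading whitespace does not produce an empty first field and the word counts may differ slightly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes ();   # core module; called fully qualified for sub-second timing

# Count lines and whitespace-separated words in one file.
# Returns (lines, words, elapsed_seconds).
# count_file is a hypothetical helper name used only in this sketch.
sub count_file {
    my ($path) = @_;
    my $start = Time::HiRes::time();
    my ($lines, $words) = (0, 0);
    open my $fh, '<', $path or die "$path : $!";
    while (my $line = <$fh>) {
        # split ' ' skips leading whitespace, so indented lines
        # do not contribute an empty first "word"
        my @w = split ' ', lc $line;
        $words += @w;
        ++$lines;
    }
    close $fh;
    return ($lines, $words, Time::HiRes::time() - $start);
}

# Report each file named on the command line separately.
for my $path (@ARGV) {
    my ($lines, $words, $secs) = count_file($path);
    printf "%s : %d lines %d words in %.3f seconds\n",
        $path, $lines, $words, $secs;
}
```

Run it as perl count.pl ficc.txt fic.txt (count.pl being whatever name you save it under). Timing each file on its own shows whether one of the two dominates the read time.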
In reply to Re^5: compare two text file line by line, how to optimise
by poj
in thread compare two text file line by line, how to optimise
by thespirit