Only 1s and 0s? Any chance the lines are all the same length? Actually, even if they are not, wouldn't it be better to store the data "directly" in binary instead of as a plain text file containing just 0s, 1s and newlines? Even if the lines are not the same length you can still store the data more efficiently: store the line length and use one bit instead of one byte for each 0/1. Not only will you save lots of disk space and disk access, the comparisons will likewise be much quicker, with much less data to compare.
my $s1 = '10110101001010100011101101001010';
my $s2 = '10110101001010100011101101001011';

# Same data, packed one bit per 0/1 character
my $b1 = pack('b*', $s1);
my $b2 = pack('b*', $s2);

use Benchmark qw(timethese);

timethese( 10000, {
    'strings' => \&comp_strings,
    'packs'   => \&comp_packs,
} );

sub comp_strings {
    for (1..1000) {
        $s1 eq $s1 and $s1 eq $s2
    }
}

sub comp_packs {
    for (1..1000) {
        $b1 eq $b1 and $b1 eq $b2
    }
}

__END__
Benchmark: timing 10000 iterations of packs, strings...
     packs:  4 wallclock secs ( 4.07 usr +  0.00 sys =  4.07 CPU) @ 2460.02/s (n=10000)
   strings:  5 wallclock secs ( 4.71 usr +  0.00 sys =  4.71 CPU) @ 2124.50/s (n=10000)
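For lines of differing length, the "store the line length" idea could look roughly like the sketch below. The helper names encode_line and decode_line are made up for illustration, and the 32-bit big-endian length prefix ('N') is just one assumed record layout, not something the original data format prescribes.

```perl
use strict;
use warnings;

# Hypothetical helpers (illustration only): one record is a 32-bit
# bit count followed by the bits packed eight to a byte.
sub encode_line {
    my ($bits) = @_;                       # string of '0'/'1' characters
    return pack('N b*', length($bits), $bits);
}

sub decode_line {
    my ($record) = @_;
    my ($nbits, $packed) = unpack('N a*', $record);
    # The stored bit count tells unpack where the padding bits start.
    return scalar unpack("b$nbits", $packed);
}

my $line   = '1011010100101';              # 13 bits
my $record = encode_line($line);

printf "text: %d bytes, packed: %d bytes\n",
    length($line) + 1,                     # +1 for the newline
    length($record);
print decode_line($record) eq $line ? "round trip ok\n" : "mismatch\n";
```

The length prefix costs four bytes per line, so this only pays off once the lines are longer than a few dozen bits. To read such records back from a file you would read the four length bytes first and then int(($nbits + 7) / 8) bytes of packed data.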
In reply to Re: What is a "big job" in the industry?
by Jenda
in thread What is a "big job" in the industry?
by punch_card_don