I'm not drunk enough to fully understand the code, but the math called me.
Perl v5.18.0 required--this is only v5.14.2, stopped at q line 2.
:(
I feel left out, because your code does not run on my system (an up-to-date Debian Stable release). It seems the only newer feature you use is "say", which means you should change
use 5.018;
into
use v5.10;
And then it runs for me too.
:)
Certainly interesting. Going beyond a mere wordcount.
You may also want to throw in an undef $/; to slurp in the whole file rather than just the first line (or did you plan to enhance the algorithm to look for inserted lines, a bit like diff does)?
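A minimal sketch of the difference, assuming an in-memory filehandle and made-up sample text in place of your real file. Note that I use local $/ inside a do block here, which is the scoped variant of undef $/; and restores the old value afterwards:

```perl
use strict;
use warnings;
use v5.10;

# Hypothetical sample text standing in for a real file.
my $text = "first line\nsecond line\nthird line\n";

# With the default $/ ("\n"), a scalar read returns only one line.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $first = <$fh>;
close $fh;

# With $/ undefined, the same read returns everything up to end of file.
open $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $all = do { local $/; <$fh> };
close $fh;

my @first_words = split ' ', $first;
my @all_words   = split ' ', $all;
say scalar(@first_words), " words seen without slurping";
say scalar(@all_words),   " words seen with slurping";
```

The local keeps the change from leaking into other reads later in the script, which a bare undef $/; would do.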
You can also do the counting in a single hash, like this:
$F1{$_}++ for @F1; $F1{$_}-- for @F2;
If a word occurs equally often in both files, its count ends up at 0; if positive, the second file was missing it (or one occurrence of it); if negative, the first file was. Depending on the distance from 0 you can build your stats. It takes less memory too.
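Here is a rough sketch of that idea with two made-up word lists standing in for the files. The $distance total is just one possible stat I picked for illustration, not something from your code:

```perl
use strict;
use warnings;
use v5.10;

# Hypothetical word lists standing in for the contents of the two files.
my @F1 = qw(the quick brown fox jumps over the lazy dog);
my @F2 = qw(the quick red fox jumps over a dog);

# One hash: count up for the first file, down for the second.
my %F1;
$F1{$_}++ for @F1;
$F1{$_}-- for @F2;

# Assumed stat: sum of absolute counts, 0 when both word bags are identical.
my $distance = 0;
$distance += abs $_ for values %F1;

# Show only the words whose counts differ between the files.
for my $word (sort keys %F1) {
    my $n = $F1{$word};
    next unless $n;
    say sprintf '%-6s %+d', $word, $n;
}
say "total difference: $distance";
```

Positive entries are words the first list has more of, negative entries are words the second list has more of.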
In reply to Re: File Similarity Concept
by FreeBeerReekingMonk
in thread File Similarity Concept
by ww