
The two files are almost the same. They differ only in diacritics, and I just need to know how many words have different diacritics. I don't need the details.

Re^3: compare files by words
by blazar (Canon) on May 31, 2007 at 08:25 UTC
    The two files are almost the same. They differ only in diacritics, and I just need to know how many words have different diacritics. I don't need the details.

    In this case your approach above seems fine. Did you try it? Did it fail somehow? One thing you "have" to do is to make it strict-safe. Then, for the word comparison I'd write:

    no warnings 'uninitialized';
    ($words1[$_] eq $words2[$_] ? $good : $bad)++
        for 0..(@words1>@words2 ? $#words1 : $#words2);

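    (I suppose you want to count a word as bad if it has no corresponding word at all in the other file. Otherwise you should change > into <, so that the loop stops at the end of the shorter list. In the latter case the no warnings line wouldn't be necessary, since no undefined element would ever be compared.)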
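For reference, here is a minimal strict-safe sketch of the comparison above. The two word lists are made-up sample data standing in for whatever you read from your files, and $last is just a helper name I introduced; treat it as an illustration rather than a finished script.

```perl
use strict;
use warnings;
use utf8;    # the sample strings below contain literal accented characters

# Hypothetical sample data standing in for the contents of the two files.
my @words1 = split ' ', 'the cafe is on the resume';
my @words2 = split ' ', 'the café is on the résumé today';

my ($good, $bad) = (0, 0);

# Index of the last word of the longer list, so unmatched words count as bad.
my $last = @words1 > @words2 ? $#words1 : $#words2;

{
    # Past the end of the shorter list the element is undef; silence that warning only here.
    no warnings 'uninitialized';
    ($words1[$_] eq $words2[$_] ? $good : $bad)++ for 0 .. $last;
}

print "good: $good, bad: $bad\n";
```

With this sample data the two accented words and the unmatched trailing word end up in $bad, and the remaining four in $good.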

    Update: you also probably don't want to split on / /, but on ' ' which is more likely to do what you mean, and in fact is also the default.
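To illustrate the difference, a small sketch with a made-up input line ($line and the array names are just for the example): splitting on / / cuts at every single space and keeps the empty leading fields, while ' ' skips leading whitespace and splits on runs of any whitespace.

```perl
use strict;
use warnings;

# Hypothetical line with leading spaces, a double space, a tab and a trailing space.
my $line = "  foo  bar\tbaz ";

my @by_regex = split / /, $line;   # one cut per single space; empty leading fields are kept
my @by_awk   = split ' ', $line;   # leading whitespace skipped, runs of whitespace are one separator

print scalar(@by_regex), " fields vs ", scalar(@by_awk), " fields\n";
print join('|', @by_awk), "\n";
```

Note that with / / the tab is not a separator at all, so "bar\tbaz" stays in one piece.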

      How about
      use List::Util "min";
      ...
      my $words = min(scalar @words1, scalar @words2);
      $total += $words;
      $bad += grep $words1[$_] ne $words2[$_], 0 .. ($words - 1);
      ...
      print "good:", $total - $bad;
      possibly switching max for min.
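A runnable sketch of the $total / $bad variant, with made-up word lists for a single pair of lines. One thing to watch, as far as I know: min takes a plain list, so arrays passed directly are flattened and their elements are compared; the element counts have to be requested explicitly with scalar.

```perl
use strict;
use warnings;
use List::Util 'min';

# Hypothetical word lists for one pair of lines.
my @words1 = qw(a b c d);
my @words2 = qw(a x c);

my ($total, $bad) = (0, 0);

# scalar() passes the element counts; min(@words1, @words2) would instead
# flatten both arrays and compare the words themselves.
my $words = min(scalar(@words1), scalar(@words2));
$total += $words;

# grep in scalar context returns the number of positions that differ.
$bad += grep { $words1[$_] ne $words2[$_] } 0 .. $words - 1;

print "good: ", $total - $bad, ", bad: $bad\n";
```

Here only the positions present in both lists are counted, so the unmatched fourth word of @words1 is neither good nor bad.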
        use List::Util "min";

        Yep, I like and use that. But for two values only a simple ternary is more appropriate IMHO. Of course keeping a $total is also fine, interesting and I had thought of it myself. But after all the $bad vs $good one is more symmetric and thus I prefer it. That's just me of course.