in reply to Read file line by line and check equal lines

"Also it is huge file so i cannot use array or hash."

How huge?
Have you tried it with a hash - you might be surprised :)

Update: Note - as correctly pointed out by chrism01, the below won't work where you have odd numbers of duplicates. See below for a solution that I believe addresses that issue.

Give the following a go:

#!/usr/bin/perl -w
use strict;

my %wanted;
while (<DATA>) {
    exists $wanted{$_} ? delete $wanted{$_} : $wanted{$_}++;
}
print sort keys %wanted;

__DATA__
a1a
a1a
b1b
c1c
c1c
d1d
d1d
e1e
f1f
g1g
g1g
h1h
h1h
i1i
j1j
Output:
b1b
e1e
f1f
i1i
j1j

Update: or as a one-liner:

perl -ne 'exists $x{$_}?delete $x{$_}:$x{$_}++;}{print for sort keys %x;' < input.txt > output.txt

Try running that on your input file. The point about using a hash in that way is that you are only creating hash keys for those lines that are unique (and only appear once), so it's actually quite efficient. Whenever you are thinking "unique", a hash is almost certainly what you want.

Cheers,
Darren :)

Re^2: Read file line by line and check equal lines
by chrism01 (Friar) on Mar 06, 2007 at 23:39 UTC
    Mcdarren,
    I like your 1st version, but it seems to me it'll only work for even numbers of duplicates, e.g. if an item occurs 3 (5, 7, 9...) times, it'll be re-instated/preserved by your script?
    Of course, the OP's example file only has duplicates in 2s, but the description doesn't state whether this is always the case.
    I agree about using a hash, but I'd keep a count of all lines and test for cnt == 1 after looping through the input

    Cheers
    Chris
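
    The counting approach described above might look something like this as a full script (a minimal sketch, using a made-up __DATA__ section with an item that appears three times):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Count every occurrence of each line...
    my %count;
    while (<DATA>) {
        $count{$_}++;
    }

    # ...then keep only the lines seen exactly once.
    print grep { $count{$_} == 1 } sort keys %count;

    __DATA__
    a1a
    a1a
    a1a
    b1b

    Because a1a occurs an odd number of times (3), the toggle version above would re-instate it, whereas the count test correctly drops it and prints only b1b.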

      Yes, you're absolutely correct - nice catch :)

      Here's an updated one-liner that addresses that problem in the way you suggest:

      perl -ne '$x{$_}++;}{for(sort keys %x){print if $x{$_}==1;}' < input.txt

      (I'm not a golfer by any stretch of the imagination, so I imagine that could be shortened significantly)

      Cheers,
      Darren :)