in reply to Read file line by line and check equal lines

"Also it is huge file so i cannot use array or hash."

How huge?
Have you tried it with a hash - you might be surprised :)

Update: Note - as correctly pointed out by chrism01, the below won't work where you have odd numbers of duplicates. See below for a solution that I believe addresses that issue.

Give the following a go:

#!/usr/bin/perl -w
use strict;

my %wanted;
while (<DATA>) {
    exists $wanted{$_} ? delete $wanted{$_} : $wanted{$_}++;
}
print sort keys %wanted;

__DATA__
a1a
a1a
b1b
c1c
c1c
d1d
d1d
e1e
f1f
g1g
g1g
h1h
h1h
i1i
j1j
Output:
b1b
e1e
f1f
i1i
j1j

Update: or as a one-liner:

perl -ne 'exists $x{$_}?delete $x{$_}:$x{$_}++;}{print for sort keys %x;' < input.txt > output.txt

Try running that on your input file. The point about using a hash in that way is that you are only creating hash keys for those lines that are unique (and only appear once), so it's actually quite efficient. Whenever you are thinking "unique", a hash is almost certainly what you want.

Cheers,
Darren :)

Re^2: Read file line by line and check equal lines
by chrism01 (Friar) on Mar 06, 2007 at 23:39 UTC
    Mcdarren,
    I like your 1st version, but it seems to me it'll only work for even numbers of duplicates, e.g. if an item occurs 3 (5, 7, 9...) times, it'll be re-instated/preserved by your script?
    Of course, the OP's example file only has duplicates in 2s, but the description doesn't state whether this is always the case.
    I agree about using a hash, but I'd keep a count of all lines and test for cnt == 1 after looping through the input

    Cheers
    Chris
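
    The counting approach described above might look something like this as a full script (a minimal sketch, using a made-up __DATA__ section with an item that appears three times):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Count every occurrence of each line...
    my %count;
    while (<DATA>) {
        $count{$_}++;
    }

    # ...then keep only the lines seen exactly once.
    print grep { $count{$_} == 1 } sort keys %count;

    __DATA__
    a1a
    a1a
    a1a
    b1b

    Because a1a occurs an odd number of times (3), the toggle version above would re-instate it, whereas the count test correctly drops it and prints only b1b.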

      Yes, you're absolutely correct - nice catch :)

      Here's an updated one-liner that addresses that problem in the way you suggest:

      perl -ne '$x{$_}++;}{for(sort keys %x){print if $x{$_}==1;}' < input.txt

      (I'm not a golfer by any stretch of the imagination, so I imagine that could be shortened significantly)

      Cheers,
      Darren :)