in reply to Are two lines in the text file equal
I'm surprised that no one has yet mentioned an on-disk database, given the constraint on memory.
It's painful, and probably not nearly so fast as an in-mem hash, but it should work. However, now the next question is how much disk space have you got? The indexes that DB_File generates can get to be VERY large.

    use DB_File ;

    # %h lives in the file named by $dbFname, not in memory
    tie %h, "DB_File", $dbFname
        or die "Cannot tie $dbFname: $!" ;

    # $f is a filehandle already opened on the text file
    while( <$f> ) {
        if( exists $h{$_} ) {
            # equal lines in file
        }
        $h{$_} = 1 ;   # or some piece of data
    }
As for speed, DB_File has quite a few options, and you may spend some time 'tuning' it. It helps to keep in mind the difference between "fast", "fastest", and "working at all".
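To make the tuning point a bit more concrete, here is a rough sketch of the same idea wrapped in a sub, with one knob exposed: the `cachesize` field of `$DB_HASH`, which DB_File documents as a suggested cache size in bytes. The sub name `find_duplicate_lines`, its arguments, and the choice to store the line number as the "piece of data" are my own assumptions for illustration, and any cache value you pass in is a guess to be measured, not a recommendation.

```perl
use strict;
use warnings;
use Fcntl;      # O_RDWR, O_CREAT
use DB_File;

# Hypothetical helper: returns a list of [first_line_no, repeat_line_no]
# pairs for every line that repeats an earlier line.
sub find_duplicate_lines {
    my ($text_fname, $db_fname, $cache_bytes) = @_;

    # Tuning knob: must be set before the tie takes place.
    $DB_HASH->{cachesize} = $cache_bytes if $cache_bytes;

    # Treat the index as a scratch file, so start from a clean slate.
    unlink $db_fname;

    tie my %seen, 'DB_File', $db_fname, O_RDWR | O_CREAT, 0666, $DB_HASH
        or die "Cannot tie $db_fname: $!";

    open my $fh, '<', $text_fname
        or die "Cannot open $text_fname: $!";

    my @dups;
    while (my $line = <$fh>) {
        if (exists $seen{$line}) {
            # Value stored earlier is the line number of the first sighting.
            push @dups, [ $seen{$line}, $. ];
        }
        else {
            $seen{$line} = $.;
        }
    }

    close $fh;
    untie %seen;
    unlink $db_fname;   # remove the scratch index when done

    return @dups;
}
```

Calling something like `find_duplicate_lines('input.txt', 'seen.db', 4 * 1024 * 1024)` would then hand back the pairs of matching line numbers. Whether a bigger cache, or `$DB_BTREE` in place of `$DB_HASH`, actually helps is something you would have to time against your own data.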