in reply to reading/writing to a file

The above solutions look good, but just to mention another tool in the toolbox: if this is on *nix, keep in mind the sort and uniq commands. For example, your Perl could just create a raw dictionary file without worrying about duplicates (and thus eliminate the need for a possibly very large in-memory hash), and then invoke something like:
system("sort raw_outfile | uniq > real_outfile"); unlink "raw_outfile";
Not sure if it's the best use here, but in general sort/uniq on the cmdline is very useful.
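
A minimal sketch of that approach, assuming the new words arrive one per line on STDIN (the raw_outfile/real_outfile names just follow the snippet above and are illustrative):

use strict;
use warnings;

# Append every word to the raw file without checking for duplicates,
# so no large in-memory hash is needed.
open my $raw, '>>', 'raw_outfile' or die "open raw_outfile: $!";
while ( my $word = <STDIN> ) {
    print {$raw} $word;
}
close $raw or die "close raw_outfile: $!";

# Let sort/uniq do the deduplication on disk.
system("sort raw_outfile | uniq > real_outfile") == 0
    or die "sort/uniq failed: $?";
unlink 'raw_outfile' or warn "unlink raw_outfile: $!";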

Re^2: reading/writing to a file
by tlm (Prior) on Jun 18, 2005 at 20:14 UTC

    I agree that using Unix utilities is a good alternative for this problem, but note that

    • For the purpose of this problem at least (and, AFAIK, always),
      sort foo.txt | uniq
      can be replaced with a single sort command:
      sort -u foo.txt
    • By itself, sort -u (or sort ... | uniq) is not enough to solve this problem. Something like GNU's comm is also required (zsh, YMMV; a Perl rendering of the same pipeline follows this list):
      % (comm -2 -3 sorted_new.txt sorted_exclude.txt; < dict.txt) | sort -u > tmp
      % mv tmp dict.txt
      (BTW, if anyone knows how to avoid the temporary file above, I'd love to hear about it.)
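
    Here is a hedged sketch of the same update driven from Perl. Since Perl's system() hands the command to /bin/sh rather than zsh, the zsh shorthand "< dict.txt" is replaced with cat dict.txt; all the filenames are illustrative:

      # Assumes new.txt holds the new words, exclude.txt the words to drop,
      # and dict.txt the dictionary being maintained (names are illustrative).
      system("sort -u new.txt > sorted_new.txt")         == 0 or die "sort new: $?";
      system("sort -u exclude.txt > sorted_exclude.txt") == 0 or die "sort exclude: $?";

      # comm -2 -3 keeps lines found only in the first (sorted) file,
      # i.e. new words that are not on the exclude list; the result is
      # merged with the existing dictionary and deduplicated.
      system("(comm -2 -3 sorted_new.txt sorted_exclude.txt; cat dict.txt)"
           . " | sort -u > tmp") == 0 or die "pipeline failed: $?";
      rename 'tmp', 'dict.txt' or die "rename tmp: $!";

    Note that this still goes through the temporary file, just with rename instead of mv.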

    the lowliest monk