in reply to File Checking

The way I've always done this (and there are probably better ways)..
#!/usr/bin/perl -w
use strict;

my @words;
my %list;

unless ($ARGV[1]) {
    print "Usage is oneword infile outfile\n";
    exit;
}

open (WORDLIST, "$ARGV[0]")  || die "Could not open file $!";
open (OUTFILE, ">$ARGV[1]") || die "Could not open file $!";

while (<WORDLIST>) { push (@words, $_) }

# this works because if the hash key exists already it is replaced!
foreach (@words) { $list{$_} = $list{$_} }

foreach (keys %list) { print OUTFILE; }

Replies are listed 'Best First'.
Re: Re: File Checking
by eg (Friar) on Jan 29, 2001 at 01:14 UTC

    Hey slycer. You're looping around your data far too often.

    while ( <WORDLIST> ) { print OUTFILE unless $list{lc($_)}++; }

    will be up to three times faster, as it only needs to go through the data once.

    (Whoops! This is essentially what IO said.)
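A complete script built around that one-pass idiom might look something like this (a sketch only, keeping the filehandle names and usage message from the original script; the case-insensitive comparison via lc is assumed to be what you want):

```perl
#!/usr/bin/perl -w
use strict;

# One-pass duplicate removal: read, test, and print in a single loop
# instead of looping over the data three times.
my %seen;

unless ($ARGV[1]) {
    print "Usage is oneword infile outfile\n";
    exit;
}

open (WORDLIST, "$ARGV[0]")  || die "Could not open file $!";
open (OUTFILE, ">$ARGV[1]") || die "Could not open file $!";

while (<WORDLIST>) {
    # $seen{lc $_}++ is 0 (false) the first time a word appears,
    # so each word is printed exactly once, ignoring case.
    print OUTFILE $_ unless $seen{lc $_}++;
}

close WORDLIST;
close OUTFILE;
```

A side effect worth noting: unlike the hash-of-keys version, this also preserves the input order of the first occurrence of each word.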

      Yah, I wrote that probably about 6 months ago as kind of a learning exercise.. I knew as soon as I posted it that it could be better, but oh well. Thanks for pointing out the errors :-)
Re: Re: File Checking
by Chady (Priest) on Jan 29, 2001 at 00:30 UTC
    A quick test revealed a flaw here:
    what if we had:
    someone@somewhere.com and someOne@SOMEWHERE.COM

    and this is the main reason for checking for duplicates in the first place, I guess..
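    A small demonstration of the fix (a sketch: normalizing each line with lc before using it as the hash key, as eg's snippet does, makes the two spellings count as one):

```perl
#!/usr/bin/perl -w
use strict;

# lc-normalize before the hash lookup so that case differences
# do not defeat the duplicate check.
my %seen;
for my $addr ("someone\@somewhere.com\n", "someOne\@SOMEWHERE.COM\n") {
    print $addr unless $seen{lc $addr}++;
}
# prints only the first spelling, someone@somewhere.com
```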


    Chady | http://chady.net/