in reply to creating a new file with unique values

First, please use <CODE> tags around your code. Second, always use strict and warnings.

Now for your problem. If you have the memory, put your data into a hash. Don't print it to the output file until you've read all the data from the input file.

my $file = "project.txt";

# Use the three-argument form of open() (available since perl 5.6.0)
# and check the return value.
open(FILE, '<', $file) or die "Can't open $file: $!\n";

my %input;
while (my $line = <FILE>) {
    chomp $line;    # Get rid of whitespace (newline) at the end of the string
    if ($line =~ /\tCM+(\d*)/io) {
        $input{$1}++;
    }
}
close(FILE);

After the above code runs, %input will contain the digits as keys, with each value being the number of times that key shows up in the input file. Printing to the output file is even easier. Just check whether the value in the %input hash is greater than 1 before printing, and skip the key if it is:

open(OUT, '>>', 'project.out') or die "Can't open project.out for writing: $!\n";
foreach my $i (keys %input) {
    print OUT "$i\n" unless $input{$i} > 1;
}
close OUT;
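For anyone who wants to try the counting approach without a project.txt on disk, here is a minimal sketch of the same idea. The sub names count_keys and unique_keys and the sample lines are made up for illustration, and the in-memory filehandle (opening a reference to a scalar) assumes perl 5.8 or later. The pattern is the one used above, minus the /o flag.

```perl
use strict;
use warnings;

# Count how often each captured value occurs on the given filehandle.
# (count_keys is a hypothetical helper name, not part of the post above.)
sub count_keys {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        chomp $line;
        $count{$1}++ if $line =~ /\tCM+(\d*)/i;
    }
    return \%count;
}

# Return, in string-sorted order, the keys that were seen exactly once.
# (unique_keys is likewise a hypothetical name; call it in list context.)
sub unique_keys {
    my ($count) = @_;
    my @once = grep { $count->{$_} == 1 } keys %$count;
    @once = sort @once;
    return @once;
}

# Invented sample data: 101 appears twice, 102 and 103 once each.
my $data = join "", map { "$_\n" } "a\tCM101", "b\tCM102", "c\tCM101", "d\tCM103";

open(my $in, '<', \$data) or die "Can't open in-memory file: $!\n";
my $count = count_keys($in);
close($in);

print "$_\n" for unique_keys($count);    # prints 102 and 103
```

Sorting the keys is optional. It is only there because the order returned by keys is not predictable, and a stable order makes the output easier to check.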

Re: Re: creating a new file with unique values
by poj (Abbot) on Jan 20, 2003 at 17:52 UTC
    Correction
    open(FILE, '<', $file) or die "Can't open $file: $!\n";
    poj
Re: Re: creating a new file with unique values
by Gilimanjaro (Hermit) on Jan 20, 2003 at 17:51 UTC
    Why would you want to delay writing the file? If no hash-entry exists yet, you know it can be written anyway...
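    A rough sketch of that idea, in case it helps. The sub name print_first_seen and the sample data are invented for illustration, and the in-memory filehandles assume perl 5.8 or later. Note that this writes every distinct value once, which is not quite the same as the counting version above: that one drops any value that shows up more than once. Which of the two is wanted depends on the original question.

    ```perl
    use strict;
    use warnings;

    # Write each captured value to $out the first time it is seen on $in.
    # (print_first_seen is a hypothetical helper name.)
    sub print_first_seen {
        my ($in, $out) = @_;
        my %seen;
        while (my $line = <$in>) {
            chomp $line;
            next unless $line =~ /\tCM+(\d*)/i;
            # $seen{$1}++ is false only on the first occurrence of $1.
            print {$out} "$1\n" unless $seen{$1}++;
        }
        return;
    }

    # Invented sample data: 101 appears twice, 102 once.
    my $input  = "a\tCM101\nb\tCM102\nc\tCM101\n";
    my $output = '';

    open(my $src, '<', \$input)  or die "Can't open in-memory input: $!\n";
    open(my $dst, '>', \$output) or die "Can't open in-memory output: $!\n";
    print_first_seen($src, $dst);
    close($src);
    close($dst);

    print $output;    # 101 and 102, one line each
    ```

    The hash is still needed to remember what has been seen, so the memory use is about the same. The gain is only that nothing has to wait for the end of the input.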

    Also, using foreach/keys to loop through a hash is very inefficient, especially with big hashes: perl has to traverse the entire hash to collect all the keys, and then, when obtaining the value in the loop body, it has to look up the key in the hash again.

    The preferred method would be to use a while/each loop. Your code would then look like:

    while (my ($i, $count) = each %input) {
        print OUT "$i\n" unless $count > 1;
    }

    Using the while/each construct would also be a lot cleaner if the hash happened to be something like a tied database query result hash, if said hash supported database row cursors... But that actually has nothing to do with the topic... :)

    Happy coding, G.