in reply to Removing duplicates

Update: You might want to read perldoc -q duplicate for a general discussion of using a hash to check for duplicates.
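
For anyone who has not seen it, here is a minimal sketch of the hash idiom that FAQ entry discusses. A hash records each value the first time it appears, so `grep` keeps only first occurrences. The `uniq` helper name and the sample net names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the items of a list with later duplicates
# dropped, keeping the original order.
sub uniq {
    my %seen;
    # $seen{$_}++ is 0 (false) on the first sighting, so !0 keeps the item.
    return grep { !$seen{$_}++ } @_;
}

my @nets = uniq(qw(clk reset clk data reset));
print "$_\n" for @nets;
```

The first script below uses the same `$nets{$1}++` test, just applied line by line to STDIN.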

I know giving a complete solution is sometimes frowned upon here, but sometimes I just can't help myself. Here is the first script:

#!/usr/bin/perl
use strict;
use warnings;

my %nets;
/^Net '([^']+)'/ and not $nets{$1}++ and print "$1\n" while <STDIN>;

And the second:

#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;

my $file2 = shift or die "Usage: $0 file2 < another-file";
my $nets = join "|", map {chomp;$_} read_file($file2);
/$nets/ or print while <STDIN>;

And you can run the whole chain with something like:

perl script1.pl < file1 > file2
perl script2.pl file2 < another-file

Update: just for fun, here's a version that will do everything in one step:

#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;

my $file1 = shift or die "Usage: $0 file1 < another-file";
my $nets = join "|", map { chomp; /^Net '([^']+)'/ and $1 or () } read_file($file1);
/$nets/ or print while <STDIN>;

Which would be used like:

perl script1+2.pl file1 < another-file

Re: Re: Removing duplicates
by RCP (Acolyte) on Mar 03, 2004 at 11:50 UTC
    Thanks for your help. One problem, though: my system (HP-UX) does not support "File::Slurp". Is there another approach to this step? Your first part worked great. I can't wait to start PERL class! My shell scripts were taking minutes to do the things that PERL got done in mere seconds. Thanks again! RCP
      another approach to this step?

      Of course! :-)

      The idiomatic slurp goes something like this:

      my $data = do { local (@ARGV, $/) = $filename; <> };

      You may see why I chose to use File::Slurp originally. To integrate that into the second snippet, it would be:

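      #!/usr/bin/perl
      use strict;
      use warnings;

      my $file = shift or die "Usage: $0 file2 < another-file";
      # Slurp the whole file into a single string.
      my $file_contents = do { local (@ARGV, $/) = $file; <> };
      # The slurp returns one string, so split it into lines before
      # joining them into an alternation.
      my $nets = join "|", split /\n/, $file_contents;
      /$nets/ or print while <STDIN>;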

      Note: this code is untested.
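
Since the snippet above is untested, here is a hedged alternative that avoids slurping altogether. It reads the pattern file line by line with a plain `open` and escapes each line with `quotemeta`, so regex metacharacters in the names are matched literally. The helper name `build_filter` is an assumption for illustration, and the sketch assumes the file holds one name per line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build one compiled regex that matches any line of
# $file literally. Returns undef if the file has no non-empty lines.
sub build_filter {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    chomp(my @names = <$fh>);
    close $fh;
    @names = grep { length } @names;    # skip empty lines
    return unless @names;
    my $alternation = join '|', map { quotemeta } @names;
    return qr/$alternation/;
}

# Act as a filter only when a file name is given on the command line.
if (@ARGV) {
    my $file = shift;
    my $re   = build_filter($file);
    while (my $line = <STDIN>) {
        # Print only the lines that match none of the names.
        print $line unless defined $re and $line =~ $re;
    }
}
```

It would be run the same way as before, for example `perl script2.pl file2 < another-file`. Skipping empty lines matters because an empty alternative would match every input line and suppress all output.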

      Update: please note that some people get upset when you write "PERL." The language is called Perl, and the program that executes Perl code is called perl. Perl is not an acronym, so capitalizing its name makes it look like you're shouting.

        I didn't mean to shout Perl; it was more praise compared with what I have been using (Korn). I ran your program and it seems to do the opposite of what I expected. It should strip out of "file2" any lines that are contained in "another-file" and print only the lines that are not contained in "another-file". I also just found out that the book I bought a few months ago is not ideally suited for beginners. Do you have any suggestions for a good beginner's book? Thanks, RCP