Re: Removing duplicates

Update: You might want to read perldoc -q duplicate for a general discussion of using a hash to check for duplicates.
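For reference, the hash-based check that FAQ entry discusses can be sketched as below. This is a minimal illustration, not code from the FAQ itself; the helper name `uniq_in_order` and the sample values are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the items with later repeats dropped,
# keeping first-seen order. %seen counts how often each item has
# appeared; the post-increment is false (0) only on the first sighting.
sub uniq_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

print join(",", uniq_in_order(qw(n1 n2 n1 n3 n2))), "\n";   # prints n1,n2,n3
```

The first script below applies the same `$seen{...}++` test, just inline on each input line rather than through a helper.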

I know giving a complete solution is sometimes frowned upon here, but sometimes I just can't help myself. Here is the first script:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %nets;
    /^Net '([^']+)'/ 
        and not $nets{$1}++ 
        and print "$1\n" 
      while <STDIN>;

And the second:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Slurp;

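    my $file2 = shift
        or die "Usage: $0 file2 < another-file\n";
    # Build one alternation from the lines of file2; quotemeta keeps any
    # regex metacharacters in the net names literal.
    my $nets = join "|", map { chomp; quotemeta } read_file($file2);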

    /$nets/
        or print
      while <STDIN>;

And you can run the whole chain with something like:

    perl script1.pl < file1 > file2
    perl script2.pl file2 < another-file

Update: just for fun, here's a version that will do everything in one step:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Slurp;

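    my $file1 = shift
        or die "Usage: $0 file1 < another-file\n";
    # Pull the net name out of each "Net '...'" line and escape it so it
    # matches literally; lines without a net name contribute nothing.
    my $nets = join "|", map { chomp;
                               /^Net '([^']+)'/ ? quotemeta($1) : ()
                             } read_file($file1);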

    /$nets/
        or print
      while <STDIN>;

Which would be used like:

    perl script1+2.pl file1 < another-file


Re: Re: Removing duplicates by RCP (Acolyte) on Mar 03, 2004 at 11:50 UTC
Thanks for your help. One problem tho: my system (HP-UX) does not support "File::Slurp". Is there another approach to this step? Your first part did work great, can't wait to start PERL class! My shell scripts were taking minutes to do the things that PERL got done in mere seconds. Thanks again! RCP
Re: Re: Re: Removing duplicates by revdiablo (Prior) on Mar 03, 2004 at 19:28 UTC
another approach to this step?

Of course! :-) The idiomatic slurp goes something like this:

    my $data = do { local (@ARGV, $/) = $filename; <> };

You may see why I chose to use File::Slurp originally. To integrate that into the second snippet, it would be:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = shift
        or die "Usage: $0 file2 < another-file\n";
    my $file_contents = do { local (@ARGV, $/) = $file; <> };
    # The slurp returns the whole file as one string, so split it into
    # lines before joining; quotemeta keeps net names literal.
    my $nets = join "|", map { quotemeta } split /\n/, $file_contents;

    /$nets/
        or print
      while <STDIN>;

Note: this code is untested.

Update: please note that some people get upset when you write "PERL." The language is called Perl, and the program that executes Perl code is called `perl`. Perl is not an acronym, so capitalizing its name makes it look like you're shouting.
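If the `local (@ARGV, $/)` trick feels too terse, a plainer core-Perl variant is possible. The following is only a sketch under a few assumptions: the sub names `read_lines` and `build_pattern` are hypothetical, and escaping with `quotemeta` assumes the net names should match literally rather than as regex patterns.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a file into a list of chomped lines using
# only core Perl (three-argument open, no File::Slurp).
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Cannot open $path: $!\n";
    chomp(my @lines = <$fh>);
    close $fh;
    return @lines;
}

# Hypothetical helper: join the given strings into one alternation,
# escaping regex metacharacters so each string matches literally.
sub build_pattern {
    return join "|", map { quotemeta } @_;
}
```

With these in place, the filtering step would stay as before: `my $nets = build_pattern(read_lines($file));` followed by the `/$nets/ or print while <STDIN>;` loop.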
Re: Re: Re: Re: Removing duplicates by RCP (Acolyte) on Mar 04, 2004 at 12:33 UTC
Didn't mean to shout Perl; it was more praise compared with what I have been using (Korn). I ran your program and it seems to do the opposite of what was expected. "another-file" should strip out any lines in "file2" that are contained in "another-file", and print out only the lines that are not contained in "another-file". I just found out that the book I bought a few months ago was not ideally suited for beginners. Do you have any suggestions for a good beginner's book? Thanks, RCP