RCP has asked for the wisdom of the Perl Monks concerning the following question:

Got great help earlier, now have a couple of questions.

QUESTION #1

file1 data:

# This is a list of selected nets
Net 'GROUND' top side
Net 'GROUND' bottom side
Net '/ad1' top side
Net '/ad1' bottom side
Net '/VCCA' top side
Net '/VCCA' bottom side

a. I need to find all lines in file1 that contain the word "Net"

b. remove lines whose 2nd-column value has already been seen, and write the result to a file called "file2" that looks like this:

'GROUND'
'/ad1'
'/VCCA'

QUESTION #2

How can I take a file like "file2" and have perl read the column and strip out any lines in another file that are contained in the file2 listing?

While I have similar scripts written in UNIX (Korn), I'm trying to learn PERL's equivalent code.

Thanks... RCP

edit by thelenm: added tags.

Re: Removing duplicates
by revdiablo (Prior) on Feb 25, 2004 at 22:08 UTC

    Update: You might want to read perldoc -q duplicate for a general discussion of using a hash to check for duplicates.

    I know giving a complete solution is sometimes frowned upon here, but sometimes I just can't help myself. Here is the first script:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %nets;
    /^Net '([^']+)'/ and not $nets{$1}++ and print "$1\n" while <STDIN>;
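The one-liner above packs the match, the duplicate check, and the print into a single statement. Spelled out step by step it might look like the sketch below. The sub name `unique_nets` and the in-memory sample lines are made up for illustration; like the original script, it prints the names without their surrounding quotes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the distinct net names, in first-seen order.
sub unique_nets {
    my @lines = @_;
    my %seen;    # net name => number of times seen so far
    my @nets;
    for my $line (@lines) {
        # Only lines that start with "Net '<name>'" are of interest.
        next unless $line =~ /^Net '([^']+)'/;
        # Post-increment returns the old count, so the test is true
        # only the first time a given name appears.
        push @nets, $1 unless $seen{$1}++;
    }
    return @nets;
}

# Sample data copied from the question, held in memory instead of file1.
my @sample = (
    "# This is a list of selected nets\n",
    "Net 'GROUND' top side\n",
    "Net 'GROUND' bottom side\n",
    "Net '/ad1' top side\n",
    "Net '/ad1' bottom side\n",
    "Net '/VCCA' top side\n",
    "Net '/VCCA' bottom side\n",
);
print "$_\n" for unique_nets(@sample);
```

With the sample data this prints GROUND, /ad1 and /VCCA, one per line. To keep the quotes, move the capturing parentheses outside them in the pattern.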

    And the second:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Slurp;

    my $file2 = shift or die "Usage: $0 file2 < another-file";
    my $nets = join "|", map {chomp;$_} read_file($file2);
    /$nets/ or print while <STDIN>;

    And you can run the whole chain with something like:

    perl script1.pl < file1 > file2
    perl script2.pl file2 < another-file

    Update: just for fun, here's a version that will do everything in one step:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Slurp;

    my $file1 = shift or die "Usage: $0 file1 < another-file";
    my $nets = join "|", map { chomp; /^Net '([^']+)'/ and $1 or () } read_file($file1);
    /$nets/ or print while <STDIN>;

    Which would be used like:

    perl script1+2.pl file1 < another-file
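The scripts above interpolate the names straight into a pattern, so any regex metacharacter in a name is treated as a pattern, and a name also matches as a substring of a longer one. A hedged sketch of a stricter variant follows. It uses the builtin `quotemeta` to escape each name and assumes the names appear in single quotes in the other file, which may not hold for your data. The sub name `remove_listed` and the sample lines are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop every line that mentions one of the names.
sub remove_listed {
    my ($names, @lines) = @_;
    # quotemeta escapes regex metacharacters so each name matches literally.
    my @escaped = map { quotemeta($_) } grep { length($_) } @$names;
    return @lines unless @escaped;    # nothing listed, nothing to remove
    my $alternation = join '|', @escaped;
    # Assumption: names are wrapped in single quotes in the lines to filter.
    my $pattern = qr/'(?:$alternation)'/;
    return grep { !/$pattern/ } @lines;
}

my @names = ('GROUND', '/ad1');
my @lines = (
    "Net 'GROUND' top side\n",
    "Net 'GROUND2' top side\n",
    "Net '/ad1' bottom side\n",
    "Net '/VCCA' top side\n",
);
print remove_listed(\@names, @lines);
```

Here the GROUND2 and /VCCA lines survive, because the quotes in the pattern stop GROUND from matching inside GROUND2.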
      Thanks for your help. One problem though: my system (HP-UX) does not have File::Slurp. Is there another approach to this step? Your first part worked great. I can't wait to start PERL class! My shell scripts were taking minutes to do the things that PERL got done in mere seconds. Thanks again! RCP
        another approach to this step?

        Of course! :-)

        The idiomatic slurp goes something like this:

        my $data = do { local (@ARGV, $/) = $filename; <> };

        You may see why I chose to use File::Slurp originally. To integrate that into the second snippet, it would be:

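        #!/usr/bin/perl
        use strict;
        use warnings;

        my $file = shift or die "Usage: $0 file2 < another-file";
        my $file_contents = do { local (@ARGV, $/) = $file; <> };
        # The slurp returns one string, so split it into lines before joining;
        # a single chomp would leave newlines embedded in the pattern.
        my $nets = join "|", split /\n/, $file_contents;
        /$nets/ or print while <STDIN>;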

        Note: this code is untested.
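Where File::Slurp is not available, reading the file in list context is another option, and it gives one element per line rather than a single string. A minimal sketch, assuming a hypothetical helper named `read_lines`:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines of a file without trailing newlines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @lines = <$fh>;    # list context: one element per line
    close $fh;
    chomp @lines;         # chomp on an array strips every element
    return @lines;
}

# When run with a file name argument, echo that file's lines.
if (@ARGV) {
    print "$_\n" for read_lines($ARGV[0]);
}
```

The list it returns could then be passed to `join "|", ...` in place of the `read_file` call.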

        Update: please note that some people get upset when you write "PERL." The language is called Perl, and the program that executes Perl code is called perl. Perl is not an acronym, so capitalizing its name makes it look like you're shouting.

Re: Removing duplicates
by talexb (Chancellor) on Feb 25, 2004 at 20:07 UTC

    I'm finding it hard to visualize what you are trying to do. Can you modify your original post so that it's more obvious?

    To figure out which items are in one file but not in another, put the contents of each file into separate hashes and, using the keys from the first hash, check the second hash. It's pretty standard Perl Cookbook (published by O'Reilly) stuff.

    Alex / talexb / Toronto

    Life is short: get busy!

      I have a file "file1" that contains:

      dave
      bob
      rich
      jim

      I have a second file "file2" that contains:

      dave
      rich

      I need the "file2" listing to remove lines from "file1" to create a "file3" that contains only:

      bob
      jim

      I tried this code:

      #!/usr/bin/perl
      use strict;
      # use warnings;
      open(MYOUTFILE, ">file3");
      open(MYOUTFILE, ">>file3");
      my $file = shift or die "Usage: $0 file1 < file2";
      my $file_contents = do { local (@ARGV, $/) = $file; <> };
      my $nets = join "|", map {chomp;$_} $file_contents;
      /$nets/ or print MYOUTFILE while <STDIN>;
      close(MYOUTFILE);

      But it does not give me what my example of file3 should look like. Help! Thanks.. RCP

        From the Perl debugger session I just ran:

        [alex@rand alex]$ perl -de 1
        Loading DB routines from perl5db.pl version 1.19
        Editor support available.
        Enter h or `h h' for help, or `man perldebug' for more help.
        main::(-e:1): 1
        DB<1> @file1=qw/dave bob rich jim/;
        DB<2> @file2=qw/dave rich/;
        DB<3> foreach(@file1){$list{$_}=1;}
        DB<4> foreach(@file2){delete $list{$_};}
        DB<5> print "Remaining elements " . join("; ",keys %list) . "\n";
        Remaining elements jim; bob
        DB<6>
        You can fiddle with the 'list' hash as you read the file, so there's no need to use an array as the middle man.
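The same idea can be sketched outside the debugger. Because `keys` returns hash keys in no particular order, the version below puts the unwanted names in the hash and filters the original list, which keeps the remaining lines in their file1 order. The sub name `lines_not_in` is hypothetical, and the arrays stand in for the chomped contents of file1 and file2.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the entries of @$all that do not appear
# in @$unwanted, preserving the original order of @$all.
sub lines_not_in {
    my ($all, $unwanted) = @_;
    my %skip = map { $_ => 1 } @$unwanted;    # lookup table of names to drop
    return grep { !$skip{$_} } @$all;
}

my @file1 = qw(dave bob rich jim);
my @file2 = qw(dave rich);
print "$_\n" for lines_not_in(\@file1, \@file2);    # prints bob, then jim
```

Redirecting that output to a file, or printing to a filehandle opened on it, would produce file3.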

        Alex / talexb / Toronto

        Life is short: get busy!

        PS: When posting code, put it in between 'code' tags. That way it doesn't wrap like normal text.

Re: Removing duplicates
by delirium (Chaplain) on Feb 25, 2004 at 21:50 UTC
    Something like...

    my %hash = ();
    while (<>) {
        # [^']+ rather than \w+, so that names such as '/ad1' also match
        if (/Net '([^']+)'/) { $hash{$1} = 1; }
    }
    print "$_\n" for keys %hash;
    ...will get you through the first question.