Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I need some help here. My code checks a temp.txt file and finds strings like:
<17>11,2002,FORD ,ECONO CO, ,34567676xxxxx
<17>11,2002,FORD ,ECONO CO, ,34567676xxxxx

My problem is that it finds what I am looking for, but when the file contains duplicated strings I want to print each one only once, and my code prints it multiple times. How can I change my code so that it does not print duplicates of a found match?
Thank you!!!
$temp_file = "${curr_dir}work/$AoH{20}_p22.txt";
open(TEMP, "$temp_file") || print "Can't open temp file: $temp_file\n";
while(<TEMP>) {
    $_=~/<(.*?)>(.*)/;
    unless ($2 eq "*") {
        unless ($1 eq "20") {
            if ($1 eq "17") {
                my $hold=$2;
                $hold=~/^(\d\d),(.*?),(.*?),(.*?),(.*)/;
                $AoH{10}= $AoH{10}."<option value=\"$1\">$2,$3,$4 \n";
            }
            else { $AoH{$1}=$2;}
        }
    }
}
close TEMP;

Replies are listed 'Best First'.
Re: String Duplication
by dragonchild (Archbishop) on Jul 16, 2003 at 15:01 UTC
    My first instinct is to tell you to use split for the second parsing action.

    Now, a big question I have is what does "duplicate" mean? For example,

    <17>11,2002,FORD ,ECONO CO, ,34567676xxxxx
    <17>11,2002,FORD ,ECONO CO, ,34567676xxxxx
    Are those duplicate? If those are not duplicate strings, then:
    my %seen;
    while (<TEMP>) {
        next if $seen{$_}++;
        # Rest of loop here.
    }
    However, if those are considered duplicates, you have to do duplicate checking in your second parse statement. Do something similar with the %seen hash. :-)
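
    A minimal sketch of both suggestions together (split for the second parse, plus a %seen check there), assuming the duplicate test should cover only the fields that end up in the option tag. The sample records, the $options variable (standing in for $AoH{10}) and the field names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample records standing in for the lines of the temp file.
my @lines = (
    '<17>11,2002,FORD ,ECONO CO, ,34567676xxxxx',
    '<17>11,2002,FORD ,ECONO CO, ,34567676xxxxx',
    '<17>12,2003,FORD ,TAURUS, ,99999999xxxxx',
);

my %seen;            # field combinations already emitted
my $options = '';    # stands in for $AoH{10} in the original code

for my $line (@lines) {
    # Outer parse: tag in angle brackets, then the payload.
    my ($tag, $rest) = $line =~ /<(.*?)>(.*)/ or next;
    next unless $tag eq '17';

    # Second parse with split. The limit of 5 keeps everything after the
    # fourth comma in the last field, like the final (.*) in the regex.
    my ($id, $year, $make, $model, $tail) = split /,/, $rest, 5;
    next unless defined $tail;    # skip records with too few fields

    # Duplicate check on the fields that are actually printed.
    next if $seen{ join ',', $id, $year, $make, $model }++;

    $options .= qq{<option value="$id">$year,$make,$model \n};
}

print $options;
```

    The comma is a safe separator for the key here because split on commas guarantees the first four fields cannot contain one.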

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      Yes, they are duplicated strings in the txt file.
Re: String Duplication
by jmanning2k (Pilgrim) on Jul 16, 2003 at 15:09 UTC
    When you want to avoid duplicates, the automatic answer is 'use a hash'.

    Simply put some unique value (or the whole line) as a key for a hash. I'll let you decide what is an appropriate value for this key - it should be easily generated, and unique for each identical record.

    Do a:
    $hash{$key}++;
    when you find a value you want for the first time.

    Then, skip that line when you encounter the same key again. Put:
    next if(exists $hash{$key});
    at the top of your loop.
      How could I add that to the code, comparing against what I am printing with the $1, $2, $3 variables?
        Simple.
        $key = $1 . $2 . $3;
        With that and the code in my previous post, you won't print the same $1 $2 $3 combination twice.
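
        Put together, it could look roughly like the sketch below. Two things differ from the snippets above and are my own assumptions: the exists check sits right after the match, because the key is not known before $1..$4 are set, and the captures are joined with a comma so that "ab" . "c" and "a" . "bc" cannot produce the same key. The @records and @printed arrays are hypothetical stand-ins for the file and for $AoH{10}:

```perl
use strict;
use warnings;

my %hash;       # keys seen so far
my @printed;    # collected output, for illustration only

# Hypothetical sample payloads (the part after the <17> tag).
my @records = (
    '11,2002,FORD ,ECONO CO, ,34567676xxxxx',
    '11,2002,FORD ,ECONO CO, ,34567676xxxxx',
    '12,2002,FORD ,ECONO CO, ,34567676xxxxx',
);

for my $hold (@records) {
    # Same pattern as the original second parse. Skipping on failure means
    # stale $1..$4 from an earlier successful match are never reused.
    next unless $hold =~ /^(\d\d),(.*?),(.*?),(.*?),(.*)/;

    # Build the key from exactly the captures that get printed.
    my $key = join ',', $1, $2, $3, $4;

    next if exists $hash{$key};    # this combination was already printed
    $hash{$key}++;                 # remember it for later lines

    push @printed, qq{<option value="$1">$2,$3,$4 \n};
}

print @printed;
```

        Only the first and third records are printed; the second is skipped because its key is already in %hash.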