String Duplication

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I need some help here. My code reads a temp.txt file and finds strings like:

<17>11,2002,FORD    ,ECONO CO,        ,34567676xxxxx   
<17>11,2002,FORD    ,ECONO CO,        ,34567676xxxxx

My problem: the code finds what I am looking for, but when the file contains duplicate strings I want to print each one only once, and my code prints it multiple times. How can I change my code so that it does not print duplicates of a match?
Thank you!!!

$temp_file = "${curr_dir}work/$AoH{20}_p22.txt";

open(TEMP, "$temp_file") || print "Can't open temp file: $temp_file\n";
while(<TEMP>) {
    $_=~/<(.*?)>(.*)/;

    unless ($2 eq "*") {
        unless ($1 eq "20") {
            if ($1 eq "17") {
                my $hold=$2;
                $hold=~/^(\d\d),(.*?),(.*?),(.*?),(.*)/;
                $AoH{10}= $AoH{10}."<option value=\"$1\">$2,$3,$4 \n";
            }
            else {
                $AoH{$1}=$2;
            }
        }
    }
}
close TEMP;


Re: String Duplication by dragonchild (Archbishop) on Jul 16, 2003 at 15:01 UTC
My first instinct is to tell you to use split for the second parsing action. Now, a big question I have is: what does "duplicate" mean? For example:

    <17>11,2002,FORD ,ECONO CO, ,34567676xxxxx
    <17>11,2002,FORD ,ECONO CO, ,34567676xxxxx

Are those duplicates? If those are not duplicate strings, then:

    my %seen;
    while (<TEMP>) {
        next if $seen{$_}++;
        # Rest of loop here.
    }

However, if those are considered duplicates, you have to do the duplicate checking in your second parse statement. Do something similar with the %seen hash. :-)

------
We are the carpenters and bricklayers of the Information Age.
Don't go borrowing trouble. For programmers, this means: worry only about what you need to implement.
Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
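A minimal sketch of the whole-line `%seen` check combined with `split`, for anyone following along. It assumes that trailing whitespace should not make two records differ (in the question's sample, the first line ends in spaces and the second does not, so a raw `$seen{$_}++` would treat them as different). The sub name `unique_records` and the whitespace normalization are illustrative assumptions, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the input lines with duplicates removed,
# keeping the first occurrence of each.
sub unique_records {
    my @lines = @_;
    my (%seen, @unique);
    for my $line (@lines) {
        # Assumption: trailing whitespace (including the newline) is not significant.
        (my $key = $line) =~ s/\s+\z//;
        next if $seen{$key}++;    # true on the second and later sightings
        push @unique, $key;
    }
    return @unique;
}

my @sample = (
    "<17>11,2002,FORD    ,ECONO CO,        ,34567676xxxxx   \n",
    "<17>11,2002,FORD    ,ECONO CO,        ,34567676xxxxx\n",
);

for my $record (unique_records(@sample)) {
    # Pull out the <tag> and the data; skip anything that does not match.
    my ($tag, $rest) = $record =~ /^<(.*?)>(.*)/ or next;
    # split stands in for the second regex; the limit of 5 keeps the remainder in one field.
    my ($id, $year, $make, $model, $tail) = split /,/, $rest, 5;
    print "$id: $year,$make,$model\n";
}
```

With the two sample lines above, only one record survives the filter, so the loop body runs once.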
Re: Re: String Duplication by Anonymous Monk on Jul 16, 2003 at 15:10 UTC
Yes, they are duplicate strings in the txt file.
Re: String Duplication by jmanning2k (Pilgrim) on Jul 16, 2003 at 15:09 UTC
When you want to avoid duplicates, the automatic answer is 'use a hash'. Simply use some unique value (or the whole line) as a hash key. I'll let you decide what an appropriate value for this key is; it should be easy to generate, and identical records should always produce the same key. Do a `$hash{$key}++;` when you find a value you want for the first time. Then skip the line when you encounter the same key again by putting `next if(exists $hash{$key});` at the top of your loop.
Re: Re: String Duplication by Anonymous Monk on Jul 16, 2003 at 15:49 UTC
How could I add that to the code, checking against what I am printing with the $1, $2, $3 variables?
Re: Re: Re: String Duplication by jmanning2k (Pilgrim) on Jul 16, 2003 at 17:30 UTC
Simple. `$key = $1 . $2 . $3;` With that and the code in my previous post, you won't print the same $1 $2 $3 combination twice.
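Putting the pieces of this thread together, the loop might look roughly like the sketch below. It reads from an in-memory list instead of `TEMP` so that it runs on its own, and `parse_records` is a hypothetical name. Two things differ from what was posted and are my assumptions: the key is built from all four printed fields, and the fields are joined with "\0" rather than plainly concatenated, so that combinations such as ("ab", "c") and ("a", "bc") cannot collapse into the same key. `split` also stands in for the second regex here, so unlike the original it does not insist on a two-digit first field.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the question's loop; returns an %AoH-style hash.
sub parse_records {
    my @lines = @_;
    my (%AoH, %seen);
    $AoH{10} = '';
    for my $line (@lines) {
        # Skip lines that do not look like <tag>data.
        my ($tag, $data) = $line =~ /<(.*?)>(.*)/ or next;
        next if $data eq '*' or $tag eq '20';
        if ($tag eq '17') {
            my ($id, $year, $make, $model) = split /,/, $data, 5;
            next unless defined $model;    # malformed record: fewer than four fields
            # Key built from exactly the fields that end up in the output.
            my $key = join "\0", $id, $year, $make, $model;
            next if $seen{$key}++;         # this combination was already emitted
            $AoH{10} .= "<option value=\"$id\">$year,$make,$model \n";
        }
        else {
            $AoH{$tag} = $data;
        }
    }
    return %AoH;
}

my %result = parse_records(
    "<17>11,2002,FORD    ,ECONO CO,        ,34567676xxxxx   \n",
    "<17>11,2002,FORD    ,ECONO CO,        ,34567676xxxxx\n",
    "<30>some value\n",
);
print $result{10};
```

Because the key ignores the trailing field, the two sample lines count as one record even though their trailing whitespace differs.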