yacoubean has asked for the wisdom of the Perl Monks concerning the following question:

I was able to fix this bug, but I don't know why it was a bug.
I am finding all the ColdFusion custom tags in my site, and they are formatted like <cf_capitalize> or <cf_confirmbutton Tabindex=98 name="Next">. My regex is returning the stuff after '<cf_' and before any whitespace or '>'. Then I need to append '.cfm' to the returned string. My code below was causing weird things to happen in the hash. It was creating 2 hash keys for this value, and they looked like this:
capitalize
capitalize.cfm1
while ($_ =~ /(<cf_)(.*?)[>\s]/gi) { # finding custom tags
    $uniques{"$2.cfm"}++;
    $uniques{$2} = "<cust>";
}
I was able to fix it by removing the .cfm in the above code, like so:
while ($_ =~ /(<cf_)(.*?)[>\s]/gi) { # finding custom tags
    $uniques{"$2"}++;
    $uniques{$2} = "<cust>";
}
Then later I append the '.cfm'. Any ideas as to why this was happening to me?

Re: Two hash keys created when '.' is in the string
by Mutant (Priest) on Nov 24, 2004 at 18:04 UTC
    The line:
    $uniques{"$2.cfm"}++;

    Increments the value stored under hash key "$2.cfm". If you haven't set it to anything, it'll end up with a value of 1 (0 + 1 = 1) :)

    You then set the value of hash key "$2" to '<cust>'.
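To make the two-key effect concrete, here is a minimal sketch that runs the original loop against a single made-up input line (the `$line` value is an assumption for illustration, not data from the original post):

```perl
use strict;
use warnings;

my %uniques;
my $line = '<cf_capitalize>';    # hypothetical sample input

while ($line =~ /(<cf_)(.*?)[>\s]/gi) {
    $uniques{"$2.cfm"}++;        # creates key "capitalize.cfm" with value 1
    $uniques{$2} = "<cust>";     # creates a second, different key "capitalize"
}

# Print every key with its value to show that two entries exist.
for my $key (sort keys %uniques) {
    print "$key => $uniques{$key}\n";
}
```

The hash ends up with two entries because "capitalize.cfm" and "capitalize" are different strings, and therefore different keys.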

    In your modified version, the line:

    $uniques{"$2"}++;
    Is redundant, since you're overwriting the value on the next line. I think you're looking for something like this:
    while ($_ =~ /(<cf_)(.*?)[>\s]/gi) { # finding custom tags
        $uniques{$2}++;
    }

    Then you can print out a list of how many times each custom tag is being called with something like this:

    while (my ($code, $count) = each %uniques) {
        print "$code => $count\n";
    }
    Or, do whatever else you want with the results...
      Oh! Now I feel dumb. See, the second line is there because I need to flag the value as a custom tag, because I am also returning href=, action=, and other 'links'. But I was being really dumb, not realizing that I was creating 2 different keys, one named '$2.cfm' and the other '$2'. <smacks head in disbelief> :)
      If you're curious, here's one of my other sections that worked correctly:
      while ($_ =~ /(href)[ ="']+(.*?)["'>|\?]/gi) { # finding linked files
          if (($2 =~ /css/i)|($2 eq $cfm)) {
              # skipping css files and links to self
          }
          else {
              $uniques{"$2"}++;
              $uniques{$2} = "<norm>";
          }
      }
        I doubt that actually is working. A few problems:
        • You shouldn't ever need to use "$2". It's just the same thing as $2.
        • So you're still assigning two values to the same hash key. The second one overwrites the first, so that $uniques{"$2"}++; line does nothing.
        • Also, | is the bitwise or operator. You want || or or.
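        • And finally, a successful regexp match resets all the numbered capture variables (for instance, $2), so once $2 =~ /css/i matches, $2 will be undef. (A failed match leaves them as they were.) Copy $2 into a variable of your own before running another match.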
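One way the points above might be combined is sketched below. It is only an illustration: the sample `$line`, the value of `$cfm`, and the nested `count`/`type` layout are assumptions, not something from the original code. The nested hash is one possible way to keep both a count and a flag under the same key without one overwriting the other.

```perl
use strict;
use warnings;

my %uniques;
my $cfm  = 'index.cfm';    # hypothetical name of the file being scanned
my $line = '<a href="style.css">x</a> <a href="index.cfm">y</a> '
         . '<a href="page.cfm">z</a> <a href="page.cfm?id=1">w</a>';

while ($line =~ /(href)[ ="']+(.*?)["'>|\?]/gi) {   # finding linked files
    my $link = $2;    # copy the capture before running any other match

    # Logical || instead of bitwise |; skip css files and links to self.
    next if $link =~ /css/i || $link eq $cfm;

    $uniques{$link}{count}++;            # how many times the link was seen
    $uniques{$link}{type} = "<norm>";    # flag stored alongside the count
}

for my $link (sort keys %uniques) {
    print "$link => $uniques{$link}{count} ($uniques{$link}{type})\n";
}
```

With this sample input only page.cfm survives the filter, with a count of 2. The same pattern would apply to the custom-tag loop, using "$2.cfm" copied into a variable as the key and "<cust>" as the type.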