re-useing a hash

by ministry (Scribe)
on Apr 20, 2005 at 15:43 UTC ( [id://449633]=perlquestion )

ministry has asked for the wisdom of the Perl Monks concerning the following question:


my issue today is with re-initializing a hash. in the following code (i posted all of it to minimize confusion) i use a hash as a counter, incrementing it every time a specific pattern matches. the code looks through a firewall config file and the same firewall's logfile for rule numbers. the config file yields every rule listed on the firewall, while the logfile search yields a total count for each rule it finds.

($rule)="$1"; $hash{$rule}=($hash{$rule}+1);
this is the only way i know to increment a count for a key that may not exist yet. i have not found a way to properly clear this hash before reusing it to search the next firewall's config/log files. searching the web, the only thing i have found is %hash=(), however this does not seem to eliminate each key/value pair from my hash.
#!/usr/bin/perl
$reportdir="/report/csv";
@basedirs=qw</log /log2>;
foreach $starting_point (@basedirs){
    &recursion("$starting_point");
}
sub recursion {
    my $dir=shift;
    opendir(DIR, $dir) or return;
    my @contents=map "$dir/$_",sort grep !/^\.\.?$/,readdir DIR;
    closedir DIR;
    foreach (@contents) {
        next unless !-l && -e;
        &recursion($_);
        next if -d | /gz/ | ! /20050412/;
        if (/12\/(\w+(-\w+)??)(-\d)??-20050412/){push(@list,$1);}
    }
}
sub seek{
    my $directory=shift;
    opendir(LIFT, $directory) or return;
    my @closed=map "$directory/$_",sort grep !/^\.\.?$/,readdir LIFT;
    closedir LIFT;
    foreach (@closed){
        next unless !-l && -e;
        &seek($_);
        next if -d | ! /$indiv/;
        next if ! /20050412/;
        push(@cleaned,$_);
    }
}
sub gather{
    `tar xvf /var/config/$uniqu-config.20050412`;
    $config_fw="";
    open (GATEWAY,"$config_fw") || die "cant open gateway:$!";
    while (<GATEWAY>){
        if (/\#(\d+)/) {
            $rule="$1";
            $hash{$rule}="0";
        }
    }
    close GATEWAY;
    unlink ($config_fw);
}
%seen=();
@uniqu=grep { ! $seen{$_} ++ } @list;
foreach $uniqu (@uniqu){
    %hash=();
    if ($uniqu=~/tias/){$tune="/log";$indiv="$uniqu-";}
    else {$tune="/log2";$indiv="$uniqu-";}
    open(REPORT,">$reportdir/$uniqu-20050412");
    &seek($tune);
    &gather;
    for $i (@cleaned){
        open (FH,"$i");
        while (<FH>){
            if (/rule\=(\d+)/){
                ($rule)="$1";
                $hash{$rule}=($hash{$rule}+1);
            }
        }
    }
    foreach $rule (sort{$a<=>$b} keys(%hash)){
        print REPORT "$uniqu,12/04/2005,$rule,$hash{$rule}\n";
    }
    close REPORT;
    close FH;
}
any help or advice (aside from suggesting i re-write the entire code) would be greatly appreciated.
regards, ev

Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.

Replies are listed 'Best First'.
Re: re-useing a hash
by Scarborough (Hermit) on Apr 20, 2005 at 15:53 UTC
    # You can increase a count by
    $hash{$rule}++;
    # and empty by
    undef %hash;
Re: re-useing a hash
by isotope (Deacon) on Apr 20, 2005 at 15:53 UTC
    1. Localize the hash to your foreach loop (%hash = (); becomes my %hash;).
    2. Remove the reference to your hash from sub gather completely.
    3. Increment your hash counter like this instead: $hash{$rule}++;
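
A minimal sketch of the three steps above, assuming made-up log lines in a hypothetical %logs structure (neither is from the original script). The hash is declared with my inside the loop, no subroutine touches it, and the counter uses ++, which treats a missing key as 0. The %report hash exists only so the results can be inspected afterwards.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for each firewall's log lines.
my %logs = (
    fw_a => [ 'rule=1', 'rule=2', 'rule=1' ],
    fw_b => [ 'rule=7' ],
);

my %report;   # illustration only: a copy of each firewall's counts
foreach my $fw (sort keys %logs) {
    my %hash;                                   # step 1: new, empty hash each iteration
    foreach my $line (@{ $logs{$fw} }) {
        $hash{$1}++ if $line =~ /rule=(\d+)/;   # step 3: undef counts as 0
    }
    foreach my $rule (sort { $a <=> $b } keys %hash) {
        print "$fw,$rule,$hash{$rule}\n";
    }
    $report{$fw} = { %hash };                   # shallow copy before %hash goes out of scope
}
```

With this sample data, fw_b reports only rule 7; nothing from fw_a carries over.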

Re: re-useing a hash
by brian_d_foy (Abbot) on Apr 20, 2005 at 16:46 UTC

    It looks like you want a lexical variable. Use my() to make the variable private to the block. With each iteration, you get a different hash (and at the end, throw the current one away).

    foreach $uniqu ( ... ){ my %hash; ...

    However, what you have now should be clearing out the hash. Setting it to the empty list shouldn't leave anything in it. There might be something else going on if you are seeing odd results.
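
A quick way to check that claim, using a throwaway hash whose keys and values are made up for illustration: after assigning the empty list, no keys remain.

```perl
use strict;
use warnings;

my %hash = ( 10 => 3, 20 => 5 );   # made-up rule counts
my $before = keys %hash;           # keys in scalar context gives the number of keys

%hash = ();                        # assign the empty list
my $after = keys %hash;            # 0: every key/value pair is gone

print "before=$before after=$after\n";
```

If counts still appear to carry over after a clear like this, the leftover state is probably coming from somewhere other than the hash itself.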

    Good luck!

    brian d foy
      thanks for the info brian_d_foy. however, after taking your advice and making my hash a lexical i am still unable to re-use my hash each time i loop through my foreach $uniqu ... statement. i also took the advice given by isotope and Scarborough and made my counter look like $hash{$rule}++ and also tried to use undef %hash; at the end of my loop, but still with no luck. here is what my loop looks like now:
      foreach $uniqu (@uniqu){ my %hash; ...(process each key/value)... undef %hash; }
      it still seems as though every time i go through my foreach loop, the clearing of my %hash is not taking place. any other suggestions?

      Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
        Then methinks you have some other bug making you think the hash isn't empty. Actually, blindly following the advice given so far has caused some additional problems, namely scoping issues.

        Let's examine a quick example:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use vars qw(%hash);

        sub foo {
            $hash{$_[0]}+=3;
        }

        sub bar {
            $hash{$_[0]}++;
            foo($_[0]);
        }

        foreach my $i (0..10) {
            my %hash;
            $hash{$i} = $i*3;
            bar($i);
            foo($i);
            foreach my $j (sort keys %hash) {
                print "hash{$j} = $hash{$j}\n";
            }
        }
        print "done\n";
        foreach my $j (sort keys %hash) {
            print "hash{$j} = $hash{$j}\n";
        }
        Ok, it looks a bit convoluted, definitely contrived, but hey, it's just an example. Running the program through our brains: each time through the loop we create a hash with a single key x (0 to 10) and set its value to 3*x, then call bar() using that index. bar() increments that hash element by one, then calls foo(), which adds 3 more, so the value of $hash{0} at this point should be 4. Then we call foo() again, so we're up to 7. $hash{1} should similarly get a value of 3+1+3+3=10, etc.

        Oh, and each time through we should end up with a new empty hash too, so if we see repeaters, then the hash is sticking around.

        Run the code and we get this output:

        hash{0} = 0
        hash{1} = 3
        hash{2} = 6
        hash{3} = 9
        hash{4} = 12
        hash{5} = 15
        hash{6} = 18
        hash{7} = 21
        hash{8} = 24
        hash{9} = 27
        hash{10} = 30
        done
        hash{0} = 7
        hash{1} = 7
        hash{10} = 7
        hash{2} = 7
        hash{3} = 7
        hash{4} = 7
        hash{5} = 7
        hash{6} = 7
        hash{7} = 7
        hash{8} = 7
        hash{9} = 7
        Well, that wasn't what we expected, was it? We don't have any repeaters, so we are getting a new hash, but the values aren't correct. And why do we have a %hash with values in it to print at the end?

        Scoping is causing issues for you. Neither foo() nor bar() knows anything about the private %hash we created with the "my %hash;" line within the foreach loop. bar() creates the entry in the global %hash when you call it, and foo() likewise acts on that global %hash. That is why the values printed inside the loop are only the index multiplied by three, and why the global %hash entries all have a value of 7 once you're out of the foreach loop.

        So, how do you fix the problem? Either don't use "my %hash;", and continue to use a global %hash variable (not really recommended, but it works) or start passing a reference to the hash to the subroutines that need to access it. If you don't use "my %hash", then the "undef %hash;" line will be needed at the end of your foreach loop.
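
The second option can be sketched like this, reusing the names foo() and bar() from the example above but changing them to take a hash reference as their first argument. That signature is an assumption for illustration, not code from the thread, and @results exists only so the values can be checked afterwards.

```perl
use strict;
use warnings;

# Both subs receive the hash by reference instead of reaching for a global.
sub foo {
    my ($h, $key) = @_;
    $h->{$key} += 3;
}

sub bar {
    my ($h, $key) = @_;
    $h->{$key}++;
    foo($h, $key);
}

my @results;   # illustration only: a copy of each iteration's hash
foreach my $i (0 .. 2) {
    my %hash;                 # new lexical hash each iteration
    $hash{$i} = $i * 3;
    bar(\%hash, $i);          # \%hash passes a reference to this iteration's hash
    foo(\%hash, $i);
    foreach my $j (sort keys %hash) {
        print "hash{$j} = $hash{$j}\n";
    }
    push @results, { %hash };
}
```

Run as written, this prints hash{0} = 7, hash{1} = 10 and hash{2} = 13, one line per iteration, which matches the values worked out by hand earlier.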

        Update: A "my %hash" only helps here if you also pass a reference to it (\%hash) into the subroutines. A lexical doesn't live in the symbol table the way a package variable does, so foo() and bar() can't reach it by name. If you'd rather not change the subs, keep the global and just add the "undef %hash;" at the end of the foreach loop, and it should start working more correctly than it currently does.


        Update: Fixed various typos... Update 2: Made the explanation a bit more clear, and fixed the "reference" idea

        Did you follow step 2 of my advice?

