Angharad has asked for the wisdom of the Perl Monks concerning the following question:

I don't think my Perl skills are that bad these days, but I have a terrible mental block when it comes to hashes. I want to do something that I know can only sensibly be done with a hash.
This isn't the actual problem - but it explains what I want to do. I just need a kick in the right direction so that I can implement this in my own script now and others in the future.
Imagine I have a text file that looks like this.
HIT: code2
HIT: code3
HIT: code1
HIT: code90
HIT: code2
HIT: code34
HIT: code90
I would like to create a hash so that when, for example, code90 is first hit, it is 'counted' (i.e. a variable is increased by one) and stored in the hash. As I am only interested in whether code90 exists, rather than how many times it occurs, if I encounter it again I need to check the hash to see whether I have already counted it. If I have, I want to ignore that entry and move on to the next one in the text file.
How do I go about doing this please?
Any help much appreciated.

Re: help on how to create a hash look up table requested.
by Fletch (Bishop) on Aug 10, 2006 at 14:10 UTC

    Erm, so just do it. You've said what you want, write the code.

    my %hits;
    while( <> ) {
        my( $key ) = /HIT:\s+(\S+)/;
        $hits{ $key }++;
    }
    my @distinct = keys %hits;

    Now you didn't say so, but if you want to preserve the order things appear in the input just push onto a list and use the hash to tell what you've already seen.

    my @hits;
    my %seen;
    while( <> ) {
        my( $key ) = /HIT:\s+(\S+)/;
        next if $seen{ $key }++;
        push @hits, $key;
    }
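
    As a minimal sketch of the order-preserving approach above, the following wraps the loop in a sub so it can be tried without an input file. The sub name distinct_hits and the sample lines are assumptions made for illustration, and the "or next" guard for lines that do not match is an addition that is not in the reply.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the distinct codes
# found in a list of lines, in order of first appearance.
sub distinct_hits {
    my @lines = @_;
    my ( @hits, %seen );
    for my $line (@lines) {
        # Skip any line that does not look like "HIT: <code>".
        my ($key) = $line =~ /HIT:\s+(\S+)/ or next;
        # $seen{$key}++ is false only the first time a key is met.
        next if $seen{$key}++;
        push @hits, $key;
    }
    return @hits;
}

# Sample input modelled on the question, plus one line that is not a hit.
my @sample = map { "HIT: $_\n" } qw(code2 code3 code1 code90 code2 code34 code90);
push @sample, "not a hit line\n";

my @hits = distinct_hits(@sample);
print join( ', ', @hits ), "\n";
print scalar(@hits), " distinct codes\n";
```

    The number of distinct codes is just the length of the list, so no separate counter variable is needed.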
Re: help on how to create a hash look up table requested.
by McDarren (Abbot) on Aug 10, 2006 at 14:15 UTC
    Maybe something like this?
    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper::Simple;

    my %codes;
    while (<DATA>) {
        chomp;
        my $foo = (split /:/, $_)[1];
        if (!$codes{$foo}) {
            $codes{$foo}++;
        }
    }
    print Dumper(%codes);
    __DATA__
    HIT: code2
    HIT: code3
    HIT: code1
    HIT: code90
    HIT: code2
    HIT: code34
    HIT: code90

    Which prints:

    %codes = (
        ' code90' => 1,
        ' code2' => 1,
        ' code3' => 1,
        ' code2 ' => 1,
        ' code1' => 1,
        ' code34' => 1
    );

    Actually, you don't really need the "if" block inside the while loop. You could just increment each hash value regardless. That won't give you any duplicates - it will just mean that some have a higher numerical value. Which (if I understand correctly) shouldn't make any difference.

    Hope this helps,
    Darren :)

      Minor nit, but if you amend the split slightly to my $foo = (split /:\s+/, $_)[1]; you can eliminate the leading space from the hash keys. That space might trip you up later if you are expecting to be able to do something like if ( exists $codes{q{code90}} ) { ... }.

      Cheers,

      JohnGG
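
      Putting the two suggestions above together (incrementing unconditionally and splitting on /:\s+/) might look something like the sketch below. The sub name count_codes is hypothetical, and stripping trailing whitespace from the key is an extra assumption prompted by the ' code2 ' key in the sample output, not something either reply proposed.

```perl
use strict;
use warnings;

# Hypothetical helper: build a lookup hash from "HIT: <code>" lines.
sub count_codes {
    my @lines = @_;
    my %codes;
    for my $line (@lines) {
        # Splitting on colon-plus-whitespace drops the leading space.
        my $code = ( split /:\s+/, $line )[1];
        next unless defined $code;    # line had no "HIT: <code>" shape
        # Assumption: trailing whitespace (including the newline) is
        # not part of the code, so remove it before using it as a key.
        $code =~ s/\s+\z//;
        $codes{$code}++;              # no "if" needed; keys stay unique
    }
    return %codes;
}

my %codes = count_codes( "HIT: code2\n", "HIT: code90\n", "HIT: code2 \n" );
print "$_ => $codes{$_}\n" for sort keys %codes;
print "code90 was seen\n" if exists $codes{code90};
```

      With clean keys, a plain exists $codes{code90} check behaves as expected.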

      ... while(<DATA>){ /:\s(\S+)$/ => $1 and $code{$1}++ } ...


Re: help on how to create a hash look up table requested.
by GrandFather (Saint) on Aug 10, 2006 at 22:28 UTC

    Looks like a slightly bigger question there than has been answered so far. Consider this:

    use warnings;
    use strict;

    my %codeData;
    my $skipping = 0;
    my $currCode;

    while (<DATA>) {
        if (/^HIT: (\S+)/) {
            $currCode = $1;
            $skipping = exists $codeData{$currCode};
            next;
        }
        next if $skipping;
        $codeData{$currCode} .= $_;
    }

    print $codeData{$_} for keys %codeData;
    __DATA__
    HIT: code2
    stuff for code2
    HIT: code3
    stuff for code3
    HIT: code1
    stuff for code1
    HIT: code90
    stuff for code90
    HIT: code2
    extra stuff for code2 - ignore this
    HIT: code34
    stuff for code34
    HIT: code90
    and extra stuff for code90 - ignore this

    Prints:

    stuff for code90
    stuff for code3
    stuff for code2
    stuff for code34
    stuff for code1

    Note in particular the use of exists to see if there is an element in the hash already. That frees you to store interesting stuff there rather than an explicit "found it" flag.
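
    To illustrate why exists matters here, the small sketch below compares an exists test with a plain truth test on the stored value. The sub name status and the sample hash contents are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a key exists in the hash and
# whether its stored value is true.
sub status {
    my ( $data, $code ) = @_;
    my $exists = exists $data->{$code} ? 'yes' : 'no';
    my $true   = $data->{$code}        ? 'yes' : 'no';
    return "exists=$exists true=$true";
}

# code3 is present but holds an empty string, which is false in Perl.
my %code_data = ( code2 => "stuff for code2\n", code3 => '' );

print "$_: ", status( \%code_data, $_ ), "\n" for qw(code2 code3 code99);
```

    A key whose value is an empty string or 0 still counts as present under exists, whereas a truth test on the value would treat it as missing.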


    DWIM is Perl's answer to Gödel
Re: help on how to create a hash look up table requested.
by Angharad (Pilgrim) on Aug 10, 2006 at 15:19 UTC
    thanks for the help. much appreciated