ExReg has asked for the wisdom of the Perl Monks concerning the following question:

As a follow-up to 1161099, I have an array of stuff that I have extracted from a big file using regexes; call it @excerpts. Each array entry is then processed with regexes to get additional pieces of information. I would like to store those pieces of information in that same array, making a sort of array of hashes.

In the example I gave, my file contents are

$fc = 'abcdfoofrobnicatebardefforspambazghi';

I get my @excerpts array filled with

$re2 = qr/(fo.)(.*?)(ba.)/;
push @excerpts, $1 while $fc =~ /($re2)/g;

I now have the @excerpts array

0:foofrobnicatebar
1:forspambaz

I now use the same $re2 regex to get

0:$1='foo' $3='bar'
1:$1='for' $3='baz'

I would like to add these to the array, and make a sort of array of hashes. I would like to add {fpart} for $1 and {bpart} for $3 to each entry in @excerpts. The problem I have is that the only way I have found to do that is so ugly that I am afraid I will have bad Perl nightmares this weekend. The only way I have found to add or read the hash values is

$%{$excerpts[$i]}{fpart} = $1;
$%{$excerpts[$i]}{bpart} = $3;

They can be read back with the exact same expression. This construct just looks so unnatural that I am afraid it will invoke some unnatural activity from my PC when I have my back turned. None of the normal AoH notation worked in this case. Is there a better expression?

Replies are listed 'Best First'.
Re: Adding hashes to already existing array
by Marshall (Canon) on May 06, 2016 at 20:00 UTC
    I am not completely sure that I understand what you are trying to do. Some of this looks overly complex. Why wouldn't you just use an Array of Arrays with all 3 parts that you need? Why does there need to be a hash at all? A simple modification to your @excerpts creation code could look like this:

    use warnings;
    use Data::Dumper;

    $fc = 'abcdfoofrobnicatebardefforspambazghi';
    $re2 = qr/(fo.)(.*?)(ba.)/;
    my @excerpts;
    push @excerpts, [$1,$2,$3] while $fc =~ /$re2/g;
    print Dumper \@excerpts;

    # Simple iteration over the Array of Arrays (AoA):
    # for each row, break this down into names that make sense
    # for your data, whatever they are. Use those names
    # instead of indices to process the data.
    foreach my $rowref (@excerpts) {
        my($namea,$nameb,$namec) = @$rowref;
        print "first=$namea second=$nameb third=$namec\n";
    }
    __END__
    $VAR1 = [
              [
                'foo',
                'frobnicate',
                'bar'
              ],
              [
                'for',
                'spam',
                'baz'
              ]
            ];
    first=foo second=frobnicate third=bar
    first=for second=spam third=baz

    Echoing stevieb's comment, it would help if you could manually create an example of the structure that you are trying to build. Or maybe back up a bit and explain how you intend to use the structure and we can suggest some possibilities?

    Update: added how to use AoA to the code above.

      The reason I do not use an AoA is that I have several different, already existing arrays created similarly to @excerpts. Each array has a different number of things being extracted, but many of them share similar sub-searches. That is, I have @excerpts1, @excerpts2, ... and the regexes $re1, $re2, ... have some (fo.) and (ba.) matching expressions in common.

      Not sure that makes sense, but different arrays would have different numbers of sub-arrays, and I would have trouble keeping them straight if I didn't use hash keys.

      Sorry, it's a Friday, and my brain has already been home for several hours now.
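
If several regexes share the same sub-searches, named capture groups might be one way to get the hash keys for free. The following is only a sketch, assuming Perl 5.10 or later (for the `%+` hash); the key names fpart and bpart follow the question, while "middle" is a hypothetical name made up for the part in between.

```perl
use strict;
use warnings;

my $fc = 'abcdfoofrobnicatebardefforspambazghi';

# Named groups: fpart/bpart follow the question; "middle" is a
# hypothetical name chosen for this sketch only.
my $re2 = qr/(?<fpart>fo.)(?<middle>.*?)(?<bpart>ba.)/;

my @excerpts;
while ( $fc =~ /$re2/g ) {
    # %+ holds the named captures of the last successful match.
    # Copy it into a fresh anonymous hash, because the next match
    # overwrites it.
    push @excerpts, +{ %+ };
}

for my $excerpt (@excerpts) {
    print "fpart=$excerpt->{fpart} bpart=$excerpt->{bpart}\n";
}
```

Each array built this way would carry the same key names wherever its regex uses the same group names, regardless of how many groups the regex has.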

Re: Adding hashes to already existing array
by stevieb (Canon) on May 06, 2016 at 17:50 UTC

    Could you please show us, in pseudo-code if not an actual data structure in Data::Dumper style format, a snippet of exactly what you want your structure to look like?

      $fc = 'abcdfoofrobnicatebardefforspambazghi';
      $re2 = qr/(fo.)(.*?)(ba.)/;
      push @excerpts, $1 while $fc =~ /($re2)/g;
      print "0: $excerpts[0];
      print "0: $excerpts[0];
      for my $i ( 0 .. 1 ) {
          $excerpts[i] =~ /$re2/;
          $%{$excerpts[$i]}{fpart} = $1;
          $%{$excerpts[$i]}{bpart} = $3;
      }
      print "0{fpart}: $%{$excerpts[0]}{fpart}\n;
      print "0{bpart}: $%{$excerpts[0]}{bpart}\n;
      print "1{fpart}: $%{$excerpts[1]}{fpart}\n;
      print "1{bpart}: $%{$excerpts[1]}{bpart}\n;
      __END__
      0:foofrobnicatebar
      1:forspambaz
      0{fpart}: foo
      0{bpart}: bar
      1{fpart}: for
      1{bpart}: baz

      Hope I typed all this correctly.

      Edit: changed $2 to $3 in the for loop. Running it on my PC with perl yields the above results, with the exception of 1{bpart}, for some reason.

      Addendum: I also realize that I could have started out with an array of hashes, put the original contents into $excerpts[$i]{contents}, and then derived $excerpts[$i]{bpart} and $excerpts[$i]{fpart} from it with the normal AoH syntax, but this question is about an already existing array with stuff in $excerpts[$i].

        Hi ExReg,

        I'd recommend you take a look at perldsc for a cookbook of different data structures. You should also always Use strict and warnings, especially when working with complex data structures - your code contains a typo that prevents it from working properly and that use strict; would have caught! Also, please post code that compiles; you're missing several closing quotes.

        The syntax "$%{$excerpts[$i]}{fpart}" is probably not doing what you want - it's populating a hash "%%"!
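
A small sketch of that point, using the two excerpt strings from the question as stand-in data: after the assignment, the array element is still a plain string, and the value can be found again by using that string as a key, which suggests the data lives in a separate hash rather than in @excerpts.

```perl
use strict;
use warnings;

# Stand-in data: the two excerpts from the question.
my @excerpts = ('foofrobnicatebar', 'forspambaz');

# Same construct as in the question. The string in $excerpts[0]
# is used as a hash key; the array element itself is not modified.
$%{$excerpts[0]}{fpart} = 'foo';

# The array element is still a plain string, not a hash reference.
my $still_string = !ref $excerpts[0];

# The value can be reached with the literal string as the key.
my $by_string = $%{'foofrobnicatebar'}{fpart};

print "element is still a string\n" if $still_string;
print "found via string key: $by_string\n";
```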

        Here's one way to do what you want. Note that $excerpts[$i]{fpart} = ... would not work, since at that point $excerpts[$i] is a string, not a hash ref. That's why I replace that element of @excerpts with a new hashref.

        use warnings;
        use strict;
        use Data::Dumper;

        my $fc = 'abcdfoofrobnicatebardefforspambazghi';
        my $re2 = qr/(fo.)(.*?)(ba.)/;
        my @excerpts;
        push @excerpts, $1 while $fc =~ /($re2)/g;
        for my $i ( 0 .. $#excerpts ) {
            $excerpts[$i] =~ /$re2/;
            $excerpts[$i] = { fpart=>$1, bpart=>$3 };
        }
        print Dumper(\@excerpts);
        __END__
        $VAR1 = [
                  {
                    'bpart' => 'bar',
                    'fpart' => 'foo'
                  },
                  {
                    'fpart' => 'for',
                    'bpart' => 'baz'
                  }
                ];
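
If the original string is still needed after the conversion, a variant along these lines could keep it under a contents key (the key name is taken from the addendum in the question). This is only a sketch: it starts from an already filled @excerpts, here hard-coded with the two strings from the example, and it skips entries that do not match instead of overwriting them.

```perl
use strict;
use warnings;

my $re2 = qr/(fo.)(.*?)(ba.)/;

# Stand-in for the already existing array of strings.
my @excerpts = ('foofrobnicatebar', 'forspambaz');

for my $excerpt (@excerpts) {
    # $excerpt aliases the array element, so assigning to it
    # replaces the element in place.
    # The list assignment is true only if the regex matched.
    if ( my ($fpart, undef, $bpart) = $excerpt =~ $re2 ) {
        $excerpt = {
            contents => $excerpt,   # keep the original string
            fpart    => $fpart,
            bpart    => $bpart,
        };
    }
}

for my $entry (@excerpts) {
    print "$entry->{contents}: fpart=$entry->{fpart} bpart=$entry->{bpart}\n";
}
```

Checking the match before replacing the element also avoids reusing stale $1 and $3 values from an earlier successful match.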

        Hope this helps,
        -- Hauke D