Hello hkates,

You’ve already been given code that solves your problem, but I want to give you some pointers on how to develop a Perlish solution.

I don't know where to start and am thinking perl may not be the best thing for the job.
...
I know that I wouldn't be able to create the hashes in that if statement. I would need to create a hash of all the unique $1 first (so that I could use the exists function) and then for each key in that hash, read through the file again, creating a new hash for each key in the original hash.

And there’s your problem! You don’t need to create the hash keys first: you can create them on the fly as needed, and autovivification makes Perl the perfect tool for this job!
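To see autovivification in isolation, here is a minimal sketch. The sub name add_item and the sample values are hypothetical, chosen only for illustration; they are not from your code:

```perl
use strict;
use warnings;

# Hypothetical helper: append $value to the list stored under $key.
# No exists() check is needed: dereferencing the undefined element
# $hash->{$key} as an array makes Perl create the anonymous array.
sub add_item {
    my ($hash, $key, $value) = @_;
    push @{ $hash->{$key} }, $value;
    return scalar @{ $hash->{$key} };    # number of items now under $key
}

my %seen;
add_item(\%seen, 'foo', 'foo_1-a');    # key 'foo' springs into existence
add_item(\%seen, 'foo', 'foo_2-b');    # appended to the same array
add_item(\%seen, 'bar', 'bar_1-a');    # a second key, created the same way

print "$_: @{ $seen{$_} }\n" for sort keys %seen;
```

The point is that the first push onto a key that has never been seen behaves exactly like every later push, so a single pass over the input is enough.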

But the key to solving this problem is getting the data structure right. It helps to work backwards: what structure will make it easiest to print off the desired output? Some thought, perhaps some trial-and-error, and the answer emerges: a hash of arrays (HoA):

    (
        foo => [ foo_1-a, foo_2-b, foo_3-b, foo_4-b ],
        bar => [ bar_1-a, bar_2-a, bar_3-b, bar_4-a, bar_5-b ],
    )

On HoAs, see perldsc. Now that you have the right data structure, the code to write and read it almost writes itself (well, kinda...):

    #! perl
    use strict;
    use warnings;
    use Data::Dump;

    my %hash;

    while (<DATA>)
    {
        # $1 is the whole item (e.g. foo_1-a); $2 is its prefix (e.g. foo)
        push @{ $hash{$2} }, $1
            if / ( ([^\s_]+) _ [^\s-]+ - \S+ ) /x;
    }

    print "\nData structure (HoA):\n";
    dd \%hash;

    print "\nOutput:\n";

    for (sort keys %hash)
    {
        my $array_ref = $hash{$_};
        print $_, ' ', scalar @$array_ref, ' ', join(' ', @$array_ref), "\n";
    }

    __DATA__
    foo_1-a
    foo_2-b
    foo_3-b
    foo_4-b
    bar_1-a
    bar_2-a
    bar_3-b
    bar_4-a
    bar_5-b

I’ve put in code to dump the hash, so you can clearly see the intermediate point at which the data structure has been populated. Here is the output:

    12:43 >perl 1139_SoPW.pl

    Data structure (HoA):
    {
      bar => ["bar_1-a", "bar_2-a", "bar_3-b", "bar_4-a", "bar_5-b"],
      foo => ["foo_1-a", "foo_2-b", "foo_3-b", "foo_4-b"],
    }

    Output:
    bar 5 bar_1-a bar_2-a bar_3-b bar_4-a bar_5-b
    foo 4 foo_1-a foo_2-b foo_3-b foo_4-b

    12:43 >
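One caveat: Data::Dump is a CPAN module rather than part of the core distribution. If it isn't installed, a rough equivalent of the dump step using the core module Data::Dumper might look like the sketch below. The sub name dump_hoa and the sample data are hypothetical, and the layout of the dump differs somewhat from what dd prints:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: return a readable dump of a hash of arrays.
sub dump_hoa {
    my ($hoa) = @_;
    local $Data::Dumper::Sortkeys = 1;    # stable, sorted key order
    local $Data::Dumper::Indent   = 1;    # compact indentation
    local $Data::Dumper::Terse    = 1;    # no leading "$VAR1 = "
    return Dumper($hoa);
}

# Sample data only; in the script above you would pass \%hash instead.
my %sample = (
    foo => [ 'foo_1-a', 'foo_2-b' ],
    bar => [ 'bar_1-a' ],
);

print dump_hoa(\%sample);
```

Setting Sortkeys makes the dump repeatable from run to run, which helps when you are comparing the structure before and after a change to the regex.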

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,


In reply to Re: Create a hash for each unique captured regex variable by Athanasius
in thread Create a hash for each unique captured regex variable by hkates
