This seems to meet your requirements. You may need to tighten the regexes depending on how closely the real data matches the sample provided.

Code

#! perl -slw
use strict;
use Data::Dumper;

my %corps;

while (<DATA>) {
    chomp;
    my $name = $_;              # first line of each block is the company name
    $corps{$name} = [];
    <DATA>;                     # discard the blank line that follows the name
    while (<DATA>) {
        chomp;
        last if /^\s*$/;        # a blank line ends this company's records
        # Captures: full begin id, its 3-digit prefix, end id
        my @stuff = /^\d\s+((\d{3})\d+)\s+(\d+)/;
        push @{ $corps{$name} },
            { bid => $stuff[0], eid => $stuff[2], prefix => $stuff[1] };
    }
}

print Dumper \%corps;

__DATA__
ABC corp.

1 1002003 1002007 some text here
2 1011999 1012020 other text here

XYZ Ltd.

1 2031994 2032071 some text here
2 2021996 2022030 other text here
1 1871995 1872031 some text here
2 1772004 1772021 other text here

PQR corp.

1 1072003 1072007 some text here
2 2011999 2012020 other text here

LNM Ltd.

1 2041994 2042071 some text here
2 2051996 2052030 other text here
1 1971995 1972031 some text here
2 1472004 1472021 other text here

Output

c:\test>226013
$VAR1 = {
    'XYZ Ltd.' => [
        { 'bid' => '2031994', 'prefix' => '203', 'eid' => '2032071' },
        { 'bid' => '2021996', 'prefix' => '202', 'eid' => '2022030' },
        { 'bid' => '1871995', 'prefix' => '187', 'eid' => '1872031' },
        { 'bid' => '1772004', 'prefix' => '177', 'eid' => '1772021' }
    ],
    'ABC corp.' => [
        { 'bid' => '1002003', 'prefix' => '100', 'eid' => '1002007' },
        { 'bid' => '1011999', 'prefix' => '101', 'eid' => '1012020' }
    ],
    'LNM Ltd.' => [
        { 'bid' => '2041994', 'prefix' => '204', 'eid' => '2042071' },
        { 'bid' => '2051996', 'prefix' => '205', 'eid' => '2052030' },
        { 'bid' => '1971995', 'prefix' => '197', 'eid' => '1972031' },
        { 'bid' => '1472004', 'prefix' => '147', 'eid' => '1472021' }
    ],
    'PQR corp.' => [
        { 'bid' => '1072003', 'prefix' => '107', 'eid' => '1072007' },
        { 'bid' => '2011999', 'prefix' => '201', 'eid' => '2012020' }
    ]
};

c:\test>
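
As one way the record regex might be tightened, here is a minimal sketch. It assumes, purely from the sample data, that both ids are always exactly seven digits and that the sequence number may run to more than one digit; parse_record is a hypothetical helper name, not part of the code above. Adjust the widths if the real data differs.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one record line into a hash ref.
# Returns undef (in scalar context) when the line does not look like a record.
sub parse_record {
    my ($line) = @_;

    # Assumption from the sample: sequence number, then two ids of
    # exactly 7 digits each, then free text.
    my ($bid, $prefix, $eid) =
        $line =~ /^\d+\s+((\d{3})\d{4})\s+(\d{7})\b/
        or return;

    return { bid => $bid, prefix => $prefix, eid => $eid };
}

# Usage: lines that fail the stricter pattern are reported, not stored.
for my $line ('1 1002003 1002007 some text here', 'XYZ Ltd.') {
    my $rec = parse_record($line);
    print $rec ? "ok: $rec->{bid}\n" : "skipped: $line\n";
}
```

Checking the return value this way avoids pushing a hash of undefs when a line fails to match, which the looser pattern would do silently.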

Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.


In reply to Re: Records question by BrowserUk
in thread Records question by dave8775
