You didn't say clearly what you want your final structure to look like, so here are two possibilities. I hope one of them helps.

my %records;
my $record;

while (<DATA>) {
    chomp;
    my $line = $_;

    # The first two digits of every line are the subrecord type.
    s/^(\d\d)//;
    my $subrecord_type = $1;

    # A type-10 line starts a new record, keyed by procedure number.
    if ($subrecord_type == 10) {
        s/^(\d+)\s+//;
        my $procedure_num = $1;
        $records{$procedure_num} = $record = [];
    }

    # Keep the unmodified line under the current record.
    push(@$record, $line);
}

require Data::Dumper;
print(Data::Dumper::Dumper(\%records));

__DATA__
1000001 01.11.199600.00.00001 A1 1 SN Y
2001.11.200400098.0500073.5500083.35
5001.11.1997Professional attendance being an attendance at
5001.11.1997other than consulting rooms, by a general
5001.11.1997practitioner on not more than 1 patient

output
======
$VAR1 = {
    '00001' => [
        '1000001 01.11.199600.00.00001 A1 1 SN Y',
        '2001.11.200400098.0500073.5500083.35',
        '5001.11.1997Professional attendance being an attendance at',
        '5001.11.1997other than consulting rooms, by a general',
        '5001.11.1997practitioner on not more than 1 patient',
    ]
};

=== or maybe ===

my %records;
my $record;

while (<DATA>) {
    chomp;
    my $line = $_;

    # The first two digits of every line are the subrecord type.
    s/^(\d\d)//;
    my $subrecord_type = $1;

    if ($subrecord_type == 10) {
        # Header line: procedure number, two dates, then
        # whitespace-separated fields.
        s/^(\d+)\s+//;
        my $procedure_num = $1;
        $records{$procedure_num} = $record = {};

        s/^(\d\d\.\d\d\.\d\d\d\d)//;
        my $date1 = $1;

        s/^(\d\d\.\d\d\.\d\d\d\d)//;
        my $date2 = $1;

        my (
            $unknown1,
            $unknown2,
            $unknown3,
            $unknown4,
            $unknown5,
        ) = split(/\s+/, $_);

        %$record = (
            date1    => $date1,
            date2    => $date2,
            unknown1 => $unknown1,
            unknown2 => $unknown2,
            unknown3 => $unknown3,
            unknown4 => $unknown4,
            unknown5 => $unknown5,
            20       => [],
            30       => [],
            40       => [],
            50       => [],
        );

        next;
    }

    if ($subrecord_type == 20) {
        # A date followed by dot-separated fields.
        s/^(\d\d\.\d\d\.\d\d\d\d)//;
        my $date = $1;

        my (
            $unknown1,
            $unknown2,
            $unknown3,
            $unknown4,
        ) = split(/\./, $_);

        push(@{$record->{20}}, {
            date     => $date,
            unknown1 => $unknown1,
            unknown2 => $unknown2,
            unknown3 => $unknown3,
            unknown4 => $unknown4,
        });

        next;
    }

    if ($subrecord_type == 30) {
        # ...
        next;
    }

    if ($subrecord_type == 40) {
        # A date followed by free text.
        s/^(\d\d\.\d\d\.\d\d\d\d)//;
        my $date = $1;

        push(@{$record->{40}}, {
            date => $date,
            text => $_,
        });

        next;
    }

    if ($subrecord_type == 50) {
        # A date followed by free text.
        s/^(\d\d\.\d\d\.\d\d\d\d)//;
        my $date = $1;

        push(@{$record->{50}}, {
            date => $date,
            text => $_,
        });

        next;
    }
}

require Data::Dumper;
print(Data::Dumper::Dumper(\%records));

__DATA__
1000001 01.11.199600.00.00001 A1 1 SN Y
2001.11.200400098.0500073.5500083.35
5001.11.1997Professional attendance being an attendance at
5001.11.1997other than consulting rooms, by a general
5001.11.1997practitioner on not more than 1 patient

output
======
$VAR1 = {
    '00001' => {
        'date1'    => '01.11.1996',
        'date2'    => '00.00.0000',
        'unknown1' => '1',
        'unknown2' => 'A1',
        'unknown3' => '1',
        'unknown4' => 'SN',
        'unknown5' => 'Y',
        '20' => [
            {
                'date'     => '01.11.2004',
                'unknown1' => '00098',
                'unknown2' => '0500073',
                'unknown3' => '5500083',
                'unknown4' => '35',
            }
        ],
        '30' => [],
        '40' => [],
        '50' => [
            {
                'date' => '01.11.1997',
                'text' => 'Professional attendance being an attendance at',
            },
            {
                'date' => '01.11.1997',
                'text' => 'other than consulting rooms, by a general',
            },
            {
                'date' => '01.11.1997',
                'text' => 'practitioner on not more than 1 patient',
            }
        ],
    }
};
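
If the second (hash-based) layout is closer to what you want, reading it back is straightforward. Here is a minimal sketch, assuming the type-50 lines together form one description. The helper name description_for is my own invention, and the sample hash is just an abbreviated copy of the dump above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): join the text of all
# type-50 subrecords of one record into a single description string.
sub description_for {
    my ($record) = @_;
    return join ' ', map { $_->{text} } @{ $record->{50} };
}

# Sample data in the same shape as the second dump (abbreviated).
my %records = (
    '00001' => {
        date1 => '01.11.1996',
        20    => [ { date => '01.11.2004', unknown1 => '00098' } ],
        50    => [
            { date => '01.11.1997', text => 'Professional attendance being an attendance at' },
            { date => '01.11.1997', text => 'other than consulting rooms, by a general' },
            { date => '01.11.1997', text => 'practitioner on not more than 1 patient' },
        ],
    },
);

# Print one line per procedure: its number, first date, and description.
for my $procedure_num (sort keys %records) {
    my $record = $records{$procedure_num};
    print "$procedure_num ($record->{date1}): ", description_for($record), "\n";
}
```

Whether joining with a single space is right depends on how the original text was wrapped, so treat that separator as a guess.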

In reply to Re^3: Extracting fields by ikegami
in thread Extracting fields by kerrya
