Build a profile of each label line (the start position and length of every field on it), then use substr to extract the relevant data fields at those positions.

#!perl
use strict;
use Data::Dump 'pp';

my @fields = (
  'PERMIT NO', 'VALUATION', 'PARCEL NO',
  'ISSUED DATE', 'LOCATION', 'OWNER', 'CONTRACTOR');
my $match = join '|', map { quotemeta } @fields;
my $re = qr/$match/;
#print $re."\n";

# %posn is fieldname => [startposn,length]
my %posn = ();
# fields on one line
my @cols = ();
my $rec = {};   # one record
my %data = ();  # many records

while (<DATA>){

  # start new record
  if (/PERMIT NO/){
    add_record($rec);
    $rec = {};
  }

  # build profile
  if (/:/){
    @cols = ();
    while (/($re):/g){
      push @cols,$1;
      $posn{$1}[0] = pos();
      $posn{$1}[1] = 0; # to end of line
      # calc previous col length
      if (@cols>1){
        my $prev = $cols[-2];
        $posn{$prev}[1] = pos()-length($1)-$posn{$prev}[0]-1;
      }
    }
  }

  # extract data
  for my $c (@cols){
    my $posn = $posn{$c}[0];
    my $len  = $posn{$c}[1];
    my $str  = ($len) ? substr $_,$posn,$len
                      : substr $_,$posn;
    $str =~ s/^\s+|\s+$//g;
    #print "$c = '$str' $posn $len\n";
    $rec->{$c} .= ' ' if exists $rec->{$c} && $str;
    $rec->{$c} .= $str;
  }
}
# last record
add_record($rec);
pp \%data;

# add record to %data
sub add_record {
  my $rec = shift;
  my $no = $rec->{'PERMIT NO'};
  $data{$no} = $rec if ($no);
};

# debug
#pp \%posn;

__DATA__
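To see what the profile ends up holding, here is a cut-down sketch of the same idea run on two made-up lines. The sample values, the two-field regex and the flat %rec hash are assumptions for illustration only, not the original poster's data. The lines are built with sprintf so the column positions are explicit, and a guard skips columns that lie beyond the end of a short line.

```perl
#!perl
use strict;
use warnings;

# Hypothetical sample data (not from the thread): a label line followed by
# a continuation line whose values sit under the same columns.
my @lines = (
    sprintf('%-22s%s',     'OWNER: JOHN SMITH', 'CONTRACTOR: ACME'),
    sprintf('%-7s%-27s%s', '', 'AND SONS',      'BUILDERS'),
);

my $re = qr/OWNER|CONTRACTOR/;   # only two labels in this sketch
my (@cols, %posn, %rec);

for my $line (@lines) {

    # A line containing a label rebuilds the profile: name => [start, length]
    if ($line =~ /:/) {
        @cols = ();
        while ($line =~ /($re):/g) {
            push @cols, $1;
            $posn{$1} = [ pos($line), 0 ];   # length 0 means "to end of line"
            if (@cols > 1) {
                # the previous column ends where this label begins
                my $prev = $cols[-2];
                $posn{$prev}[1] = pos($line) - length($1) - $posn{$prev}[0] - 1;
            }
        }
    }

    # Cut each column out of the line using the current profile
    for my $c (@cols) {
        my ($start, $len) = @{ $posn{$c} };
        next if $start > length($line);      # line too short for this column
        my $str = $len ? substr($line, $start, $len)
                       : substr($line, $start);
        $str =~ s/^\s+|\s+$//g;
        $rec{$c} .= ' ' if exists $rec{$c} && length $str;
        $rec{$c} .= $str;
    }
}

print "$_ = '$rec{$_}'\n" for sort keys %rec;
```

With these two lines the profile is OWNER => [6,16] and CONTRACTOR => [33,0], and the script prints CONTRACTOR = 'ACME BUILDERS' followed by OWNER = 'JOHN SMITH AND SONS'. The continuation line carries no labels, so it is sliced with the profile left over from the label line above it, which is what lets wrapped values be concatenated onto the right key.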

Update: removed length for last field

poj

In reply to Re: Line concat to proper hash key by poj
in thread Line concat to proper hash key by chimiXchanga
