Yuck! I thought (hoped) that this type of file format -- mixed, fixed-format records -- had died long ago; but they seem to keep reinventing it :)

For your first example, the trick is to define a regex that will match the fields in the header line:

my $reHeader = '(\b\w+\s*)?' x 10; ## Adjust the repeat value to cover the maximum number of fields

and use that to construct an unpack template to parse the following values line.

This is not 'nice code', but it demonstrates the technique:

#! perl -slw
use strict;
use Data::Dump qw[ pp ];

my $reHeader = '(\b\w+\s*)?' x 10;

my %data;
until( eof( DATA ) ) {
    ## Read the header line and remove the newline
    chomp( my $header = <DATA> );

    ## parse the fields using the regex, ignoring undefined fields
    my @keys = grep defined, $header =~ $reHeader;

    ## trim the trailing whitespace from the keys
    s[\s*$][] for @keys;

    ## Use the capture position arrays (@- & @+) 
    ## to work out the field widths and construct a template
    my $tmpl = join ' ', map{
        defined( $-[$_] )
        ? do{ my $n = $+[$_] - $-[$_]; "a$n" } 
        : ()
    } 1 .. $#+;

    ## read and chomp the values line
    chomp( my $vals = <DATA> );

    ## Extract the value fields using the template
    my @vals = unpack $tmpl, $vals;

    ## trim leading & trailing whitespace
    s[^\s*][],s[\s*$][] for @vals;

    ## Add the key/value pairs to the hash
    @data{ @keys } = @vals;

    ## discard the blank line between the grouped pairs of lines.
    <DATA>;
}

pp \%data; ## display the hash constructed

__DATA__
TRHYST  TROFFSETP  TROFFSETN  AWOFFSET  BQOFFSET
 2       0                     5         3

HIHYST  LOHYST  OFFSETP  OFFSETN  BQOFFSETAFR
 5       3       0                 3

CELLR     DIR     CAND   CS
LUC083A   MUTUAL  BOTH   NO

Outputs:

C:\test>junk79
{
  AWOFFSET    => 5,
  BQOFFSET    => 3,
  BQOFFSETAFR => 3,
  CAND        => "BOTH",
  CELLR       => "LUC083A",
  CS          => "NO",
  DIR         => "MUTUAL",
  HIHYST      => 5,
  LOHYST      => 3,
  OFFSETN     => "",
  OFFSETP     => 0,
  TRHYST      => 2,
  TROFFSETN   => "",
  TROFFSETP   => 0,
}
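To make the intermediate step visible, here is a minimal sketch that builds the template for the first sample header only. It measures each capture with length rather than using @- and @+; that should give the same widths here because each capture still carries its trailing padding. The header is rebuilt with join purely so the two-space gaps are explicit.

```perl
use strict;
use warnings;

my $reHeader = '(\b\w+\s*)?' x 10;

## First sample header, rebuilt with join so the two-space gaps are explicit
my $header = join '  ', qw[ TRHYST TROFFSETP TROFFSETN AWOFFSET BQOFFSET ];

## Each capture keeps its trailing padding, so its length is the column width
my @fields = grep defined, $header =~ $reHeader;
my $tmpl   = join ' ', map { my $w = length $_; "a$w" } @fields;

print "$tmpl\n";    ## prints "a8 a11 a11 a10 a8"
```

The last width covers only the name itself, so a final value wider than its header name would be truncated by this template.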

Extending that to apply it to all your other sections will require a little ingenuity and a lot of painstaking testing.
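One possible starting point for that extension is sketched below. The name parse_pairs is hypothetical, and the sketch assumes every header/values pair is separated from the next by at least one blank line, so that paragraph mode ($/ set to the empty string) hands back one pair per read. The sample rows are rebuilt with sprintf only to make the column positions explicit.

```perl
use strict;
use warnings;

## Hypothetical helper: parse every header/values pair readable from $fh.
## Assumes pairs are separated by one or more blank lines.
sub parse_pairs {
    my( $fh, $maxFields ) = @_;
    $maxFields ||= 10;    ## assumed upper bound on fields per line
    my $re = '(\b\w+\s*)?' x $maxFields;

    local $/ = '';        ## paragraph mode: one read returns one pair
    my %data;
    while( my $para = <$fh> ) {
        my( $header, $vals ) = split /\n/, $para;
        next unless defined $vals;    ## skip chunks that are not a pair

        ## Captures keep their padding, so length gives the column width
        my @keys = grep defined, $header =~ $re;
        my $tmpl = join ' ', map { my $w = length $_; "a$w" } @keys;

        my @vals = unpack $tmpl, $vals;
        s/^\s+//, s/\s+$// for @keys, @vals;

        @data{ @keys } = @vals;
    }
    return \%data;
}

## Two of the sample pairs, rebuilt so the column positions are explicit
my $text = join "\n",
    join( '  ', qw[ HIHYST LOHYST OFFSETP OFFSETN BQOFFSETAFR ] ),
    sprintf( '%-8s%-8s%-9s%-9s%s', ' 5', ' 3', ' 0', '', ' 3' ),
    '',
    sprintf( '%-10s%-8s%-7s%s', qw[ CELLR DIR CAND CS ] ),
    sprintf( '%-10s%-8s%-7s%s', qw[ LUC083A MUTUAL BOTH NO ] );

open my $fh, '<', \$text or die $!;
my $data = parse_pairs( $fh );
close $fh;

printf "%-12s => '%s'\n", $_, $data->{ $_ } for sort keys %$data;
```

Sections that do not follow the two-line header/values layout would still need their own handling.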

I do hope for your sake that the number and ordering of the different sections is well-defined, else you've got an even worse task on your hands.

Note: This assumes that field names do not contain spaces. If they do, the header regex will split one name into several fields, and you are in shit street.
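If names can contain single spaces but the columns are always separated by at least two spaces, a variant of the header regex might still cope. That is purely an assumption about the format, and the header below is made up for illustration; it is not from your data.

```perl
use strict;
use warnings;

## Assumption: a name may contain single spaces, and columns are always
## separated by at least two spaces.
my $reSpaced = '(\S+(?: \S+)*\s*)?' x 10;

## Hypothetical header, not taken from the original data
my $header = join '  ', 'CELL NAME', 'DIR TYPE', 'CS';

## Measure the padded captures first, then trim the names
my @fields = grep defined, $header =~ $reSpaced;
my $tmpl   = join ' ', map { my $w = length $_; "a$w" } @fields;
s/\s+$// for @fields;

print join( '|', @fields ), "\n";    ## prints "CELL NAME|DIR TYPE|CS"
print "$tmpl\n";                     ## prints "a11 a10 a2"
```

If any two columns are ever separated by a single space, this merges them into one field, so it is no cure for a genuinely ambiguous layout.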

In reply to Re^3: Reading tab/whitespace delimited text file by BrowserUk
in thread Reading tab/whitespace delimited text file by reaper9187