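This version reads the records in paragraph mode rather than line by line, and pulls out all the information with a single regex. The regex uses look-aheads with the 0 or 1 quantifier (?), so a record still matches when its CAUSE or EFFECT line is missing.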

use strict;
use warnings;
use 5.014;

use Data::Dumper;

# Read the sample data from an in-memory file (a reference to the heredoc).
open my $inFH, q{<}, \ <<EOD or die $!;
// HEADER TAG
// VERSION TAG

TYPE VALUE1
EQUALS MAIN
I am useless text
CAUSE FAIL
EFFECT ERROR
ENDTYPE

TYPE VALUE2
EQUALS MAIN
I am useful test
ENDTYPE

TYPE VALUE3
EQUALS MAIN
CAUSE DEGRADED
ENDTYPE

TYPE VALUE4
EQUALS MAIN
EFFECT WARNING
ENDTYPE
EOD

# Capture the TYPE name, then optionally look ahead for CAUSE and EFFECT.
# The ? after each look-ahead lets the match succeed when that line is absent.
my $rxExtract = qr
    {(?xs)
        TYPE\s ( \S+ )
        (?= .* (?: CAUSE\s ( \S+ ) ) )?
        (?= .* (?: EFFECT\s ( \S+ ) ) )?
    };

my %results;
{
    local $/ = q{};     # paragraph mode: records are separated by blank lines
    scalar <$inFH>;     # discard the header paragraph
    while ( <$inFH> )
    {
        next unless m{$rxExtract};
        $results{ $1 } =
        {
            CAUSE  => defined $2 ? $2 : q{UNDEF},
            EFFECT => defined $3 ? $3 : q{UNDEF},
        };
    }
}

say qq{$_:$results{ $_ }->{ CAUSE },$results{ $_ }->{ EFFECT }}
    for sort keys %results;
print qq{\n};

print Data::Dumper
    ->new( [ \ %results ], [ qw{ *results } ] )
    ->Sortkeys( 1 )
    ->Dumpxs();

The results.

VALUE1:FAIL,ERROR
VALUE2:UNDEF,UNDEF
VALUE3:DEGRADED,UNDEF
VALUE4:UNDEF,WARNING

%results = (
             'VALUE1' => {
                           'CAUSE' => 'FAIL',
                           'EFFECT' => 'ERROR'
                         },
             'VALUE2' => {
                           'CAUSE' => 'UNDEF',
                           'EFFECT' => 'UNDEF'
                         },
             'VALUE3' => {
                           'CAUSE' => 'DEGRADED',
                           'EFFECT' => 'UNDEF'
                         },
             'VALUE4' => {
                           'CAUSE' => 'UNDEF',
                           'EFFECT' => 'WARNING'
                         }
           );
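If the quantified look-aheads feel a bit too clever, here is a rough sketch of a plainer variant. It still reads in paragraph mode, but matches each field with its own small regex instead of one combined pattern. This is only an alternative under my own assumptions, not a replacement for the code above: the shortened sample data and the names $data, $fh and $record are made up for the illustration, and it relies on the defined-or operator (//), so it needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical, shortened sample data shaped like the records in the post.
my $data = <<'EOD';
// HEADER TAG

TYPE VALUE1
CAUSE FAIL
EFFECT ERROR
ENDTYPE

TYPE VALUE2
EQUALS MAIN
ENDTYPE
EOD

open my $fh, '<', \$data or die $!;

my %results;
{
    local $/ = q{};    # paragraph mode: one blank-line-separated record per read
    while ( my $record = <$fh> ) {
        # Paragraphs without a TYPE line (the header here) are skipped.
        my ($type)   = $record =~ m{^TYPE\s+(\S+)}m or next;
        # Each field is matched independently; a miss leaves the variable undef.
        my ($cause)  = $record =~ m{^CAUSE\s+(\S+)}m;
        my ($effect) = $record =~ m{^EFFECT\s+(\S+)}m;
        $results{$type} = {
            CAUSE  => $cause  // 'UNDEF',
            EFFECT => $effect // 'UNDEF',
        };
    }
}

print "$_:$results{$_}{CAUSE},$results{$_}{EFFECT}\n" for sort keys %results;
```

The trade-off is three passes over each paragraph instead of one, which should not matter for records this small. In return, the order of the CAUSE and EFFECT lines no longer needs any thought and the header does not have to be discarded explicitly.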

I hope this is of interest.

Cheers,

JohnGG


In reply to Re: File Parsing and Pattern Matching by johngg
in thread File Parsing and Pattern Matching by Mark.Allan
