in reply to Re^2: problems splitting ugly input data
in thread problems splitting ugly input data

Would it be OK if I up the ante a bit? The tags are really not so well structured. They are things like HOSTNAME, CONTACT, ....

Sure:

#! perl -slw
use strict;
use Data::Dump qw[ pp ];
$Data::Dump::WIDTH = 50;

my $reTags = join '|', map quotemeta, qw[ HOSTNAME CONTACT TAG1 TAG2 TAG3 TAG4 ];
$reTags = qr[$reTags];

my %hash = do{ local $/; <DATA> } =~ m[($reTags)=\s+(.+?)(?=$reTags|\Z)]gsm;

pp \%hash;

__DATA__
TAG1= data
TAG2= more data
HOSTNAME= fred
TAG3= even more data that sometimes has = and
runs on to more
than one line
CONTACT= Wiley Coyote
Hiesenberg Road
The Desert
TAG4= still more

Produces:

c:\test>junk15
{
  CONTACT  => "Wiley Coyote\nHiesenberg Road\nThe Desert\n",
  HOSTNAME => "fred\n",
  TAG1     => "data\n",
  TAG2     => "more data\n",
  TAG3     => "even more data that sometimes has = and\nruns on to more\nthan one line\n",
  TAG4     => "still more",
}
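
One caveat with the match above: the lookahead stops at a tag name anywhere in the text, so a value that happens to contain the word CONTACT or TAG1 would be cut short. If the tags always begin a line followed by '=' (an assumption that holds for the sample data, but may not for yours), a stricter variant can be sketched like this. It is only an illustration, not a drop-in replacement; the heredoc stands in for the __DATA__ section, and the trailing-whitespace trimming is an extra I added:

```perl
use strict;
use warnings;

# Sample input copied from the post above (heredoc instead of __DATA__).
my $text = <<'END';
TAG1= data
TAG2= more data
HOSTNAME= fred
TAG3= even more data that sometimes has = and
runs on to more
than one line
CONTACT= Wiley Coyote
Hiesenberg Road
The Desert
TAG4= still more
END

my $tags = join '|', map quotemeta, qw( HOSTNAME CONTACT TAG1 TAG2 TAG3 TAG4 );

# Assumption: every tag starts a line and is followed directly by '='.
my %hash = $text =~ m{
    ^ ($tags) = [ \t]*        # known tag at the start of a line, then '='
    (.*?)                     # the value, as short as possible
    \s*                       # drop trailing whitespace and newlines
    (?= ^ (?:$tags) = | \z )  # stop before the next tag line or end of input
}gmsx;

printf "%-8s => [%s]\n", $_, $hash{$_} for sort keys %hash;
```

With this version the values come back without their trailing newlines, and a stray '=' or tag name in the middle of a value no longer ends the record.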

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^4: problems splitting ugly input data
by Anonymous Monk on Dec 23, 2010 at 03:34 UTC
    Thank you, again. Those are the kinds of things I was trying to come up with on my own. So I learned (several) something(s) in this.

    Cheers.