This isn't a quarter as elegant as the solutions already posted, but I'll take the shame of posting it anyway, since it might still be relevant. :-)

This does a recursive traversal, so it allows nested (hierarchical) variable specifications.

(I have to learn to write better regexps. They are so damn elegant when they parse a large chunk in one go...)

I didn't understand how the prefix is supposed to be handled either (all uc chars except the first, as BrowserUK did it?)

use strict;
use warnings;
use Carp;
use Dumpvalue;

my $d = new Dumpvalue;
$d->compactDump(1);

my($h) = {};
my(@lines) = <DATA>;
parse_var_spec($h, \@lines);
$d->dumpValue($h);

# This is used recursively.
sub parse_var_spec {
    my($hash, $lines, $in_sub_parse) = @_;

    # This is probably very ineffective, so rewrite to send an offset
    # along, instead. :-)
    while(scalar(@$lines)) {
        my $l = shift @$lines;
        next if $l =~ m"^\s*#";    # Comment.
        next if $l =~ /^\s*$/;     # Empty line.

        # Handle return if parsed a subdef:
        if ($in_sub_parse && $l =~ s/^\s*>//) {
            # Value will come directly after this.
            unshift @$lines, $l;
            return;
        }

        if ($l =~ s/^\s*(float|string)\s+([a-zA-Z][a-zA-Z0-9_]*)//) {
            # Got type and var name.
            my($type) = $1;
            my($var)  = $2;
            $hash->{$var}->{type} = $type;

            # Are there subdata?
            if ($l =~ /^\s*$/) {
                $l = $lines->[0];
                croak "Bad def of $type '$var'" if !($l =~ s/^\s*<\s*//);
                $lines->[0] = $l;    # Put back without '<'.
                my(%subs);
                # Recursive call that parse a bit different:
                parse_var_spec(\%subs, $lines, 1);
                # Setup sub-values:
                # Will it always be a 'UI' prefix? Should you look at the
                # start of the variables?
                # Ah, do the details as an exercise.
                $hash->{$var}->{ui} = \%subs;
                $l = shift @$lines;
            }

            # Now, is it just a value?
            if ($l =~ /^\s*=\s*(.*)\s*;\s*$/) {
                my($val) = $1;
                if ($type eq 'string') {
                    if ($val =~ /^"(.*)"$/) {
                        # XXXX Extra parsing of string here for \n, etc.
                        $hash->{$var}->{value} = $1;
                    } else {
                        croak "Bad value '$val' for string '$var'";
                    }
                } elsif ($type eq 'float') {
                    # XXXXX Parse out float value from $val better
                    # than this :-)
                    $val =~ s/f$//;
                    $hash->{$var}->{value} = $val + 0.0;
                } else {
                    # XXXXX etc.
                    croak "Unknown type $type for var '$var'";
                }
            } else {
                croak "Couldn't parse value from '$var', value '$l'";
            }
        }
    }
}

__DATA__
string UIWidget = "slider";
float foohoo = 0.4532;
string bahoo
<
    string SUBtjo = "gznk";
    string SUBhej = "sassa rassa";
    float SUBba
    <
        float XXfoo = 4711f;
        string XXallan= "trutt trutt";
    > = -122.22f;
    float SUBbaa = 23.23f;
> = "hejsvjs";
string foo = "barf";
float myVarA
<
    float UIMin = 1;
    float UIMax = 0;
    float UIStep = .001;
    string UIWidget = "slider";
> = 0.5f;
float myVarB = 1.0;
float4 myVarC = {1,0,0,1};
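The comment at the top of parse_var_spec suggests passing an offset along instead of shifting and unshifting the line array. Here is a minimal sketch of how that might look. It is not a drop-in replacement: the names parse_var_spec_at and parse_value are made up for illustration, and it assumes that '<' always stands alone on its own line and that the closing '>' shares a line with the value, as in the sample data.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical helper: turn the text after the variable name
# ('= 0.5f;', '= "slider";') into a value of the given type.
sub parse_value {
    my ($type, $var, $text) = @_;
    $text =~ /^\s*=\s*(.*?)\s*;\s*$/
        or croak "Couldn't parse value from '$var', value '$text'";
    my $val = $1;
    if ($type eq 'string') {
        $val =~ /^"(.*)"$/ or croak "Bad value '$val' for string '$var'";
        return $1;
    }
    $val =~ s/f$//;    # drop the trailing 'f' of float literals
    return $val + 0.0;
}

# Hypothetical offset-based variant: $pos is a reference to the index of
# the next unread line, so the array itself is never modified.
sub parse_var_spec_at {
    my ($hash, $lines, $pos, $in_sub_parse) = @_;

    while ($$pos < @$lines) {
        my $l = $lines->[$$pos];

        # Stop at the '>' that closes a sub-block, without consuming it.
        return if $in_sub_parse && $l =~ /^\s*>/;

        $$pos++;    # consume the line

        # Anything that is not a float/string declaration is skipped:
        # comments, blank lines and unsupported types.
        next unless $l =~ /^\s*(float|string)\s+([a-zA-Z][a-zA-Z0-9_]*)(.*)$/s;
        my ($type, $var, $rest) = ($1, $2, $3);
        $hash->{$var}{type} = $type;

        if ($rest =~ /^\s*$/) {
            # Sub-block: expect a lone '<' on the next line (assumption),
            # then declarations up to the matching '>'.
            croak "Bad def of $type '$var'"
                unless $$pos < @$lines && $lines->[$$pos] =~ /^\s*<\s*$/;
            $$pos++;
            my %subs;
            parse_var_spec_at(\%subs, $lines, $pos, 1);
            croak "Missing '>' for $type '$var'" unless $$pos < @$lines;
            $hash->{$var}{ui} = \%subs;
            # Work on a copy of the '>' line; the value follows the '>'.
            ($rest = $lines->[$$pos++]) =~ s/^\s*>//;
        }
        $hash->{$var}{value} = parse_value($type, $var, $rest);
    }
}

# Small usage example with made-up input in the same format.
my @sample = (
    'float myVarA',
    '<',
    '    float UIMin = 1;',
    '    string UIWidget = "slider";',
    '> = 0.5f;',
    'float myVarB = 1.0;',
);
my %vars;
my $pos = 0;
parse_var_spec_at(\%vars, \@sample, \$pos);
```

Passing $pos as a reference lets the recursive call advance the caller's position, which is what the shift/unshift pair did implicitly in the version above.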

The result of the run is:

'UIWidget' => HASH(0x8148b44)
   'type' => 'string', 'value' => 'slider'
'bahoo' => HASH(0x81c994c)
   'type' => 'string'
   'ui' => HASH(0x81c96ac)
      'SUBba' => HASH(0x81c9e44)
         'type' => 'float'
         'ui' => HASH(0x81c9ce8)
            'XXallan' => HASH(0x81ca0fc)
               'type' => 'string', 'value' => 'trutt trutt'
            'XXfoo' => HASH(0x81c9b8c)
               'type' => 'float', 'value' => 4711
         'value' => '-122.22'
      'SUBbaa' => HASH(0x81c9bbc)
         'type' => 'float', 'value' => 23.23
      'SUBhej' => HASH(0x81c9b68)
         'type' => 'string', 'value' => 'sassa rassa'
      'SUBtjo' => HASH(0x81c9b50)
         'type' => 'string', 'value' => 'gznk'
   'value' => 'hejsvjs'
'foo' => HASH(0x81c9be0)
   'type' => 'string', 'value' => 'barf'
'foohoo' => HASH(0x81c9640)
   'type' => 'float', 'value' => 0.4532
'myVarA' => HASH(0x81c9c04)
   'type' => 'float'
   'ui' => HASH(0x81c9964)
      'UIMax' => HASH(0x81cee38)
         'type' => 'float', 'value' => 0
      'UIMin' => HASH(0x81c9c1c)
         'type' => 'float', 'value' => 1
      'UIStep' => HASH(0x81cee5c)
         'type' => 'float', 'value' => 0.001
      'UIWidget' => HASH(0x81cee80)
         'type' => 'string', 'value' => 'slider'
   'value' => 0.5
'myVarB' => HASH(0x81c9c64)
   'type' => 'float', 'value' => 1

In reply to Re: Tricky Parsing Ko'an by BerntB
in thread Tricky Parsing Ko'an by jmmistrot
