
This isn't a quarter as elegant as the solutions already posted, but I'll take the shame of posting it, since it might be relevant anyway. :-)

This does a recursive traversal, so it allows hierarchical variable specifications.

(I have to learn to write better regexps. They are so damn elegant when they parse a large chunk in one go...)

I also didn't understand how the prefixes should be handled (all uppercase characters except the first, as BrowserUK did it?).


use strict;
use warnings;

use Carp;


use Dumpvalue;
my $d = Dumpvalue->new;
$d->compactDump(1);

my($h) = {};
my(@lines) = <DATA>;

parse_var_spec($h, \@lines);

$d->dumpValue($h);


# This is used recursively.

sub parse_var_spec {
    my($hash, $lines, $in_sub_parse) = @_;

    # Shifting lines off the array is probably very inefficient, so
    # rewrite to send an offset along, instead. :-)
    while(scalar(@$lines)) {
        my $l = shift @$lines;

        next        if $l =~ m"^\s*#";    # Comment.
        next        if $l =~ /^\s*$/;    # Empty line.

        # Handle return if parsed a subdef:
        if ($in_sub_parse && $l =~ s/^\s*>//) {
            # Value will come directly after this.
            unshift @$lines, $l;
            return;
        }

        if ($l =~ s/^\s*(float|string)\s+([a-zA-Z][a-zA-Z0-9_]*)//) {
            # Got type and var name.
            my($type)        = $1;
            my($var)        = $2;
            $hash->{$var}->{type} = $type;

            # Are there subdata?
            if ($l =~ /^\s*$/) {
                $l            = $lines->[0];
                croak "Bad def of $type '$var'"    if !($l =~ s/^\s*<\s*//);
                $lines->[0] = $l;        # Put back without '<'.

                my(%subs);
                # Recursive call that parse a bit different:
                parse_var_spec(\%subs, $lines, 1);

                # Setup sub-values:
                # Will it always be a 'UI' prefix? Should you look at
                # the start of the variables?
                # Ah, do the details as an exercise.

                $hash->{$var}->{ui} = \%subs;
                $l            = shift @$lines;
            }

            # Now, is it just a value?
            # Non-greedy capture, so whitespace before ';' is not part of the value.
            if ($l =~ /^\s*=\s*(.*?)\s*;\s*$/) {
                my($val)    = $1;
                if ($type eq 'string') {
                    if ($val =~ /^"(.*)"$/) {
                        # XXXX Extra parsing of string here for \n, etc.
                        $hash->{$var}->{value} = $1;
                    } else {
                        croak "Bad value '$val' for string '$var'";
                    }
                } elsif ($type eq 'float') {
                    # XXXXX Parse out float value from $val better
                    # than this :-)
                    $val =~ s/f$//;
                    $hash->{$var}->{value} = $val + 0.0;
                } else {
                    # XXXXX etc.
                    croak "Unknown type $type for var '$var'";
                }
            } else {
                croak "Couldn't parse value from '$var', value '$l'";
            }
        }
    }
}

__DATA__
string UIWidget = "slider";
float  foohoo   = 0.4532;
string bahoo
<
  string SUBtjo   = "gznk";

  string SUBhej   = "sassa rassa";
  float SUBba
  <
    float  XXfoo  = 4711f;
    string XXallan= "trutt trutt";
  > = -122.22f;

  float  SUBbaa   = 23.23f;
> = "hejsvjs";

string foo      = "barf";


float myVarA
<
float UIMin = 1;
float UIMax = 0;
float UIStep = .001;
string UIWidget = "slider";
> =  0.5f;

float myVarB = 1.0;

float4 myVarC = {1,0,0,1};
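
The comment in parse_var_spec about sending an offset along could look roughly like the sketch below. It is not part of the script above: next_meaningful_line and the $pos cursor are hypothetical names, and the idea is only that the array stays untouched while a shared position moves forward.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): return the next line that is
# neither blank nor a comment, advancing the caller's cursor through $pos_ref.
sub next_meaningful_line {
    my ($lines, $pos_ref) = @_;
    while ($$pos_ref < @$lines) {
        my $l = $lines->[ $$pos_ref++ ];
        next if $l =~ /^\s*#/;     # Comment.
        next if $l =~ /^\s*$/;     # Empty line.
        return $l;
    }
    return;                        # No lines left.
}

# Small demonstration with made-up input lines.
my @lines = ("# header\n", "\n", "float a = 1;\n", "string b = \"x\";\n");
my $pos   = 0;
while (defined(my $l = next_meaningful_line(\@lines, \$pos))) {
    print "consumed up to index $pos: $l";
}
```

With a cursor like this, the "put the line back" step that the script does with unshift would become a decrement of $pos, and a recursive call would simply receive the same \$pos reference.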

The result of running the parser on its __DATA__ section is:

'UIWidget' => HASH(0x8148b44)
   'type' => 'string', 'value' => 'slider'
'bahoo' => HASH(0x81c994c)
   'type' => 'string'
   'ui' => HASH(0x81c96ac)
      'SUBba' => HASH(0x81c9e44)
         'type' => 'float'
         'ui' => HASH(0x81c9ce8)
            'XXallan' => HASH(0x81ca0fc)
               'type' => 'string', 'value' => 'trutt trutt'
            'XXfoo' => HASH(0x81c9b8c)
               'type' => 'float', 'value' => 4711
         'value' => '-122.22'
      'SUBbaa' => HASH(0x81c9bbc)
         'type' => 'float', 'value' => 23.23
      'SUBhej' => HASH(0x81c9b68)
         'type' => 'string', 'value' => 'sassa rassa'
      'SUBtjo' => HASH(0x81c9b50)
         'type' => 'string', 'value' => 'gznk'
   'value' => 'hejsvjs'
'foo' => HASH(0x81c9be0)
   'type' => 'string', 'value' => 'barf'
'foohoo' => HASH(0x81c9640)
   'type' => 'float', 'value' => 0.4532
'myVarA' => HASH(0x81c9c04)
   'type' => 'float'
   'ui' => HASH(0x81c9964)
      'UIMax' => HASH(0x81cee38)
         'type' => 'float', 'value' => 0
      'UIMin' => HASH(0x81c9c1c)
         'type' => 'float', 'value' => 1
      'UIStep' => HASH(0x81cee5c)
         'type' => 'float', 'value' => 0.001
      'UIWidget' => HASH(0x81cee80)
         'type' => 'string', 'value' => 'slider'
   'value' => 0.5
'myVarB' => HASH(0x81c9c64)
   'type' => 'float', 'value' => 1
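
The two XXXX placeholders in the parser (float values and string escapes) could be filled in along these lines. This is only a sketch: parse_float and unescape_string are hypothetical helpers, and which float syntax and which escapes the real file format allows is an assumption on my part.

```perl
use strict;
use warnings;

# Hypothetical helper: accept an optional sign, digits with an optional
# fraction (or a bare fraction like ".001"), an optional exponent and an
# optional trailing 'f'. Returns undef for anything else.
sub parse_float {
    my ($text) = @_;
    return unless $text =~ /^\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)f?\s*$/;
    return $1 + 0;
}

# Hypothetical helper: expand a few backslash escapes inside a string value.
# The set of supported escapes is assumed, not taken from any format spec.
my %ESCAPE = (n => "\n", t => "\t", '"' => '"', '\\' => '\\');
sub unescape_string {
    my ($text) = @_;
    $text =~ s/\\(.)/exists $ESCAPE{$1} ? $ESCAPE{$1} : $1/ge;
    return $text;
}

# Try the helpers on values taken from the sample data, plus one bad value.
for my $raw ('4711f', '-122.22f', '.001', '{1,0,0,1}') {
    my $v = parse_float($raw);
    print "$raw => ", (defined $v ? $v : 'not a float'), "\n";
}
print unescape_string('two\nlines'), "\n";
```

In the parser, the float branch could then croak when parse_float returns undef instead of silently turning garbage into 0.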

In reply to Re: Tricky Parsing Ko'an by BerntB
in thread Tricky Parsing Ko'an by jmmistrot
