in reply to Re: Extracting information from a Text widget
in thread Extracting information from a Text widget

How could this be modified to account for other text? For example:
Shapes { Shape=Circle { name=joe diameter=3 color=blue { random stuff { further depth } } } }
I think this just comes down to creating the right regex, but nothing I've tried is able to ignore the rest of the text. I still just want the circle and it's name. thx

Replies are listed 'Best First'.
Re^3: Extracting information from a Text widget
by BrowserUk (Patriarch) on Jul 06, 2007 at 05:02 UTC

    This will parse the format you've indicated. It's pretty flexible about whitespace and accepts comments in most of the obvious places.

    It produces the HoAoHs you see at the bottom of the page which gives you access to everything you need. As requested above, it parses, but ignores nested blocks below the first level.

    It should deal with any number of Shape types and any number of key/value attribute pairs. Both key and value must not contain spaces.

    #! perl -slw use strict; use re qw[ eval ]; use Data::Dump; my %shapes; my( $type, $key, $value ); my $reComment = qr[ \# .*? \n \s* ]x; my $reWS = qr[ \s* | $reComment ]smx; my $reNestedBlock; $reNestedBlock = qr[ \{ (?> [^{}]+ | (??{ $reNestedBlock }) )+ \} $reWS ]smx; my $reAttribs = qr[ $reComment | ( (?> \w+ ) ) (?{ $key = $^N }) \s* = \s* ( \S+ ) (?{ $value = $^N }) (?{ $shapes{ $type }[-1]{ $key } = $value }) $reWS ]smx; my $reShape = qr[ $reComment | Shape \s* = \s* ( (?> \w+ ) ) (?{ push @{ $shapes{ $type = $^N } }, {} }) $reWS+ \{ $reWS (?> $reAttribs+ ) $reNestedBlock? \} $reWS ]smx; my $reShapes = qr[ Shapes $reWS \{ $reWS $reShape+ \} ]smx; do{ local $/; <DATA> } =~ $reShapes or die "Failed to match!"; print dump \%shapes; __DATA__ Shapes { Shape=Circle # A comment { name= joe diameter=3 color=blue # A comment } ## Another one Shape=Triangle{ name=Equilateral sides=3,3,3 color=green } Shape=Triangle{ name = Right sides = 3,4,5 color = green } Shape=Circle { name=steve diameter=5 color=red } Shape=Square { name=steve2 size=5 color=black } Shape = Square { name=fred2 size=3 color=yellow { random stuff { {{{{{{{{{{{{{{{fred}}}}}}}}}}}}}}} } } } Shape=Circle { # A comment name = fred # a comment diameter = 3 color = blue { random stuff { further depth } } } }

    Output:

    C:\test>624564 { Circle => [ { color => "blue", diameter => 3, name => "joe" }, { color => "red", diameter => 5, name => "steve" }, { color => "blue", diameter => 3, name => "fred" }, ], Square => [ { color => "black", name => "steve2", size => 5 }, { color => "yellow", name => "fred2", size => 3 }, ], Triangle => [ { color => "green", name => "Equilateral", sides => "3 +,3,3" }, { color => "green", name => "Right", sides => "3,4,5" +}, ], }

    Two caveats:

    1. When it fails, it doesn't tell you why or where.

      That's why I added the comment facility, it makes it resonably quick to locate errors.

    2. I'm not offering to maintain or extend it for you.

      It's not horribly complicated, and with a little study you should be able to work out how to extend it if you need to.

      I don't mind answering questions to help you, but do not expect to be able to ask for feature XYZ and just get code back. Ask for help in implemeting XYZ, post your attempt, and you'll probably get it.

    Bear those in mind before you decide to use the above code.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.