Felix2000 has asked for the wisdom of the Perl Monks concerning the following question:

I have some data that comes out of a program in the format shown at the bottom. I am trying to pull out the data from within the {} and hopefully build a multilevel hash with "name" being the top level. I'm sure I can do it with several whiles and ifs to test going in and out of a new set of brackets, but I don't like that idea and am looking for something a little cleaner and more efficient. Building the hash will be easy once I can get it all into a manageable format. Thanks for any assistance. Felix2000
----------Data Format-------------
instance of Win32_LogicalDisk
{
    FreeSpace = "114151464960";
    Name = "C:";
    Size = "160031014912";
};
instance of Win32_LogicalDisk
{
    FreeSpace = "5515554816";
    Name = "D:";
    Size = "203921108992";
};
instance of Win32_LogicalDisk
{
    FreeSpace = "43128733696";
    Name = "H:";
    Size = "400086708224";
};

Re: Pulling data out of { }
by BrowserUk (Patriarch) on Jan 15, 2006 at 07:06 UTC

    Also no ifs, buts, or whiles

    #! perl -slw
    use strict;
    use Data::Dumper;

    my %hash = map{
        my( $free, $name, $size ) = m[
            ^
            (?= .* FreeSpace \s+ = \s+ " ( [^"]+ ) "; )     #"
            (?= .* Name      \s+ = \s+ " ( [^"]+ ) "; )     #"
            (?= .* Size      \s+ = \s+ " ( [^"]+ ) "; )     #"
        ]smx or warn "Bad record '$_'";
        $name ? ( $name => { freespace => $free, size => $size } ) : ();
    } do{ local $/ = "\n};\n"; <DATA> };

    print Dumper \%hash;

    __DATA__

    Given your data, produces this

    P:\test>junk
    $VAR1 = {
              'C:' => {
                        'freespace' => '114151464960',
                        'size' => '160031014912'
                      },
              'D:' => {
                        'freespace' => '5515554816',
                        'size' => '203921108992'
                      },
              'H:' => {
                        'freespace' => '43128733696',
                        'size' => '400086708224'
                      }
            };
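Because each lookahead in the regex above is anchored at the start of the record, the three fields can appear in any order inside the braces. The following is only a small sketch of that property; the helper name `extract_fields` and the sample records are hypothetical and not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the three fields out of one record using
# start-anchored lookaheads, so the order of the fields does not matter.
# Returns ( name, { freespace => ..., size => ... } ) or an empty list.
sub extract_fields {
    my ($record) = @_;
    my ( $free, $name, $size ) = $record =~ m[
        ^
        (?= .* FreeSpace \s+ = \s+ " ( [^"]+ ) " ; )
        (?= .* Name      \s+ = \s+ " ( [^"]+ ) " ; )
        (?= .* Size      \s+ = \s+ " ( [^"]+ ) " ; )
    ]smx or return;
    return ( $name, { freespace => $free, size => $size } );
}

# Two made-up records with the same values in a different field order.
my $in_order = qq{FreeSpace = "1";\nName = "C:";\nSize = "2";\n};
my $shuffled = qq{Size = "2";\nFreeSpace = "1";\nName = "C:";\n};

for my $record ( $in_order, $shuffled ) {
    my ( $name, $info ) = extract_fields($record);
    printf "%s free=%s size=%s\n", $name, $info->{freespace}, $info->{size};
}
```

Both records yield the same name and values. A record that lacks any one of the three fields fails the match as a whole, which is what triggers the "Bad record" warning in the code above.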

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Pulling data out of { }
by ikegami (Patriarch) on Jan 15, 2006 at 06:34 UTC
    my %data;

    {
       my $in = 0;
       my %rec;
       while (<DATA>) {
          # The literal braces are escaped; newer perls complain
          # about an unescaped "{" in a pattern.
          if (/^\s*\{/) {
             $in = 1;
             next;
          }

          if (/^\s*\}/) {
             my $name = delete($rec{Name});
             $data{$name} = { %rec };
             undef %rec;
             $in = 0;
             next;
          }

          if ($in) {
             if (/^\s*(\S+)\s*=\s*"(.*)"/) {
                $rec{$1} = $2;
             }
          }
       }
    }

    require Data::Dumper;
    print(Data::Dumper::Dumper(\%data));

    outputs

    $VAR1 = {
              'C:' => {
                        'Size' => '160031014912',
                        'FreeSpace' => '114151464960'
                      },
              'D:' => {
                        'Size' => '203921108992',
                        'FreeSpace' => '5515554816'
                      },
              'H:' => {
                        'Size' => '400086708224',
                        'FreeSpace' => '43128733696'
                      }
            };

    You could also use the .. operator:
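The code for the `..` variant is not shown here. As a rough sketch only (this is not ikegami's code), the range operator in scalar context can take the place of the `$in` flag in the loop above. The helper name `parse_disks` and the sample text are hypothetical, introduced just for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: parse WMI-style records from a list of lines and
# return a hash of hashes keyed by each record's Name field.
sub parse_disks {
    my @lines = @_;
    my ( %data, %rec );
    for (@lines) {
        # In scalar context the range operator acts as a flip-flop: it is
        # true from the line with the opening brace through the line with
        # the closing brace, and returns a sequence number while true.
        if ( my $pos = /^\s*\{/ .. /^\s*\}/ ) {
            if ( /^\s*(\S+)\s*=\s*"(.*)"/ ) {
                $rec{$1} = $2;
            }
            # On the last line of the range the sequence number ends in "E0".
            if ( $pos =~ /E0$/ ) {
                my $name = delete $rec{Name};
                $data{$name} = {%rec} if defined $name;
                %rec = ();
            }
        }
    }
    return \%data;
}

# Made-up sample in the format from the question, split into lines.
my @sample = split /^/, <<'END_OF_SAMPLE';
instance of Win32_LogicalDisk
{
    FreeSpace = "114151464960";
    Name = "C:";
    Size = "160031014912";
};
instance of Win32_LogicalDisk
{
    FreeSpace = "5515554816";
    Name = "D:";
    Size = "203921108992";
};
END_OF_SAMPLE

my $disks = parse_disks(@sample);
printf "%s free=%s size=%s\n", $_, $disks->{$_}{FreeSpace}, $disks->{$_}{Size}
    for sort keys %$disks;
```

The flip-flop keeps its state between lines, so no explicit flag is needed. This sketch assumes every record is complete, that is, every opening brace is eventually followed by a closing one.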

      Or you could avoid ifs and whiles completely:
      use strict;
      use warnings;

      use Parse::RecDescent ();

      my $grammar = <<'__END_OF_GRAMMAR__';

         {
            use strict;
            use warnings;

            sub dequote {
               local $_ = @_ ? $_[0] : $_;
               s/^"//;
               s/"$//;
               s/\\(.)/$1/sg;
               return $_;
            }
         }

         parse   : record(s?) /\Z/
                   { [ map @$_, @{$item[1]} ] }

         record  : 'instance' 'of' 'Win32_LogicalDisk' '{' field(s?) '}' ';'
                   {
                      my $name;
                      my %record;
                      %record = map @$_, @{$item[5]};
                      $name = delete($record{Name});
                      $name ? [ $name, \%record ] : undef
                   }

         field   : key '=' val ';'
                   { [ $item[1], $item[3] ] }

         key     : IDENT

         val     : QSTRING

         IDENT   : /\w+/

         QSTRING : /"(?:[^\\"]|\\.)*"/
                   { dequote($item[1]) }  # guessing.

      __END_OF_GRAMMAR__

      my $p = Parse::RecDescent->new($grammar)
         or die("Bad grammar\n");

      my $text;
      {
         local $/;
         $text = <DATA>;
      }

      my $data = $p->parse($text)
         or die("Bad data\n");

      require Data::Dumper;
      print(Data::Dumper::Dumper($data));

      Same output as the program in the parent post.

        Out of interest, are you doing something unusual here that makes this grammar run so slowly?

        For example, is your dequote sub being eval'd into existence every time it is used or something similar?

        Also, is there any particular benefit in doing

        require Data::Dumper; print(Data::Dumper::Dumper($data));

        Instead of the usual use Data::Dumper/print Dumper $data?


Re: Pulling data out of { }
by davido (Cardinal) on Jan 15, 2006 at 06:45 UTC

    Here's one way (tested):

    use strict;
    use warnings;
    use Data::Dumper;

    my %disks;

    {
        local $/ = "};";
        while( my $record = <DATA> ) {
            chomp $record;
            my %attribs;
            foreach my $line ( split /\n/, $record ) {
                next unless $line =~ m/=.+;/;
                chomp $line;
                $line =~ s/^\s+//;
                $line =~ s/\s*;\s*$//;
                my( $key, $value ) = split /\s*=\s*/, $line;
                $value =~ s/"//g;
                $attribs{$key} = $value;
            }
            next if not exists $attribs{'Name'};  # Just in case.  ;)
            $disks{ $attribs{ 'Name' } } = { %attribs };
        }
    }

    print Dumper \%disks;

    __DATA__
    instance of Win32_LogicalDisk
    {
        FreeSpace = "114151464960";
        Name = "C:";
        Size = "160031014912";
    };
    instance of Win32_LogicalDisk
    {
        FreeSpace = "5515554816";
        Name = "D:";
        Size = "203921108992";
    };
    instance of Win32_LogicalDisk
    {
        FreeSpace = "43128733696";
        Name = "H:";
        Size = "400086708224";
    };

    Updated: Tweaked code to provide HoH rather than AoH.


    Dave

Re: Pulling data out of { }
by QM (Parson) on Jan 15, 2006 at 15:29 UTC
    If your data is that regular, you shouldn't have any problems, given the excellent replies already.

    If your data is not as regular as you indicate, I'd go with something like ikegami's Parse::RecDescent solution, as it's more robust, and easier to modify (though if you don't understand how P::R works, you shouldn't blindly adopt it).

    What I didn't see (and I'm surprised you weren't chastised for it :), is code that you've already tried. While the replies you've received are top quality, you've been deprived of the experience of learning how to get there yourself. And I wallow in complete self-interest here, because if most of the Seekers Of Perl Wisdom didn't become Givers of Perl Wisdom, Perl Monks would soon become a desert of unanswered questions, along the lines of Earl Sinclair on Dinosaurs, when he starts a call-in TV show. Something like:

    Earl: "You're on Ask A Question! What's your question?"
    Caller: "What do they call those little plastic thingies on the end of shoelaces?"
    Earl: "Hey, that's a good question...Next caller!"

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

      I agree that it would have been nice to see what code the OP had worked on prior to posting. But you do have to give Felix2000 credit for this: The question began in the CB. I realized it would be better handled as a Seekers of Perl Wisdom post, and encouraged him to do a write-up, after reading Writeup Formatting Tips. Where credit is due is in the fact that for his very first post here at the Monastery he had the patience to read and comply with Writeup Formatting Tips before hastily posting his question.

      To Felix2000, welcome to the Monastery, and thanks for being one of the few newcomers whose first post here didn't require janitorial edits. In the future, also try to boil the code you've worked on down to a minimal length snippet and post that too, so we'll know what you've tried, and so we'll know we're assisting a learning process, which is what we love doing. Come back anytime, and often. ;)


      Dave

Re: Pulling data out of { }
by NetWallah (Canon) on Jan 15, 2006 at 17:25 UTC
    Adding my $0.02 to the well-done approaches to the problem above:

    The data looks so much like XML that I am tempted to convert it to XML, then use one of the many XML parser modules to extract from it.

    I'm too shameless, and lazy to actually develop and post code, particularly on a Sunday.

    I feel that the XML approach is scalable, and easily extendable to help parse the majority of Win32 performance info.
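A minimal sketch of just the conversion step, assuming the data always looks like the sample in the question. The helper name `wmi_to_xml` and the element and attribute names (`instances`, `instance`, `class`) are illustrative choices, not anything defined by WMI or by an XML module. The resulting text could then be handed to whichever XML parser you prefer.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite WMI-style "instance of" records as XML text.
# Assumes no "}" appears inside a quoted value.
sub wmi_to_xml {
    my ($text) = @_;
    my $xml = "<instances>\n";
    while ( $text =~ /instance \s+ of \s+ (\w+) \s* \{ (.*?) \} \s* ;/gsx ) {
        my ( $class, $body ) = ( $1, $2 );
        $xml .= qq{  <instance class="$class">\n};
        while ( $body =~ /(\w+) \s* = \s* "([^"]*)" \s* ;/gx ) {
            my ( $key, $value ) = ( $1, $2 );
            # Escape the characters that are special in XML text.
            for ($value) { s/&/&amp;/g; s/</&lt;/g; s/>/&gt;/g; }
            $xml .= "    <$key>$value</$key>\n";
        }
        $xml .= "  </instance>\n";
    }
    return $xml . "</instances>\n";
}

# Made-up sample in the format from the question.
my $sample = <<'END_OF_SAMPLE';
instance of Win32_LogicalDisk
{
    FreeSpace = "114151464960";
    Name = "C:";
    Size = "160031014912";
};
END_OF_SAMPLE

print wmi_to_xml($sample);
```

Each property becomes a child element named after its key, which works as long as the keys are plain identifiers, as they are in the sample.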

         You're just jealous cause the voices are only talking to me.

         No trees were killed in the sending of this message.    However, a large number of electrons were terribly inconvenienced.

Re: Pulling data out of { }
by Ace128 (Hermit) on Jan 16, 2006 at 07:07 UTC