Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I'm trying to loop through an XML version of a play and find the speeches by one character (to be queried from a form later). I've got XML::LibXML reading the nodes, and I'm trying to build a hash that pairs the lines I find with their scene, to get:
Act 1 Scene 1
blah blah
Act 1 Scene 2
blah blah
and so on for all results so the user knows which part of the play the speech is in. My code looks like:
    use strict;
    use warnings;
    use XML::LibXML;

    my $searchterm="THESEUS";
    my $file="c:\\webroot\\dream.xml";

    #Get the XML file
    my $parse = XML::LibXML->new;
    my $doc = $parse->parse_file($file);
    my %releventlines;

    #Search through and find the correct field
    my %act;
    my $scene = $doc->findvalue('//PLAY//ACT//TITLE');
    foreach $scene (keys %act) {
        my $item = $doc->findnodes('//PLAY//SCENE//SPEECH');
        #Finding the speaker
        my $speech = $item->findvalue('SPEAKER');
        next unless $speech eq $searchterm;
        my $text = $item->findvalue('LINE');
        $text = $act{$scene};
    }
    print $scene . "\n";
and the XML is along the lines of:
    <PLAY>
    <ACT><TITLE>Act 1</TITLE></ACT>
    <SCENE><TITLE>SCENE I. Athens. The palace of THESEUS.</TITLE>
    <SPEECH>
    <SPEAKER>THESEUS</SPEAKER>
    <LINE>Now, fair Hippolyta, our nuptial hour</LINE>
    <LINE>Draws on apace; four happy days bring in</LINE>
    </SPEECH>
I'd be grateful for some advice as to where I've gone wrong (and I hope not too horrifically). Thanks.

Replies are listed 'Best First'.
Re: Setting up a hash in a linear search through XML
by Fletch (Bishop) on Oct 29, 2008 at 14:15 UTC

    The most immediate problem that jumps out is that you declare %act but never put anything in it, and then you attempt to iterate over its keys, so the loop body never runs. On the line after the declaration you set $scene, your iteration variable, to what you presumably wanted stored in %act.
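
    To see why the loop body never runs, here is a minimal sketch in plain Perl. The scene keys in the second half are made up for illustration and are not taken from your file.

```perl
use strict;
use warnings;

my %act;                 # declared, but nothing is ever stored in it
my $iterations = 0;

# keys %act is an empty list here, so the loop body is skipped entirely
for my $scene (keys %act) {
    $iterations++;
}
print "empty hash: $iterations iterations\n";

# Once the hash has entries, the same kind of loop visits each key
%act = ('SCENE I' => [], 'SCENE II' => []);   # hypothetical keys
my $count = 0;
$count++ for keys %act;
print "populated hash: $count iterations\n";
```

    The first loop reports zero iterations and the second reports two, which is why the hash has to be filled before it is walked.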

    The cake is a lie.

Re: Setting up a hash in a linear search through XML
by JadeNB (Chaplain) on Oct 29, 2008 at 23:05 UTC
    Fletch has got at the heart of your problem. I'd add three things to his observation.
    • The first is that you ask where you've gone wrong, but don't include the observed output, so that it's hard to tell. (It's not even clear what the desired output should be—it seems to me that you're ignoring the text completely, and only ever dumping the scene name.)
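    • The second is that you are assigning the result of ->findnodes, which returns a list of nodes in list context, to a scalar, $item. In scalar context XML::LibXML hands back a single XML::LibXML::NodeList object covering every match rather than one SPEECH node, which is almost surely not what you want.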
    • The third is that, at the end of your foreach loop, after assigning to $text, you immediately overwrite the value with whatever is in $act{$scene} (nothing, as Fletch observed). Did you mean the assignment to go the other way? If so, you probably really meant to push onto some sort of array reference, since otherwise you'll get only one bit of text per scene.
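
    The last two points can be sketched together. This is only an illustration under assumptions: the inline XML mimics the shape of your sample, the second speech and its "blah blah" line are placeholders, and %lines_by_speaker is a name I made up.

```perl
use strict;
use warnings;
use XML::LibXML;

# Inline sample in the shape of the question's XML; the second speech is a placeholder
my $xml = <<'XML';
<PLAY>
<SCENE><TITLE>SCENE I. Athens. The palace of THESEUS.</TITLE>
<SPEECH><SPEAKER>THESEUS</SPEAKER><LINE>Now, fair Hippolyta, our nuptial hour</LINE><LINE>Draws on apace; four happy days bring in</LINE></SPEECH>
<SPEECH><SPEAKER>HIPPOLYTA</SPEAKER><LINE>blah blah</LINE></SPEECH>
</SCENE>
</PLAY>
XML

my $doc = XML::LibXML->new->parse_string($xml);

# Scalar context: one NodeList object, not an individual SPEECH node
my $list = $doc->findnodes('//SPEECH');
print ref($list), " holding ", $list->size, " nodes\n";

# List context: the nodes themselves, one per iteration
my %lines_by_speaker;
for my $speech ($doc->findnodes('//SPEECH')) {
    my $speaker = $speech->findvalue('SPEAKER');
    # push appends; a plain assignment would keep only the last value
    push @{ $lines_by_speaker{$speaker} },
        map { $_->to_literal } $speech->findnodes('LINE');
}
print scalar @{ $lines_by_speaker{THESEUS} }, " lines for THESEUS\n";
```

    Iterating over findnodes in list context gives you one node at a time to call findvalue on, and push builds up the array reference that a plain assignment would keep overwriting.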
    I'm also confused by your XML. Will it ever contain more than one act? If so, how do you know when the act ends (since the ACT node seems to end with the title)? I'm going to assume here that you've got only one act (and one play) per XML file, and suggest something that might work. (My XPath is rusty, so take this with a healthy grain of salt.)
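    for my $scene ( $doc->findnodes('/PLAY/SCENE') ) {
        # Key on the scene title; a node object used as a hash key
        # would stringify to an opaque reference string.
        my $title = $scene->findvalue('TITLE');
        ITEM:
        for my $item ( $scene->findnodes('SPEECH') ) {
            my $speaker = $item->findvalue('SPEAKER');
            next ITEM unless $speaker eq $searchterm;
            # One array ref of lines per matching speech
            push @{ $act{$title} }, [ map { $_->to_literal } $item->findnodes('LINE') ];
        }
    }
    You could dump this with something like
    for my $title ( keys %act ) {
        print "$title\n";
        # Trailing newline keeps the next speech from running onto the last line
        print "$searchterm:\n", join("\n", @$_), "\n" for @{ $act{$title} };
    }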
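
    If the file does turn out to hold more than one act, one possible approach is to walk the children of PLAY in document order and remember the last ACT title seen. This is a sketch only. It assumes ACT and SCENE are siblings directly under PLAY, as in the posted sample, and that each ACT element marks where an act begins. The second scene, its "blah blah" line, and the @results structure are invented for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

my $searchterm = 'THESEUS';

# Inline sample; the second scene and its line are placeholders
my $xml = <<'XML';
<PLAY>
<ACT><TITLE>Act 1</TITLE></ACT>
<SCENE><TITLE>SCENE I. Athens. The palace of THESEUS.</TITLE>
<SPEECH><SPEAKER>THESEUS</SPEAKER>
<LINE>Now, fair Hippolyta, our nuptial hour</LINE>
<LINE>Draws on apace; four happy days bring in</LINE>
</SPEECH>
</SCENE>
<SCENE><TITLE>SCENE II</TITLE>
<SPEECH><SPEAKER>THESEUS</SPEAKER><LINE>blah blah</LINE></SPEECH>
</SCENE>
</PLAY>
XML

my $doc = XML::LibXML->new->parse_string($xml);

my @results;          # [ heading, [ lines ] ] pairs, kept in document order
my $act = '';         # title of the most recent ACT seen

# '/PLAY/*' yields the element children of PLAY in document order
for my $node ($doc->findnodes('/PLAY/*')) {
    if ($node->nodeName eq 'ACT') {
        $act = $node->findvalue('TITLE');
        next;
    }
    next unless $node->nodeName eq 'SCENE';

    # Heading combines the current act (if any) with the scene title
    my $heading = join ', ', grep { length } $act, $node->findvalue('TITLE');
    for my $speech ($node->findnodes('SPEECH')) {
        next unless $speech->findvalue('SPEAKER') eq $searchterm;
        push @results,
            [ $heading, [ map { $_->to_literal } $speech->findnodes('LINE') ] ];
    }
}

# Print each matching speech under its act and scene heading
for my $result (@results) {
    my ($heading, $lines) = @$result;
    print "$heading\n", map { "  $_\n" } @$lines;
}
```

    An array of pairs is used instead of a hash here so the speeches come out in the order they appear in the play, which keys on a hash would not guarantee.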
      Many thanks for this and to both of you for explaining where I've gone wrong so I can learn from it.
        It should be noted that, despite the errors we've pointed out, you did one thing very much right: You started with use strict; use warnings;. I'd be willing to bet that that played some role in helping you remember the my %act declaration, so already you've seen the usefulness of it.