Thank you for posting that. I believe I am getting my head wrapped around it. Basically, the XML that I am feeding into XML::LibXML->load_xml declares a namespace, and I need to register that namespace before querying; otherwise the XPath comparison is made against the null namespace, which is why I wasn't getting the result I expected. If the XML had no namespace, I wouldn't need to declare one, since comparing against the null namespace would match correctly.
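To make the null-namespace point concrete, here is a minimal sketch. The two-element sample document is made up for illustration; only the namespace URI and the element names come from this thread.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document; the namespace URI is the one from this thread.
my $xml = '<TMSMessage xmlns="http://xmlns.sony.net/d-cinema/tms-api/v1">'
        . '<EventInfo><EventId>42</EventId></EventInfo></TMSMessage>';
my $doc = XML::LibXML->load_xml(string => $xml);

# An unprefixed name in XPath 1.0 only matches elements in no namespace,
# so this finds nothing even though <EventInfo> is in the document.
my @plain = $doc->findnodes('//EventInfo');

# Bind a prefix (the name "pfx" is arbitrary) to the document's namespace URI.
my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs(pfx => 'http://xmlns.sony.net/d-cinema/tms-api/v1');
my @prefixed = $xc->findnodes('//pfx:EventInfo');

printf "unprefixed: %d, prefixed: %d\n", scalar @plain, scalar @prefixed;
```

The prefix in the XPath expression does not have to match anything in the XML; only the URI it is bound to matters.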

Your code does what I want, but for the sake of future usability I want to pull the namespace URI that gets bound to the pfx prefix out of the XML itself, store it in a variable, and use that variable to register the namespace.
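One possible way to get that value without string handling is to ask the parsed document for it. This is a sketch that assumes the namespace of interest is the one on the root element; root_namespace and event_ids are hypothetical helper names, and the sample documents are made up.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not from the original post): returns the namespace URI
# of the root element, or undef when the root is in no namespace.
sub root_namespace {
    my ($doc) = @_;
    return $doc->documentElement->namespaceURI;
}

# Hypothetical helper: collect every EventId value, with or without a namespace.
sub event_ids {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my $ns  = root_namespace($doc);
    my $xc  = XML::LibXML::XPathContext->new($doc);

    # Only register the prefix when there is a namespace to bind it to.
    my $has_ns = defined $ns && length $ns;
    $xc->registerNs(pfx => $ns) if $has_ns;
    my $path = $has_ns ? '//pfx:EventId' : '//EventId';
    return map { $_->textContent } $xc->findnodes($path);
}

my $with_ns = '<TMSMessage xmlns="http://xmlns.sony.net/d-cinema/tms-api/v1">'
            . '<EventInfo><EventId>42</EventId></EventInfo></TMSMessage>';
my $without = '<TMSMessage><EventInfo><EventId>7</EventId></EventInfo></TMSMessage>';

print join(',', event_ids($with_ns)), "\n";
print join(',', event_ids($without)), "\n";
```

Because the parser reports the namespace, the same code path covers both the namespaced and the un-namespaced input, and it does not depend on the quoting style or on xmlns being the last attribute of the root element.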

During testing I found that whitespace just inside the quotes of the namespace declaration causes issues, e.g. <TMSMessage xmlns=" http://xmlns.sony.net/d-cinema/tms-api/v1"> or <TMSMessage xmlns="http://xmlns.sony.net/d-cinema/tms-api/v1 ">. However, this should be caught by the server sending the XML to us, and I have not found any test data of this type, so for now I will not handle that error. A few test samples show that our logs sometimes have a blank namespace declaration, e.g. <TMSMessage xmlns=""> (this throws an XPath error if I try to process it, so I error out on finding this), or no namespace declaration at all, e.g. <TMSMessage> (this would not process with the code you provided as there is no namespace, so I have added an if branch to process it instead). I have edited the code to catch these cases and either process or error out.
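The cases described above (no declaration, blank declaration, whitespace-padded URI, usable URI) can be told apart with a small check on the extracted string. This is only a sketch: classify_namespace and its return labels are hypothetical, and it assumes undef stands for "no declaration found".

```perl
use strict;
use warnings;

# Hypothetical helper: classify an extracted namespace string.
# undef is taken to mean that no xmlns declaration was found.
sub classify_namespace {
    my ($ns) = @_;
    return 'none'   unless defined $ns;
    return 'blank'  if $ns !~ /\S/;          # empty, or only spaces/tabs
    return 'padded' if $ns =~ /\A\s|\s\z/;   # leading or trailing whitespace
    return 'ok';
}

for my $ns (undef, '', " \t",
            ' http://xmlns.sony.net/d-cinema/tms-api/v1',
            'http://xmlns.sony.net/d-cinema/tms-api/v1') {
    my $shown = defined $ns ? "'$ns'" : 'undef';
    printf "%-50s => %s\n", $shown, classify_namespace($ns);
}
```

A caller could then process 'none' and 'ok', and error out on 'blank' (and on 'padded', if that case ever shows up in real data).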

Updated code is below. It likely could still use some polishing, but it catches all the test conditions I currently want.

use strict;
use warnings;
use XML::LibXML;


    ## test script to verify the way xml is parsed ##
    ## $xmlfile is set to data of a format that would be expected to be passed to this if used in a loop ##
    my $namespace;
    my $goodnamespace=1;
    my $namespacedeclared=0;
    my $xmlfile = '<?xml version="1.0" encoding="UTF-8"?><TMSMessage xmlns="http://xmlns.sony.net/d-cinema/tms-api/v1"><MessageHeader><Id>-1</Id><Type>ListPerformances</Type>
...etc
</TMSMessage>';
    ## if a namespace is declared we want to extract it ##
    if ($xmlfile =~ m/xmlns\=/) {
        ## since we have a namespace, set $namespacedeclared to 1 ##
        $namespacedeclared=1;
        print "There is a namespace declared in the xmlfile.  Extracting it.\n";
        ## split the xmlfile on the xmlns declaration.  include the " so that we can discard it ##
        my @namespacesplit=split(/xmlns\=\"/,$xmlfile);
        ## $namespacesplit[0] will contain everything before the declaration.  [1] will start with the namespace. ##
        ## I like to declare a new temp variable to use before feeding into the next split ##
        my $tempnamespace=$namespacesplit[1];
        ## split our return again so that everything after the namespace gets split from the namespace ##
        my @finalnamespacesplit=split(/">/,$tempnamespace);
        ## $finalnamespacesplit[0] should hold the namespace.  I have found that if the namespace is declared ##
        ## but is empty, or is only spaces/tabs, it is invalid and won't work when processing the xml.  Test for this ##
        $namespace=$finalnamespacesplit[0];
        my $lengthtest=$namespace;
        ## remove all whitespace ##
        $lengthtest =~ s/\s+//g;
        ## check the length of the namespace once all whitespace is removed ##
        my $testlength = length ($lengthtest);
        if ($testlength) {
            ## if the length after removing whitespace is greater than zero, we likely have a good namespace ##
            $goodnamespace=1;
        }else{
            ## if the length is zero then there were no non-whitespace characters in the namespace ##
            $goodnamespace=0;
        }
    }else{
        print "There is not a namespace declared in the xmlfile.\n";  ## edit when added to final program ##
    }
    
    ## if a namespace is declared and if that namespace is good ##
    if ($namespacedeclared==1 && $goodnamespace==1) {
        my $doc = XML::LibXML->load_xml(string => $xmlfile);
        print "Setting the namespace. And processing the XML.\n";     ## edit when added to final program ##
        my $xc = XML::LibXML::XPathContext->new( $doc->documentElement() );
        ## binds the prefix "pfx" to the namespace URI held in $namespace ##
        $xc->registerNs( pfx => $namespace );
        my @events = $xc->findnodes('//pfx:EventInfo');
        foreach my $event (@events) {
            my $id = $xc->findvalue('pfx:EventId', $event);
            print "EventId: $id\n";
            my @packs = $xc->findnodes('pfx:PreshowPackList/pfx:PreshowPack', $event);
            if ( not @packs ) {
                print "No Packs\n";
                next;
            }
            for my $pack (@packs) {
                my $pack_id = $xc->findvalue('pfx:PackId', $pack) || "No PackId";
                print "PackId: $pack_id\n";
            }
        }
    }elsif (!$namespace && $namespacedeclared==0) {
        ## if namespace is null and there was not a namespace declared ##
        my $doc = XML::LibXML->load_xml(string => $xmlfile);
        print "No namespace set.  Processing XML without.\n";         ## edit when added to final program ##
        my @events = $doc->findnodes('//EventInfo');
        foreach my $event (@events) {
            my $id = $event->findvalue('./EventId');
            print "EventId: $id\n";
            my @packs = $event->findnodes('./PreshowPackList/PreshowPack');
            if ( not @packs ) {
                print "No Packs\n";
                next;
            }
            for my $pack (@packs) {
                #print $pack;
                my $pack_id = $pack->findvalue('PackId') || "No PackId";
                print "PackId: $pack_id\n";
            }
        }
    }elsif ($namespacedeclared==1 && $goodnamespace==0) {
        ## namespace was declared, but was not a valid namespace: empty or just a combination of spaces/tabs ##
        print "There was a namespace set in the XML file but it is a null value.  Unable to continue with current code.\n";  ## edit when added to final program ##
        #return (0);                    ## Will add back when pushed into sub in main program ##
    }
    #return(1);                            ## will add back when pushed into main program ##

In reply to Re^2: XML Parsing Problems by taralon
in thread XML Parsing Problems by taralon
