pineapples has asked for the wisdom of the Perl Monks concerning the following question:

Perl Monks, I am trying to use Perl and XML::LibXML to conditionally replace values in XML tags while preserving the order of the XML input file (which is why I cannot use XML::Simple). Consider the following XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE POWERMART SYSTEM "powrmart.dtd">
<POWERMART CREATION_DATE="10/19/2010 07:15:27" REPOSITORY_VERSION="179.88">
<REPOSITORY NAME="IT_REP" VERSION="179" CODEPAGE="Latin1" DATABASETYPE="DB2">
<FOLDER NAME="CADM" GROUP="" OWNER="Administrator" SHARED="NOTSHARED" DESCRIPTION="" PERMISSIONS="rwx---r-x" UUID="ad49ad64-d86a-4823-9c04-2d881fe79240">
    <MAPPING DESCRIPTION ="" ISVALID ="YES" NAME ="m_EQ_Hist_Alloc_Attrib_build" OBJECTVERSION ="1" VERSIONNUMBER ="1">
        <TRANSFORMATION DESCRIPTION ="" NAME ="expr_seq_alloc_attrib_id_RPT" OBJECTVERSION ="1" REUSABLE ="NO" TYPE ="Expression" VERSIONNUMBER ="1">
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="" DESCRIPTION ="" NAME ="NEXTVAL" PICTURETEXT ="" PORTTYPE ="INPUT" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="IIF(SEQ_START_VAL= 0, (IIF(ISNULL(:LKP.LKP_T_Alloc_Attrib_Seq(1)), 0, :LKP.LKP_T_Alloc_Attrib_Seq(1))), SEQ_START_VAL)" EXPRESSIONTYPE ="GENERAL" NAME ="SEQ_START_VAL" PICTURETEXT ="" PORTTYPE ="LOCAL VARIABLE" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="ERROR(&apos;transformation error&apos;)" DESCRIPTION ="" EXPRESSION ="SEQ_START_VAL+NEXTVAL" EXPRESSIONTYPE ="GENERAL" NAME ="OUT_NEXTVAL" PICTURETEXT ="" PORTTYPE ="OUTPUT" PRECISION ="19" SCALE ="0"/>
            <TABLEATTRIBUTE NAME ="Tracing Level" VALUE ="Normal"/>
            <METADATAEXTENSION DATATYPE ="STRING" DESCRIPTION ="" DOMAINNAME ="User Defined Metadata Domain" ISCLIENTEDITABLE ="YES" ISCLIENTVISIBLE ="YES" ISREUSABLE ="NO" ISSHAREREAD ="NO" ISSHAREWRITE ="NO" MAXLENGTH ="256" NAME ="Blank" VALUE ="&apos;&apos;" VENDORNAME ="INFORMATICA"/>
        </TRANSFORMATION>
        <TRANSFORMATION DESCRIPTION ="" NAME ="expr_seq_alloc_attrib_id_REV_HIST" OBJECTVERSION ="1" REUSABLE ="NO" TYPE ="Sequence" VERSIONNUMBER ="1">
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="" DESCRIPTION ="" NAME ="NEXTVAL" PICTURETEXT ="" PORTTYPE ="INPUT" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="IIF(SEQ_START_VAL= 0, (IIF(ISNULL(:LKP.LKP_T_Alloc_Attrib_Seq(1)), 0, :LKP.LKP_T_Alloc_Attrib_Seq(1))), SEQ_START_VAL)" EXPRESSIONTYPE ="GENERAL" NAME ="SEQ_START_VAL" PICTURETEXT ="" PORTTYPE ="LOCAL VARIABLE" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="ERROR(&apos;transformation error&apos;)" DESCRIPTION ="" EXPRESSION ="SEQ_START_VAL+NEXTVAL" EXPRESSIONTYPE ="GENERAL" NAME ="OUT_NEXTVAL" PICTURETEXT ="" PORTTYPE ="OUTPUT" PRECISION ="19" SCALE ="0"/>
            <TABLEATTRIBUTE NAME ="Tracing Level" VALUE ="Normal"/>
            <METADATAEXTENSION DATATYPE ="STRING" DESCRIPTION ="" DOMAINNAME ="User Defined Metadata Domain" ISCLIENTEDITABLE ="YES" ISCLIENTVISIBLE ="YES" ISREUSABLE ="NO" ISSHAREREAD ="NO" ISSHAREWRITE ="NO" MAXLENGTH ="256" NAME ="Blank" VALUE ="&apos;&apos;" VENDORNAME ="INFORMATICA"/>
        </TRANSFORMATION>
    </MAPPING>
</FOLDER>
</REPOSITORY>
</POWERMART>
I want to find the transformations where TYPE ="Sequence" and change the DATATYPE of every TRANSFORMFIELD in those transformations to "integer", i.e. in the second set of TRANSFORMATION tags. So the above XML should become:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE POWERMART SYSTEM "powrmart.dtd">
<POWERMART CREATION_DATE="10/19/2010 07:15:27" REPOSITORY_VERSION="179.88">
<REPOSITORY NAME="IT_REP" VERSION="179" CODEPAGE="Latin1" DATABASETYPE="DB2">
<FOLDER NAME="CADM" GROUP="" OWNER="Administrator" SHARED="NOTSHARED" DESCRIPTION="" PERMISSIONS="rwx---r-x" UUID="ad49ad64-d86a-4823-9c04-2d881fe79240">
    <MAPPING DESCRIPTION ="" ISVALID ="YES" NAME ="m_EQ_Hist_Alloc_Attrib_build" OBJECTVERSION ="1" VERSIONNUMBER ="1">
        <TRANSFORMATION DESCRIPTION ="" NAME ="expr_seq_alloc_attrib_id_RPT" OBJECTVERSION ="1" REUSABLE ="NO" TYPE ="Expression" VERSIONNUMBER ="1">
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="" DESCRIPTION ="" NAME ="NEXTVAL" PICTURETEXT ="" PORTTYPE ="INPUT" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="IIF(SEQ_START_VAL= 0, (IIF(ISNULL(:LKP.LKP_T_Alloc_Attrib_Seq(1)), 0, :LKP.LKP_T_Alloc_Attrib_Seq(1))), SEQ_START_VAL)" EXPRESSIONTYPE ="GENERAL" NAME ="SEQ_START_VAL" PICTURETEXT ="" PORTTYPE ="LOCAL VARIABLE" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="bigint" DEFAULTVALUE ="ERROR(&apos;transformation error&apos;)" DESCRIPTION ="" EXPRESSION ="SEQ_START_VAL+NEXTVAL" EXPRESSIONTYPE ="GENERAL" NAME ="OUT_NEXTVAL" PICTURETEXT ="" PORTTYPE ="OUTPUT" PRECISION ="19" SCALE ="0"/>
            <TABLEATTRIBUTE NAME ="Tracing Level" VALUE ="Normal"/>
            <METADATAEXTENSION DATATYPE ="STRING" DESCRIPTION ="" DOMAINNAME ="User Defined Metadata Domain" ISCLIENTEDITABLE ="YES" ISCLIENTVISIBLE ="YES" ISREUSABLE ="NO" ISSHAREREAD ="NO" ISSHAREWRITE ="NO" MAXLENGTH ="256" NAME ="Blank" VALUE ="&apos;&apos;" VENDORNAME ="INFORMATICA"/>
        </TRANSFORMATION>
        <TRANSFORMATION DESCRIPTION ="" NAME ="expr_seq_alloc_attrib_id_REV_HIST" OBJECTVERSION ="1" REUSABLE ="NO" TYPE ="Sequence" VERSIONNUMBER ="1">
            <TRANSFORMFIELD DATATYPE ="integer" DEFAULTVALUE ="" DESCRIPTION ="" NAME ="NEXTVAL" PICTURETEXT ="" PORTTYPE ="INPUT" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="integer" DEFAULTVALUE ="" DESCRIPTION ="" EXPRESSION ="IIF(SEQ_START_VAL= 0, (IIF(ISNULL(:LKP.LKP_T_Alloc_Attrib_Seq(1)), 0, :LKP.LKP_T_Alloc_Attrib_Seq(1))), SEQ_START_VAL)" EXPRESSIONTYPE ="GENERAL" NAME ="SEQ_START_VAL" PICTURETEXT ="" PORTTYPE ="LOCAL VARIABLE" PRECISION ="19" SCALE ="0"/>
            <TRANSFORMFIELD DATATYPE ="integer" DEFAULTVALUE ="ERROR(&apos;transformation error&apos;)" DESCRIPTION ="" EXPRESSION ="SEQ_START_VAL+NEXTVAL" EXPRESSIONTYPE ="GENERAL" NAME ="OUT_NEXTVAL" PICTURETEXT ="" PORTTYPE ="OUTPUT" PRECISION ="19" SCALE ="0"/>
            <TABLEATTRIBUTE NAME ="Tracing Level" VALUE ="Normal"/>
            <METADATAEXTENSION DATATYPE ="STRING" DESCRIPTION ="" DOMAINNAME ="User Defined Metadata Domain" ISCLIENTEDITABLE ="YES" ISCLIENTVISIBLE ="YES" ISREUSABLE ="NO" ISSHAREREAD ="NO" ISSHAREWRITE ="NO" MAXLENGTH ="256" NAME ="Blank" VALUE ="&apos;&apos;" VENDORNAME ="INFORMATICA"/>
        </TRANSFORMATION>
    </MAPPING>
</FOLDER>
</REPOSITORY>
</POWERMART>
I came up with the code below. I seem to be able to identify the transformation type, but after that I cannot reach the child nodes to check the DATATYPE. I am a beginner to Perl, so any help would be greatly appreciated, even if it looks obvious. Thank you very much!
my $parser = XML::LibXML->new();
my $tree = $parser->parse_file($inputfile);
my $root = $tree->getDocumentElement;
my $sequence = 'Sequence';
my $querysequence = '//REPOSITORY/FOLDER/MAPPING/TRANSFORMATION/@TYPE';
my @typenodes = $root->findnodes($querysequence);
foreach my $sequencetransform (@typenodes) {
    my $literal = $sequencetransform->to_literal;
    #print $literal;
    if ( $literal == 'Sequence' ) {
        print $literal;
        my @datatypenodes = $sequencetransform->findnodes('TRANSFORMFIELD/@DATATYPE');
        foreach my $sequencetransform_data (@datatypenodes) {
            print "am i here?";
            print Dumper $sequencetransform->findnodes('TRANSFORMFIELD/@DATATYPE');
            #my $literal1 = $sequencetransform_data->to_literal;
            print $literal1;
            if ( $literal1 == 'bigint' ) {
                $sequencetransform_data->setValue('integer');
            }
        }
    }
}

Replies are listed 'Best First'.
Re: Perl and Lib::XML usage
by ikegami (Patriarch) on Oct 25, 2010 at 21:43 UTC

    You are searching for a subset of

    .../MAPPING/TRANSFORMATION/@TYPE/TRANSFORMFIELD/@DATATYPE

    An attribute cannot have a child element, so that can't possibly match. Solution:

    for ($root->findnodes(
        '/POWERMART/REPOSITORY/FOLDER/MAPPING'
      . '/TRANSFORMATION[@TYPE="Sequence"]'
      . '/TRANSFORMFIELD[@DATATYPE="bigint"]'
    )) {
        $_->setAttribute('DATATYPE', 'integer');
    }

    Square brackets (predicates) filter the nodes matched by the step they follow, like a WHERE clause in SQL.
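
    To see the predicates at work without the full export, here is a minimal self-contained sketch. The inline XML is a hypothetical, cut-down sample (names such as m_demo, t_expr and t_seq are made up), and it assumes the attribute to change is DATATYPE, as in the question:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical, cut-down sample; the real export has many more attributes.
my $xml = <<'XML';
<POWERMART>
  <REPOSITORY NAME="IT_REP">
    <FOLDER NAME="CADM">
      <MAPPING NAME="m_demo">
        <TRANSFORMATION NAME="t_expr" TYPE="Expression">
          <TRANSFORMFIELD DATATYPE="bigint" NAME="NEXTVAL"/>
        </TRANSFORMATION>
        <TRANSFORMATION NAME="t_seq" TYPE="Sequence">
          <TRANSFORMFIELD DATATYPE="bigint" NAME="NEXTVAL"/>
          <TRANSFORMFIELD DATATYPE="string" NAME="LABEL"/>
        </TRANSFORMATION>
      </MAPPING>
    </FOLDER>
  </REPOSITORY>
</POWERMART>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# The first predicate keeps only the Sequence transformations,
# the second keeps only their fields whose DATATYPE is bigint.
my @fields = $doc->findnodes(
    '/POWERMART/REPOSITORY/FOLDER/MAPPING'
  . '/TRANSFORMATION[@TYPE="Sequence"]'
  . '/TRANSFORMFIELD[@DATATYPE="bigint"]'
);

# Attributes are changed on the element that owns them.
$_->setAttribute('DATATYPE', 'integer') for @fields;

# Element order is untouched; only the selected attribute values differ.
print $doc->toString;
```

    With a real file you would parse it with parse_file as in the question and could write the result back with $doc->toFile($outputfile), where $outputfile is a name of your choosing.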

      Thanks a bunch, Perl Monks. This particular solution worked a treat. But thanks to all who contributed; each one of you had valid comments that I can learn from.
Re: Perl and Lib::XML usage
by chromatic (Archbishop) on Oct 26, 2010 at 06:50 UTC
    if ( $literal == 'Sequence' )

    String equality (eq) will work much better.

      Just want to add a line. The expression if ( $literal == 'Sequence' ) is true whenever $literal cannot be converted to a numeric value: the == operator converts both operands to numbers, so both become 0. A warning is given if use warnings is in effect.
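
      A small sketch of the difference, using the value 'Expression' as a stand-in for a TYPE that should not match (the variable names $numeric_match and $string_match are only for illustration):

```perl
use strict;
use warnings;

my $literal = 'Expression';

# Numeric comparison: both strings become 0 in numeric context, so this is true.
# The "isn't numeric" warning is silenced here only to keep the demo output clean.
my $numeric_match = do {
    no warnings 'numeric';
    $literal == 'Sequence';
};

# String comparison: compares the characters, so this is false.
my $string_match = $literal eq 'Sequence';

printf "==: %s\n", $numeric_match ? 'true' : 'false';
printf "eq: %s\n", $string_match  ? 'true' : 'false';
```

      This prints "==: true" followed by "eq: false", which is why the original loop treated every transformation as a Sequence.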
Re: Perl and Lib::XML usage
by NetWallah (Canon) on Oct 25, 2010 at 19:48 UTC
    Your XML tag:
    <MAPPING DESCRIPTION ="" ISVALID ="YES" NAME ="m_EQ_Hist_Alloc_Attrib_build" OBJECTVERSION ="1" VERSIONNUMBER ="1">
    seems not to be closed. Here is the error I get:
    Entity: line 21: parser error : Opening and ending tag mismatch: MAPPING line 0 and FOLDER
    </FOLDER>
    ^
    at libxmltest.pl line ...
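
    If you want to check a sample for this kind of problem before posting it, a minimal sketch is to wrap the parse in eval and print whatever XML::LibXML reports. The broken input below is hypothetical and only mimics an unclosed MAPPING tag:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical broken input: MAPPING is opened but never closed.
my $broken = '<FOLDER><MAPPING NAME="m_demo"></FOLDER>';

# The parser dies on input that is not well-formed, so trap the exception.
my $doc   = eval { XML::LibXML->load_xml(string => $broken) };
my $error = $@;

if ($error) {
    print "Not well-formed:\n$error";
} else {
    print "Parsed OK\n";
}
```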

         Syntactic sugar causes cancer of the semicolon.        --Alan Perlis

      Apologies, I have updated the XML.
Re: Perl and Lib::XML usage
by choroba (Cardinal) on Oct 26, 2010 at 08:17 UTC
    Things get much simpler when using XML::XSH2. In fact, you would just need:
    open 867322.xml ;
    for /POWERMART/REPOSITORY/FOLDER/MAPPING/TRANSFORMATION[@TYPE="Sequence"]/TRANSFORMFIELD
        set @DATATYPE "integer" ;
    save :b ;
Re: Perl and Lib::XML usage
by aquarium (Curate) on Oct 25, 2010 at 23:09 UTC
    Almost feels like a job for dusting off an XSL processor instead of writing Perl code.
    the hardest line to type correctly is: stty erase ^H