To start: Perl is not my first language, though I can normally get by, and I am an XML novice. I have googled until my brain hurts and haven't found anything that really addresses my needs, so here I am. Thanks for any insight you can provide.

I've been given an assignment to capture specific data from XML files we receive daily from a vendor. The vendor provided the following XPaths as a reference for what I need to extract from each file (these are a few examples; there are actually several dozen).

//Party[@id=//Relation[child::RelationRoleCode[@tc='37']]/@RelatedObjectID]/Producer/CarrierAppointment/CompanyProducerID
Holding[main]/Policy/KeyedValue/KeyValue
substring(//OLifE/Party[@id=//OLifE/Relation[RelationRoleCode/@tc=8]/@RelatedObjectID]/Person/FirstName, 1, 30)
//OLifE/Party[@id=//OLifE/Relation[RelationRoleCode/@tc=8]/@RelatedObjectID]/Person/BirthDate
//OLifE/Holding[main]/Policy/KeyedValue[KeyName = 'SponsorName']/KeyValue

I was able to use XML::Parser with Data::Dumper to dump the whole file. That was not much help, but at least I knew I could read a file. Then I used this snippet to print all the tags, but it still doesn't help me get the data for the specified XPaths.

#!/usr/opt/perl5/bin/perl
use XML::Parser;

my $parser = new XML::Parser ();
$parser->setHandlers (
    Start   => \&Start_handler,
    End     => \&End_handler,
    Default => \&Default_handler
);

my $filename = shift;
die "Can't find '$filename': $!\n" unless -f $filename;
$parser->parsefile ($filename);

### HANDLERS ###
sub Start_handler {
    my $p  = shift;
    my $el = shift;
    print "START: <$el>\n";
    while (my $key = shift) {
        my $val = shift;
        print " $key = $val\n";
    }
    print "\n";
}
###
sub End_handler {
    my ($p,$el) = @_;
    print "END: </$el>\n";
}
###
sub Default_handler {
    my ($p,$str) = @_;
    # print " default handler found '$str'\n";
}

Here are snippets of a sample XML file:

<?xml version="1.0"?>
<TXLife xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="2.27.00" xmlns="http://ACORD.org/Standards/Life/2">
  <UserAuthRequest>
    <UserLoginName>userID</UserLoginName>
    <UserPswd>
      <CryptType>NONE</CryptType>
      <Pswd>!password</Pswd>
    </UserPswd>
    <VendorApp>
      <VendorName>Something Technologies</VendorName>
      <AppName>FireMeNow</AppName>
    </VendorApp>
  </UserAuthRequest>
  <TXLifeRequest id="TXLifeRequest_xxxxxxxx-5e38-411b-9f56-2f2c7474f5c5" PrimaryObjectID="Holding_aa2b0594-77d6-4264-b277-0218e852cb36">
    <TransRefGUID>ccb03d81-7772-4cae-bbf6-436cda593b31</TransRefGUID>
    <TransType tc="103">New Business Submission for a </TransType>
    <TransExeDate>2015-09-10</TransExeDate>
    <TransExeTime>11:45:29.1882187-06:00</TransExeTime>
    <OLifE>
      <Holding id="Holding_aa2b0594-77d6-4264-b277-0218e852cb36">
        <HoldingTypeCode tc="2">Policy</HoldingTypeCode>
        <CurrencyTypeCode tc="840">USD (United States Dollar)</CurrencyTypeCode>
        <Policy>
          <ProductType tc="10">Variable Annuity</ProductType>
          <ApplicationInfo>
            <TrackingID>243958e8-f013-4a9b-a8de-996c5f5c93ed</TrackingID>
          </ApplicationInfo>
          <FinancialActivity>
            <FinActivityType tc="7">Initial payment - This is the </FinActivityType>
            <Payment>
              <SourceOfFundsTC tc="32">Retirement</SourceOfFundsTC>
            </Payment>
          </FinancialActivity>
          <KeyedValue>
            <KeyName>AccountType</KeyName>
            <KeyValue>Individual</KeyValue>
          </KeyedValue>
          <KeyedValue>
            <KeyName>CheckIndicator</KeyName>
            <KeyValue>No</KeyValue>
          </KeyedValue>
          <KeyedValue>
            <KeyName>1035ExchangeIncluded</KeyName>
            <KeyValue>No</KeyValue>
          </KeyedValue>
          <KeyedValue>
            <KeyName>SponsorName</KeyName>
            <KeyValue>InsCompany</KeyValue>
          </KeyedValue>
        </Policy>
      </Holding>
      <Holding id="Holding_f2cfb5bd-6009-4bf2-95a0-686cb69a3b7a">
        <HoldingTypeCode tc="2">Policy</HoldingTypeCode>
        <CurrencyTypeCode tc="840">USD (United States Dollar)</CurrencyTypeCode>
        <AssetValue>79899.00</AssetValue>
        <Policy CarrierPartyID="Party_2eed3423-601b-4792-95d5-75f69e757e71">
          <Annuity>
            <SurrenderCharge>0.00</SurrenderCharge>
          </Annuity>
          <KeyedValue>
            <KeyName>OldProductType</KeyName>
            <KeyValue>401K</KeyValue>
          </KeyedValue>
        </Policy>
      </Holding>
      <Party id="Party_2eed3423-601b-4792-95d5-75f69e757e71">
        <PartyTypeCode tc="2">Organization</PartyTypeCode>
        <FullName>InsCompany</FullName>
        <Organization />
      </Party>
      <Party id="Party_2d205fbf-cadd-4475-9d51-a8a8aea1c625">
        <PartyTypeCode tc="1">Person</PartyTypeCode>
        <Person>
          <FirstName>TheFirstName</FirstName>
          <LastName>TheLastName</LastName>
        </Person>
        <Producer>
          <CarrierAppointment>
            <CompanyProducerID>459JQ</CompanyProducerID>
          </CarrierAppointment>
        </Producer>
      </Party>
      <Party id="Party_869c61bc-96f9-497d-ad5b-a4341aec5311">
        <PartyTypeCode tc="1">Person</PartyTypeCode>
        <GovtID>123-45-6789</GovtID>
        <Person>
          <FirstName>User</FirstName>
          <LastName>Test</LastName>
          <BirthDate>1966-01-22</BirthDate>
        </Person>
        <Address id="Address_8dc316c5-bcc3-45ff-b1c1-a15bc233c9d6">
          <AddressTypeCode tc="30">OLI_ADTYPE_PRIMARY</AddressTypeCode>
          <AddressStateTC tc="55">OLI_USA_VA</AddressStateTC>
          <AddressCountryTC tc="1">United States of America</AddressCountryTC>
        </Address>
        <KeyedValue>
          <KeyName>ForeignAddressInd</KeyName>
          <KeyValue>no</KeyValue>
        </KeyedValue>
        <KeyedValue>
          <KeyName>ForeignCitizenInd</KeyName>
          <KeyValue>yes</KeyValue>
        </KeyedValue>
      </Party>
      <Relation id="Relation_0e3f2e86-3851-401e-b069-25c31eb8d989" OriginatingObjectID="Holding_aa2b0594-77d6-4264-b277-0218e852cb36" RelatedObjectID="Holding_f2cfb5bd-6009-4bf2-95a0-686cb69a3b7a">
        <OriginatingObjectType tc="4">Holding</OriginatingObjectType>
        <RelatedObjectType tc="4">Holding</RelatedObjectType>
        <RelationRoleCode tc="64">OLI_REL_REPLACEDBY</RelationRoleCode>
      </Relation>
      <Relation id="Relation_daf1bb84-658a-4bad-aff7-86d5fc755101" OriginatingObjectID="Holding_aa2b0594-77d6-4264-b277-0218e852cb36" RelatedObjectID="Party_2d205fbf-cadd-4475-9d51-a8a8aea1c625">
        <OriginatingObjectType tc="4">Holding</OriginatingObjectType>
        <RelatedObjectType tc="6">OLI_PARTY</RelatedObjectType>
        <RelationRoleCode tc="37">OLI_REL_PRIMAGENT</RelationRoleCode>
      </Relation>
      <Relation id="Relation_e944395e-0545-48f5-a38d-95b7cc08e73c" OriginatingObjectID="Holding_aa2b0594-77d6-4264-b277-0218e852cb36" RelatedObjectID="Party_869c61bc-96f9-497d-ad5b-a4341aec5311">
        <OriginatingObjectType tc="4">Holding</OriginatingObjectType>
        <RelatedObjectType tc="6">OLI_PARTY</RelatedObjectType>
        <RelationRoleCode tc="8">OLI_REL_OWNER</RelationRoleCode>
      </Relation>
      <Relation id="Relation_80c5add6-3d38-4310-948c-79b442bc75ec" OriginatingObjectID="Holding_aa2b0594-77d6-4264-b277-0218e852cb36" RelatedObjectID="FormInstance_f76dede8-0d5e-43ae-8772-1d8aff465982">
        <OriginatingObjectType tc="6">OLI_PARTY</OriginatingObjectType>
        <RelatedObjectType tc="101">OLI_FORMINSTANCE</RelatedObjectType>
        <RelationRoleCode tc="107">OLI_REL_FORMFOR</RelationRoleCode>
      </Relation>
      <FormInstance id="FormInstance_f76dede8-0d5e-43ae-8772-1d8aff465982">
        <FormName>APP</FormName>
        <Attachment id="Attachment_a1ac9dd9-4f6a-4cc3-bf00-e3e6d483a09a">
          <DateCreated>2015-09-10</DateCreated>
          <AttachmentBasicType tc="1">OLI_LU_BASICATTMNTTY_TEXT</AttachmentBasicType>
          <Description>APP</Description>
          <AttachmentData></AttachmentData>
          <MimeTypeTC tc="17">OLI_INLINE</MimeTypeTC>
          <TransferEncodingTypeTC tc="4" />
          <AttachmentLocation tc="1" />
        </Attachment>
      </FormInstance>
    </OLifE>
    <OLifEExtension VendorCode="25" ExtensionCode="PROVIDER_VERSION">
      <PROVIDER_VERSION>2.6.0.361</PROVIDER_VERSION>
    </OLifEExtension>
  </TXLifeRequest>
</TXLife>
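For reference, here is a minimal sketch of how XPaths like the vendor's might be evaluated in Perl. It assumes the XML::LibXML module is installed (it is not what I used above) and runs against a cut-down copy of the sample rather than the real file. One thing that stands out in the sample is the default namespace on TXLife (xmlns="http://ACORD.org/Standards/Life/2"): in XPath 1.0, unprefixed names only match elements in no namespace, so the sketch registers a prefix and adds it to every element step. The prefix "a" and the shortened Party ids are my own placeholders, not something from the vendor.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Cut-down document in the same shape as the sample (ids shortened for illustration).
my $xml = <<'XML';
<?xml version="1.0"?>
<TXLife xmlns="http://ACORD.org/Standards/Life/2">
  <TXLifeRequest>
    <OLifE>
      <Party id="Party_A">
        <Person><FirstName>TheFirstName</FirstName></Person>
        <Producer>
          <CarrierAppointment>
            <CompanyProducerID>459JQ</CompanyProducerID>
          </CarrierAppointment>
        </Producer>
      </Party>
      <Party id="Party_B">
        <Person>
          <FirstName>User</FirstName>
          <BirthDate>1966-01-22</BirthDate>
        </Person>
      </Party>
      <Relation RelatedObjectID="Party_A">
        <RelationRoleCode tc="37">OLI_REL_PRIMAGENT</RelationRoleCode>
      </Relation>
      <Relation RelatedObjectID="Party_B">
        <RelationRoleCode tc="8">OLI_REL_OWNER</RelationRoleCode>
      </Relation>
    </OLifE>
  </TXLifeRequest>
</TXLife>
XML

# For a real file, load_xml(location => $filename) would replace the string form.
my $doc = XML::LibXML->load_xml(string => $xml);

# Bind a prefix (arbitrary; "a" here) to the document's default namespace.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(a => 'http://ACORD.org/Standards/Life/2');

# The vendor XPaths, with the prefix added to each element step.
my $producer_id = $xpc->findvalue(
    q{//a:Party[@id=//a:Relation[child::a:RelationRoleCode[@tc='37']]/@RelatedObjectID]}
  . q{/a:Producer/a:CarrierAppointment/a:CompanyProducerID}
);
my $first_name = $xpc->findvalue(
    q{substring(//a:OLifE/a:Party[@id=//a:OLifE/a:Relation[a:RelationRoleCode/@tc=8]}
  . q{/@RelatedObjectID]/a:Person/a:FirstName, 1, 30)}
);
my $birth_date = $xpc->findvalue(
    q{//a:OLifE/a:Party[@id=//a:OLifE/a:Relation[a:RelationRoleCode/@tc=8]}
  . q{/@RelatedObjectID]/a:Person/a:BirthDate}
);

print "CompanyProducerID: $producer_id\n";
print "FirstName:         $first_name\n";
print "BirthDate:         $birth_date\n";
```

The findvalue method returns the string value of whatever the expression selects, so it also works for the substring(...) form. With several dozen XPaths, keeping them in a hash of name => expression and looping over it would probably be easier to maintain than one variable per path.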

Any help would be greatly appreciated. Thank you


In reply to Xpath value query by Anonymous Monk
