jbush82 has asked for the wisdom of the Perl Monks concerning the following question:

I've been struggling with XML::Twig for some time now and I'm posting here to hopefully get some clarification. I'm trying to read an XML document, but I haven't been able to access the data the way I want to.

The following is an example of the XML data I'm reading:

<?xml version="1.0" encoding="UTF-8"?>
<Etailer_Catalog_xml>
  <product>
    <STOCK_CODE>010003</STOCK_CODE>
    <UPC>705077000440</UPC>
    <Basic_Description>GABA 100GM</Basic_Description>
    <Brand>AST Sports Science</Brand>
    <ProductLine/>
    <ItemName>GABA</ItemName>
    <Title>AST Sports Science GABA</Title>
    <Flavor/>
    <Supplier_Number>1</Supplier_Number>
    <Supplier_Name>AST SPORTS SCIENCE</Supplier_Name>
    <Primary_Category>Amino Acids</Primary_Category>
    <General_Category>Supplements</General_Category>
    <WHOLESALE_PRICE>17.47</WHOLESALE_PRICE>
    <RETAIL_PRICE>24.95</RETAIL_PRICE>
    <LIST_DATE>2004-05-12</LIST_DATE>
    <DISC/>
    <CLT_QOH>Yes</CLT_QOH>
    <FRE_QOH>No</FRE_QOH>
    <MES_QOH>No</MES_QOH>
    <STR_QOH>No</STR_QOH>
    <WND_QOH>No</WND_QOH>
    <ORL_QOH>No</ORL_QOH>
    <HasNutrition>1</HasNutrition>
    <ValuePreparedCount>0</ValuePreparedCount>
    <Address>120 Capital Drive Golden, CO 80401</Address>
    <Copyright>2007 AST Sports Science, Inc.</Copyright>
    <ItemSize>100</ItemSize>
    <ItemMeasure>g</ItemMeasure>
    <Height>4.625</Height>
    <Width>2.375</Width>
    <Depth>2.375</Depth>
    <ProductWeight/>
    <MASS>0.313</MASS>
    <ExtendedSize>100 g (3.53 oz)</ExtendedSize>
    <CASE_QUANTITY>12</CASE_QUANTITY>
    <Description>GABA, Growth Hormone Potentiator</Description>
    <ProductDetails>GABA is an amino acid classified as a neurotransmitter. Studies have shown GABA to play a key role in the secretion of Growth Hormone. The principle anabolic actions of Growth Hormone including the stimulation of amino acid transport, simulation of protein synthesis reduction of body-fat and the proliferation of cell growth. AST Sports Science selectively imports GABA under rigid quality control conditions. Each batch is HPLC Certified and Laboratory Tested for purity and potency. GABA is a naturally occurring amino acid classified as a neurotransmitter. Some individuals may experience a minor tingling of skin and/ or slight shortness of breath shortly after taking GABA. This is characteristic of this amino acid and quickly subsides.</ProductDetails>
    <Directions>For adults only. As a dietary supplement, take three to five grams mixed with 8 ounce of water, juice or protein shake approximately 30 minutes before sleep.</Directions>
    <Ingredients/>
    <DrugInteractions/>
    <Warnings/>
    <PostDate>2009-08-25</PostDate>
    <HTML>http://www.ast-ss.com</HTML>
    <thumbnail_url>http://www.europadatafeed.com/images/50/705077000440.gif</thumbnail_url>
    <image_url>http://www.europadatafeed.com/images/250/705077000440.jpg</image_url>
    <logo_url>http://www.europadatafeed.com/images/logos/ast.gif</logo_url>
    <image500_URL>http://www.europadatafeed.com/images/500/705077000440.jpg</image500_URL>
    <MAP_Price/>
    <image_name>705077000440</image_name>
    <image100_URL>http://www.europadatafeed.com/images/100/705077000440.jpg</image100_URL>
    <NUTRIENTS> </NUTRIENTS>
  </product>
  <product>
    <STOCK_CODE>010006</STOCK_CODE>
    <UPC>705077002246</UPC>
    <Basic_Description>L-GLUTAMINE GL3 POWDER 300GM</Basic_Description>
    <Brand>AST Sports Science</Brand>
    <ProductLine>GL3</ProductLine>
    <ItemName>Micronized L-Glutamine</ItemName>
    <Title>AST Sports Science GL3 Micronized L-Glutamine</Title>
    <Flavor/>
    <Supplier_Number>1</Supplier_Number>
    <Supplier_Name>AST SPORTS SCIENCE</Supplier_Name>
    <Primary_Category>Sport Performance</Primary_Category>
    <General_Category>Supplements</General_Category>
    <WHOLESALE_PRICE>14.97</WHOLESALE_PRICE>
    <RETAIL_PRICE>24.95</RETAIL_PRICE>
    <LIST_DATE>2001-08-25</LIST_DATE>
    <DISC>DISC</DISC>
    <CLT_QOH>No</CLT_QOH>
    <FRE_QOH>No</FRE_QOH>
    <MES_QOH>No</MES_QOH>
    <STR_QOH>No</STR_QOH>
    <WND_QOH>No</WND_QOH>
    <ORL_QOH>No</ORL_QOH>
    <HasNutrition>1</HasNutrition>
    <ValuePreparedCount>0</ValuePreparedCount>
    <Address>120 Capital Drive Golden, CO 80401</Address>
    <Copyright>2002 AST Sports Science, Inc.</Copyright>
    <ItemSize>300</ItemSize>
    <ItemMeasure>g</ItemMeasure>
    <Height>5.8</Height>
    <Width>3.05</Width>
    <Depth>3.05</Depth>
    <ProductWeight>13.1</ProductWeight>
    <MASS>0.763</MASS>
    <ExtendedSize>300 g (10.58 oz)</ExtendedSize>
    <CASE_QUANTITY>12</CASE_QUANTITY>
    <Description>Micronized L-Glutamine</Description>
    <ProductDetails>Dietary Supplement. Pure HPLC tested pharmaceutical grade. Boosts circulating growth hormone levels by over 430%! Glutamine represents over 63% of the amino acid content in the amino acid pool of muscle tissue. Intense training and exercise put greater demands on your body's need for Glutamine, making it a conditionally essential amino acid. Micronized GL3 L-Glutamine represents a leap forward in Glutamine supplementation. GL3 uses a state-of-the-art Particle Micronization Technology (PMT). Each tiny particle of GL3 is 20 times smaller than regular Glutamine powder. This allows for ultra-fast absorption and utilization. Glutamine is the most common amino acid in the body and is key to the metabolism and maintenance of muscle tissue. Glutamine is also the highest concentrated amino acid in the muscle cell and acts as a primary shuttle for nitrogen from the blood stream to inside the muscle cell. Glutamine has a multifaceted role in human nutrition and is essential for the support of muscle tissue and immune function. (Statements made have not been evaluated by the Food and Drug Administration. This product is not intended to diagnose, treat, cure or prevent any disease.)</ProductDetails>
    <Directions>Recommended Use: As a dietary supplement, mix 2 teaspoons (10 grams) in 6-8 ounces of water or juice. Drink 1 serving 30 minutes before training and 1 serving 30 minutes after training.</Directions>
    <Ingredients/>
    <DrugInteractions/>
    <Warnings/>
    <PostDate>2003-06-06</PostDate>
    <HTML>http://www.ast-ss.com</HTML>
    <thumbnail_url>http://www.europadatafeed.com/images/50/705077002246.gif</thumbnail_url>
    <image_url>http://www.europadatafeed.com/images/250/705077002246.jpg</image_url>
    <logo_url>http://www.europadatafeed.com/images/logos/ast.gif</logo_url>
    <image500_URL>http://www.europadatafeed.com/images/500/no_image.jpg</image500_URL>
    <MAP_Price/>
    <image_name>705077002246</image_name>
    <image100_URL>http://www.europadatafeed.com/images/100/705077002246.jpg</image100_URL>
    <NUTRIENTS> </NUTRIENTS>
  </product>
  <product>
    <STOCK_CODE>010030</STOCK_CODE>
    <UPC>705077002123</UPC>
    <Basic_Description>MULTIPRO 32X 100CAPS</Basic_Description>
    <Brand>AST Sports Science</Brand>
    <ProductLine/>
    <ItemName>Multi Pro-32X</ItemName>
    <Title>AST Sports Science Multi Pro-32X</Title>
    <Flavor/>
    <Supplier_Number>1</Supplier_Number>
    <Supplier_Name>AST SPORTS SCIENCE</Supplier_Name>
    <Primary_Category>Vitamins / Minerals</Primary_Category>
    <General_Category>Supplements</General_Category>
    <WHOLESALE_PRICE>14.97</WHOLESALE_PRICE>
    <RETAIL_PRICE>19.95</RETAIL_PRICE>
    <LIST_DATE>2001-08-25</LIST_DATE>
    <DISC/>
    <CLT_QOH>Yes</CLT_QOH>
    <FRE_QOH>Yes</FRE_QOH>
    <MES_QOH>Yes</MES_QOH>
    <STR_QOH>Yes</STR_QOH>
    <WND_QOH>Yes</WND_QOH>
    <ORL_QOH>Yes</ORL_QOH>
    <HasNutrition>1</HasNutrition>
    <ValuePreparedCount>0</ValuePreparedCount>
    <Address>P.O. Box 4327 Evergreen, CO 80437</Address>
    <Copyright>2000 AST Sports Science</Copyright>
    <ItemSize>100</ItemSize>
    <ItemMeasure>ea</ItemMeasure>
    <Height>4.5</Height>
    <Width>2.5</Width>
    <Depth>2.5</Depth>
    <ProductWeight/>
    <MASS>0.48</MASS>
    <ExtendedSize>100 Caplets</ExtendedSize>
    <CASE_QUANTITY>12</CASE_QUANTITY>
    <Description>Multi Pro-32X, 100 Caplets</Description>
    <ProductDetails>The Serious Athlete's Multi-Vitamin Multi-Mineral.</ProductDetails>
    <Directions>Take one caplet twice daily - AM/PM.</Directions>
    <Ingredients>Croscarmellose Sodium, Microcrystalline Cellulose, Magnesium Stearate.</Ingredients>
    <DrugInteractions/>
    <Warnings>Accidental overdose of iron-containing products is a leading cause of fatal poisoning in children under 6. Keep this product out of reach of children. In case of accidental overdose, call a doctor or poison control center immediately.</Warnings>
    <PostDate>2009-08-25</PostDate>
    <HTML>http://www.ast-ss.com</HTML>
    <thumbnail_url>http://www.europadatafeed.com/images/50/705077002123.gif</thumbnail_url>
    <image_url>http://www.europadatafeed.com/images/250/705077002123.jpg</image_url>
    <logo_url>http://www.europadatafeed.com/images/logos/ast.gif</logo_url>
    <image500_URL>http://www.europadatafeed.com/images/500/705077002123.jpg</image500_URL>
    <MAP_Price/>
    <image_name>705077002123</image_name>
    <image100_URL>http://www.europadatafeed.com/images/100/705077002123.jpg</image100_URL>
    <NUTRIENTS> </NUTRIENTS>
  </product>
  <product>
    <STOCK_CODE>010036</STOCK_CODE>
    <UPC>705077002505</UPC>
    <Basic_Description>CLA 1000MG 90 SOFTGELS</Basic_Description>
    <Brand>AST Sports Science</Brand>
    <ProductLine/>
    <ItemName>CLA 1000</ItemName>
    <Title>AST Sports Science CLA 1000</Title>
    <Flavor/>
    <Supplier_Number>1</Supplier_Number>
    <Supplier_Name>AST SPORTS SCIENCE</Supplier_Name>
    <Primary_Category>Sport Performance</Primary_Category>
    <General_Category>Supplements</General_Category>
    <WHOLESALE_PRICE>20.97</WHOLESALE_PRICE>
    <RETAIL_PRICE>29.95</RETAIL_PRICE>
    <LIST_DATE>2001-08-25</LIST_DATE>
    <DISC/>
    <CLT_QOH>Yes</CLT_QOH>
    <FRE_QOH>Yes</FRE_QOH>
    <MES_QOH>Yes</MES_QOH>
    <STR_QOH>Yes</STR_QOH>
    <WND_QOH>Yes</WND_QOH>
    <ORL_QOH>Yes</ORL_QOH>
    <HasNutrition>1</HasNutrition>
    <ValuePreparedCount>0</ValuePreparedCount>
    <Address>120 Capital Drive, Golden, CO 80401</Address>
    <Copyright>2009 AST Sports Science</Copyright>
    <ItemSize>90</ItemSize>
    <ItemMeasure>ea</ItemMeasure>
    <Height>5.0</Height>
    <Width>2.5</Width>
    <Depth>2.5</Depth>
    <ProductWeight/>
    <MASS>0.38</MASS>
    <ExtendedSize>90 Softgel Capsules</ExtendedSize>
    <CASE_QUANTITY>12</CASE_QUANTITY>
    <Description>CLA 1000, 90 Softgel Capsules</Description>
    <ProductDetails>Dietary Supplement. CLA CONJUGATED LINOLEIC ACID is not a stimulant or drug; it is a naturally occurring, unsaturated ('Good') fatty acid that is safe and side effect free. CLA's unique 'engineered lipid' profile is research proven to help reduce body fat while at the same time maintain lean muscle mass. Taken daily, CLA 1000 may promote increased metabolic function supporting lean muscle and reducing body fat. (The statements on this label have not been evaluated by the Food and Drug Administration. This product is not intended to diagnose, treat, cure or prevent any disease.)</ProductDetails>
    <Directions>As a dietary supplement, take 1-2 softgels 2-3 times daily with meals.</Directions>
    <Ingredients>Gelatin, glycerin, purified water, carob.</Ingredients>
    <DrugInteractions/>
    <Warnings>Keep container lightly closed in a cool, dry and dark place. Keep out of the reach of children.</Warnings>
    <PostDate>2010-01-08</PostDate>
    <HTML>http://www.ast-ss.com</HTML>
    <thumbnail_url>http://www.europadatafeed.com/images/50/705077002505.gif</thumbnail_url>
    <image_url>http://www.europadatafeed.com/images/250/705077002505.jpg</image_url>
    <logo_url>http://www.europadatafeed.com/images/logos/ast.gif</logo_url>
    <image500_URL>http://www.europadatafeed.com/images/500/705077002505.jpg</image500_URL>
    <MAP_Price/>
    <image_name>705077002505</image_name>
    <image100_URL>http://www.europadatafeed.com/images/100/705077002505.jpg</image100_URL>
    <NUTRIENTS> </NUTRIENTS>
  </product>
</Etailer_Catalog_xml>

The following is the code that I have to read the data. The XML above is just a small piece of a very large file, so I need to "walk" it with twig_roots to avoid eating up a lot of memory:

#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

my $t= XML::Twig->new(
    # the twig will include just the root and selected titles
    twig_roots => {
        'product/STOCK_CODE'      => \&print_elt_text,
        'product/WHOLESALE_PRICE' => \&print_elt_text,
        'product/MAP_Price'       => \&print_elt_text,
        'product/DISC'            => \&print_elt_text
    }
);
$t->parsefile( './workingfiles/eur/eurfeed.xml');

sub print_elt_text {
    my( $t, $elt)= @_;
    print $elt->text;    # print the text (including sub-element texts)
    $t->purge;           # frees the memory
}

My problem is that I'm unable to access the individual STOCK_CODE, WHOLESALE_PRICE, MAP_Price, and DISC elements within the XML document. Currently, the elements are being fed to the subroutine but are part of $elt and I'm unable to access them independently of each other.

Any direction would be appreciated.

---------------------

I think I figured it out, all I had to do was post here asking how to do it and then it would come to me... /sigh

#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

my $t= XML::Twig->new(
    # the twig will include just the root and selected titles
    twig_roots => { 'product' => \&print_elt_text }
);
$t->parsefile( './workingfiles/eur/eurfeed.xml');

sub print_elt_text {
    my( $t, $elt)= @_;
    my $prodcode        = $elt->first_child( 'STOCK_CODE')->text;
    my $wholesale_price = $elt->first_child( 'WHOLESALE_PRICE')->text;
    my $map             = $elt->first_child( 'MAP_Price')->text;
    my $disc            = $elt->first_child( 'DISC')->text;
    print "Product ID: " . $prodcode . "\n";
    print "Wholesale Price: " . $wholesale_price . "\n";
    print "MAP: " . $map . "\n";
    print "Disconnected: " . $disc . "\n\n";
    $t->purge;    # frees the memory
}

Feel free to review and provide comments.
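
One possible refinement, offered as a sketch rather than a fix for a known bug: first_child( 'X')->text dies if a product happens to lack element X, because first_child returns undef in that case. XML::Twig elements also provide first_child_text, which returns an empty string when the child is missing. The inline XML below is a trimmed-down, hypothetical sample (the second product deliberately has no MAP_Price), and the @fields and @rows names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical, trimmed-down sample; the second product has no MAP_Price at all.
my $xml = <<'XML';
<Etailer_Catalog_xml>
  <product>
    <STOCK_CODE>010003</STOCK_CODE>
    <WHOLESALE_PRICE>17.47</WHOLESALE_PRICE>
    <MAP_Price/>
    <DISC/>
  </product>
  <product>
    <STOCK_CODE>010006</STOCK_CODE>
    <WHOLESALE_PRICE>14.97</WHOLESALE_PRICE>
    <DISC>DISC</DISC>
  </product>
</Etailer_Catalog_xml>
XML

my @fields = qw(STOCK_CODE WHOLESALE_PRICE MAP_Price DISC);
my @rows;    # one hash ref per product

my $t = XML::Twig->new(
    twig_roots => {
        product => sub {
            my ( $t, $elt ) = @_;
            # first_child_text returns '' when the child is missing,
            # whereas first_child(...)->text would die on undef
            my %row = map { ( $_ => $elt->first_child_text($_) ) } @fields;
            push @rows, \%row;
            $t->purge;    # frees the memory used by this product
        },
    },
);
$t->parse($xml);

for my $row (@rows) {
    print join( ' | ', map { "$_=$row->{$_}" } @fields ), "\n";
}
```

For the real feed, parse($xml) would be replaced by the parsefile call from the program above. Collecting every row in @rows is only for demonstration; with a very large file it is probably better to print or store each row inside the handler so that purge actually keeps memory use low.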

Replies are listed 'Best First'.
Re: XML::Twig and twig_roots - Accessing Individual XML Elements
by ikegami (Patriarch) on Mar 23, 2010 at 06:27 UTC

    My problem is that I'm unable to access the individual STOCK_CODE, WHOLESALE_PRICE, MAP_Price, and DISC elements within the XML document.

    That's exactly what your initial code did. The thing is, you don't want them individually, you want to access those elements of each product as a group (judging by your second program). As such, the element you are really interested in is the product element. Since that's exactly what your second program does, it's indeed a perfectly fine solution.

      You're right, thank you for clearing that up, as I'm sure it may have caused some confusion saying that I needed to access those elements when I already was.
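
To illustrate the point made in this sub-thread, here is a minimal sketch (not from the original posts) of what the first program's setup does: with child-level twig_roots, the handler is called once per matching element, and each call sees only that one element, so nothing ties a STOCK_CODE to the WHOLESALE_PRICE of the same product. The inline XML is a trimmed-down, hypothetical sample, and @calls and $record are illustrative names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical, trimmed-down sample of the catalog
my $xml = <<'XML';
<Etailer_Catalog_xml>
  <product>
    <STOCK_CODE>010003</STOCK_CODE>
    <WHOLESALE_PRICE>17.47</WHOLESALE_PRICE>
  </product>
  <product>
    <STOCK_CODE>010006</STOCK_CODE>
    <WHOLESALE_PRICE>14.97</WHOLESALE_PRICE>
  </product>
</Etailer_Catalog_xml>
XML

my @calls;    # one entry per handler invocation

# Shared handler: $elt is a single STOCK_CODE or WHOLESALE_PRICE element
my $record = sub {
    my ( $t, $elt ) = @_;
    push @calls, $elt->tag . '=' . $elt->text;
    $t->purge;    # frees the memory
};

my $t = XML::Twig->new(
    twig_roots => {
        'product/STOCK_CODE'      => $record,
        'product/WHOLESALE_PRICE' => $record,
    },
);
$t->parse($xml);

print "$_\n" for @calls;
```

Each entry in @calls comes from a separate handler call, which is why handling the enclosing product element is the more natural fit when the fields are needed together.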

Re: XML::Twig and twig_roots - Accessing Individual XML Elements
by sierpinski (Chaplain) on Mar 23, 2010 at 12:34 UTC
    Thanks for posting a followup with your solution (congrats on figuring it out -- sometimes you just need to write the problem down to get the creative juices flowin'). It's very helpful for posterity when people post their solutions.

    So frustrating to google/search for a problem only to find a bunch of other people have the same problem, but with no clear solution.

      I completely agree and have felt the frustration of several people asking a question in several different forums and then replying with a, "Thanks, solved." to their own thread.