leighgable has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I am new to programming in general and Perl in particular, and I am afraid I am attempting to walk before I can crawl with a little code project.

I am using XML::Simple to parse search results and, hopefully in the future, to correct for duplicate entries, return a tally of results, and return certain other data as well.

I am hoping one of you will kindly tell me where I am going wrong with my code to access a reference ID in the sample XML file below. When I run my routine on a sample file with two identical records, it works, returning two identical reference paths from the XML results. But when I run it on an XML file with only one result, the program gives me a "not an array reference" error, and when I run it on a search with 100-plus results, I get a "not a hash reference" error.

Here is the code:
#!/usr/bin/perl

# turn on perl safety features
use strict;
use warnings;

# work out the name of the argument we're looking for
my $file_in = $ARGV[0] or die "Must specify file on command line";

# Use module
use XML::Simple;
use Data::Dumper;

# create object
my $xml = new XML::Simple (KeyAttr=>[]);

# read XML file
my $data = $xml->XMLin($file_in);

# declare array node variable
my ($e);

# dereference hash ref
# access reference array
foreach $e (@{$data->{ppsarticleresponse}->{ppsarticleresultset}->{ppsarticle}->{article}}) {
    print $e->{reference}, "\n";
}

exit(0);

And here is a sample of the XML that XML::Simple is parsing for me:
<?xml version="1.0" encoding="utf-8"?>
<ppsresponse>
  <ppsarticleresponse xmlns="http://global.factiva.com.ezproxy.insead.edu/pps/2.0">
    <ppsarticleresultset count="133">
      <ppsarticle>
        <article>
          <accessionno>MTPW000020090731e57v004mr</accessionno>
          <reference>distdoc:archive/ArchiveDoc::Article/MTPW000020090731e57v004mr</reference>
          <baselanguage>EN</baselanguage>
          <copyright>(c) 2009 M2 Communications, Ltd. All Rights Reserved. </copyright>
          <headline>
            <paragraph display="Proportional" truncation="None" lang="EN">Anadys Pharmaceuticals, Inc (NASDAQ:ANDS) is the Highest Percentage Gainers Among NASDAQ Stocks During Morning Trading Hours; Microsoft Corporation (NASDAQ:<hlt>MSFT</hlt>) And Orthofix International NV (NASDAQ:OFIX) Round Out Top Three Percentage Gainers During Morning Trading Hours</paragraph>
          </headline>
          <leadparagraph>
            <paragraph display="Proportional" truncation="None">Dallas, TX - LiquidTycoon.com is pleased to alert investors of stocks on the move.</paragraph>
            <paragraph display="Proportional" truncation="None">Anadys Pharmaceuticals, Inc (NASDAQ:ANDS) is among the gainers on NASDAQ in the early trade hours. The stock is up 60.0% to $2.88 with over 1.63 million shares being traded within few minutes of trade. On July 30, 2009, Anadys Pharmaceuticals, Inc announced the finalization of the protocol for the Company's Phase II trial of ANA598 in combination with pegylated interferon-alpha and ribavirin in hepatitis C patients. Anadys Pharmaceuticals, Inc is a biopharmaceutical company focused to develop medicines in the areas of hepatitis C and oncology. The Company is developing ANA598, a small-molecule, non-nucleoside inhibitor of the NS5B polymerase for the treatment of hepatitis C and ANA773, an oral Toll-like receptor 7 (TLR7) agonist prodrug for the treatment of hepatitis C and cancer.</paragraph>
          </leadparagraph>
          <publicationdate>
            <date>2009-07-31</date>
          </publicationdate>
          <sourcename>M2 Presswire</sourcename>
          <company code="oficks">
            <name>Orthofix International N.V.</name>
            <newsmentions>0</newsmentions>
            <newshits>0</newshits>
          </company>
          <company code="scrptg">
            <name>Anadys Pharmaceuticals Inc</name>
            <newsmentions>0</newsmentions>
            <newshits>0</newshits>
          </company>
          <company code="mcrost">
            <name>Microsoft Corporation</name>
            <newsmentions>0</newsmentions>
            <newshits>0</newshits>
          </company>
          <industry code="i3302"> <name>Computers/Electronics</name> </industry>
          <industry code="i3302021"> <name>Applications Software</name> </industry>
          <industry code="i257"> <name>Pharmaceuticals</name> </industry>
          <industry code="i330202"> <name>Software</name> </industry>
          <industry code="icomp"> <name>Computing</name> </industry>
          <industry code="i3302020"> <name>Systems Software</name> </industry>
          <industry code="i372"> <name>Medical Equipment/Supplies</name> </industry>
          <industry code="i951"> <name>Health Care</name> </industry>
          <industry code="iphmed"> <name>Medical/Surgical Instruments/Apparatus/Devices</name> </industry>
          <region code="usa"> <name>United States</name> </region>
          <region code="namz"> <name>North American Countries/Regions</name> </region>
          <newssubject code="c42" position="0"> <name>Labor/Personnel Issues</name> </newssubject>
          <newssubject code="ghepat" position="0"> <name>Hepatitis</name> </newssubject>
          <newssubject code="mstock" position="0"> <name>Stock Exchanges</name> </newssubject>
          <newssubject code="npress" position="0"> <name>Press Release</name> </newssubject>
          <newssubject code="ccat" position="0"> <name>Corporate/Industrial News</name> </newssubject>
          <newssubject code="gcat" position="0"> <name>Political/General News</name> </newssubject>
          <newssubject code="ghea" position="0"> <name>Health</name> </newssubject>
          <newssubject code="gmed" position="0"> <name>Medical Conditions</name> </newssubject>
          <newssubject code="m11" position="0"> <name>Equity Markets</name> </newssubject>
          <newssubject code="mcat" position="0"> <name>Commodity/Financial Market News</name> </newssubject>
          <newssubject code="ncat" position="0"> <name>Content Types</name> </newssubject>
          <newssubject code="nfact" position="0"> <name>Factiva Filters</name> </newssubject>
          <newssubject code="nfce" position="0"> <name>FC&amp;E Exclusion Filter</name> </newssubject>
          <newssubject code="nfcpin" position="0"> <name>FC&amp;E Industry News Filter</name> </newssubject>
          <sourcecode>MTPW</sourcecode>
          <tailparagraphs>
            <paragraph display="Proportional" truncation="None">Microsoft Corporation (NASDAQ:<hlt>MSFT</hlt>) is gaining momentum in the early morning trading hours. The stock moved up 0.67% to $23.98 on a thinly traded volume of 94k shares within few minutes of trade. On July 30, 2009, Microsoft Corp announced that it is collaborating with comScore Inc to develop a digital media planning solution, named the Reach and Frequency Planner (RF Planner), which will allow brand advertisers to predict reach, frequency and audience composition at the ad placement level. Microsoft Corporation develops, manufactures, licenses and supports a range of software products for computing devices. The Company's software products include operating systems for servers, personal computers and intelligent devices, server applications for distributed computing environments, information worker productivity applications, business solution applications, high-performance computing applications and software development tools and video games.</paragraph>
            <paragraph display="Proportional" truncation="None">Orthofix International NV (NASDAQ:OFIX) climbed over 20.0% to $28.85 on a low traded volume of 94k shares. The 52 week range for the stock is $30.47 and $8.65. On July 30, 2009, Orthofix International N.V. announced that based on its results for the first half of 2009, the Company reaffirmed its fiscal 2009 guidance. Orthofix International N.V. is medical device company offering a line of surgical and non-surgical products for the spine, orthopedics, sports medicine and vascular market sectors. Its products are designed to address the lifelong bone-and-joint health needs of patients of all ages, and to help them achieve a more active and mobile lifestyle.</paragraph>
            <paragraph display="Proportional" truncation="None">Liquid Tycoon alerts its members on stocks that could generate higher than average returns. These include stocks with huge volume, penny stocks that are moving to the upside quickly, penny stocks with buy signals and companies with news. Our alerts are well known for producing incredible results in a short amount of time and our members have made outstanding profits of over 100%.</paragraph>
            <paragraph display="Proportional" truncation="None">ABOUT Liquid Tycoon</paragraph>
            <paragraph display="Proportional" truncation="None">LiquidTycoon.com is a leading stock web site that provides free alerts on stocks that are poised to make big gains. LiquidTycoon.com also tracks small cap penny stocks that could be on the brink of a massive breakout. To feature a company on our web site please contact us at info@LiquidTycoon.com</paragraph>
            <paragraph display="Proportional" truncation="None">Liquid Tycoon is an independent electronic publication that provides information on selected publicly traded companies. Liquid Tycoon is not a registered investment advisor or broker-dealer. Liquid Tycoon affiliates, officers, directors and employees may buy and sell additional shares in any company mentioned herein and may profit in the event those shares rise in value. Please do your own Due Diligence before investing in any of the stocks mentioned above.</paragraph>
            <paragraph display="Proportional" truncation="None">M2 Communications disclaims all liability for information provided within M2 PressWIRE. Data prepared by named party/parties. Further information on M2 PressWIRE can be obtained at <elink type="webpage" reference="http://www.presswire.net">http://www.presswire.net</elink> on the world wide web. Inquiries to info@m2.com.</paragraph>
          </tailparagraphs>
          <contact>Liquid Tycoon | e-mail: info@LiquidTycoon.com | Tel: +1 214 556 6798 </contact>
          <logo source="http://logos.factiva.com.ezproxy.insead.edu" image="mtpwLogo.gif"></logo>
          <wordcount>679</wordcount>
        </article>
        <article>
          <accessionno>MTPW000020090731e57v004mr</accessionno>
          <reference>distdoc:archive/ArchiveDoc::Article/MTPW000020090731e57v004mr</reference>
          <baselanguage>EN</baselanguage>
          <copyright>(c) 2009 M2 Communications, Ltd. All Rights Reserved. </copyright>
          <headline>
            <paragraph display="Proportional" truncation="None" lang="EN">Anadys Pharmaceuticals, Inc (NASDAQ:ANDS) is the Highest Percentage Gainers Among NASDAQ Stocks During Morning Trading Hours; Microsoft Corporation (NASDAQ:<hlt>MSFT</hlt>) And Orthofix International NV (NASDAQ:OFIX) Round Out Top Three Percentage Gainers During Morning Trading Hours</paragraph>
          </headline>
          <leadparagraph>
            <paragraph display="Proportional" truncation="None">Dallas, TX - LiquidTycoon.com is pleased to alert investors of stocks on the move.</paragraph>
            <paragraph display="Proportional" truncation="None">Anadys Pharmaceuticals, Inc (NASDAQ:ANDS) is among the gainers on NASDAQ in the early trade hours. The stock is up 60.0% to $2.88 with over 1.63 million shares being traded within few minutes of trade. On July 30, 2009, Anadys Pharmaceuticals, Inc announced the finalization of the protocol for the Company's Phase II trial of ANA598 in combination with pegylated interferon-alpha and ribavirin in hepatitis C patients. Anadys Pharmaceuticals, Inc is a biopharmaceutical company focused to develop medicines in the areas of hepatitis C and oncology. The Company is developing ANA598, a small-molecule, non-nucleoside inhibitor of the NS5B polymerase for the treatment of hepatitis C and ANA773, an oral Toll-like receptor 7 (TLR7) agonist prodrug for the treatment of hepatitis C and cancer.</paragraph>
          </leadparagraph>
          <publicationdate>
            <date>2009-07-31</date>
          </publicationdate>
          <sourcename>M2 Presswire</sourcename>
          <company code="oficks">
            <name>Orthofix International N.V.</name>
            <newsmentions>0</newsmentions>
            <newshits>0</newshits>
          </company>
          <company code="scrptg">
            <name>Anadys Pharmaceuticals Inc</name>
            <newsmentions>0</newsmentions>
            <newshits>0</newshits>
          </company>
          <company code="mcrost">
            <name>Microsoft Corporation</name>
            <newsmentions>0</newsmentions>
            <newshits>0</newshits>
          </company>
          <industry code="i3302"> <name>Computers/Electronics</name> </industry>
          <industry code="i3302021"> <name>Applications Software</name> </industry>
          <industry code="i257"> <name>Pharmaceuticals</name> </industry>
          <industry code="i330202"> <name>Software</name> </industry>
          <industry code="icomp"> <name>Computing</name> </industry>
          <industry code="i3302020"> <name>Systems Software</name> </industry>
          <industry code="i372"> <name>Medical Equipment/Supplies</name> </industry>
          <industry code="i951"> <name>Health Care</name> </industry>
          <industry code="iphmed"> <name>Medical/Surgical Instruments/Apparatus/Devices</name> </industry>
          <region code="usa"> <name>United States</name> </region>
          <region code="namz"> <name>North American Countries/Regions</name> </region>
          <newssubject code="c42" position="0"> <name>Labor/Personnel Issues</name> </newssubject>
          <newssubject code="ghepat" position="0"> <name>Hepatitis</name> </newssubject>
          <newssubject code="mstock" position="0"> <name>Stock Exchanges</name> </newssubject>
          <newssubject code="npress" position="0"> <name>Press Release</name> </newssubject>
          <newssubject code="ccat" position="0"> <name>Corporate/Industrial News</name> </newssubject>
          <newssubject code="gcat" position="0"> <name>Political/General News</name> </newssubject>
          <newssubject code="ghea" position="0"> <name>Health</name> </newssubject>
          <newssubject code="gmed" position="0"> <name>Medical Conditions</name> </newssubject>
          <newssubject code="m11" position="0"> <name>Equity Markets</name> </newssubject>
          <newssubject code="mcat" position="0"> <name>Commodity/Financial Market News</name> </newssubject>
          <newssubject code="ncat" position="0"> <name>Content Types</name> </newssubject>
          <newssubject code="nfact" position="0"> <name>Factiva Filters</name> </newssubject>
          <newssubject code="nfce" position="0"> <name>FC&amp;E Exclusion Filter</name> </newssubject>
          <newssubject code="nfcpin" position="0"> <name>FC&amp;E Industry News Filter</name> </newssubject>
          <sourcecode>MTPW</sourcecode>
          <tailparagraphs>
            <paragraph display="Proportional" truncation="None">Microsoft Corporation (NASDAQ:<hlt>MSFT</hlt>) is gaining momentum in the early morning trading hours. The stock moved up 0.67% to $23.98 on a thinly traded volume of 94k shares within few minutes of trade. On July 30, 2009, Microsoft Corp announced that it is collaborating with comScore Inc to develop a digital media planning solution, named the Reach and Frequency Planner (RF Planner), which will allow brand advertisers to predict reach, frequency and audience composition at the ad placement level. Microsoft Corporation develops, manufactures, licenses and supports a range of software products for computing devices. The Company's software products include operating systems for servers, personal computers and intelligent devices, server applications for distributed computing environments, information worker productivity applications, business solution applications, high-performance computing applications and software development tools and video games.</paragraph>
            <paragraph display="Proportional" truncation="None">Orthofix International NV (NASDAQ:OFIX) climbed over 20.0% to $28.85 on a low traded volume of 94k shares. The 52 week range for the stock is $30.47 and $8.65. On July 30, 2009, Orthofix International N.V. announced that based on its results for the first half of 2009, the Company reaffirmed its fiscal 2009 guidance. Orthofix International N.V. is medical device company offering a line of surgical and non-surgical products for the spine, orthopedics, sports medicine and vascular market sectors. Its products are designed to address the lifelong bone-and-joint health needs of patients of all ages, and to help them achieve a more active and mobile lifestyle.</paragraph>
            <paragraph display="Proportional" truncation="None">Liquid Tycoon alerts its members on stocks that could generate higher than average returns. These include stocks with huge volume, penny stocks that are moving to the upside quickly, penny stocks with buy signals and companies with news. Our alerts are well known for producing incredible results in a short amount of time and our members have made outstanding profits of over 100%.</paragraph>
            <paragraph display="Proportional" truncation="None">ABOUT Liquid Tycoon</paragraph>
            <paragraph display="Proportional" truncation="None">LiquidTycoon.com is a leading stock web site that provides free alerts on stocks that are poised to make big gains. LiquidTycoon.com also tracks small cap penny stocks that could be on the brink of a massive breakout. To feature a company on our web site please contact us at info@LiquidTycoon.com</paragraph>
            <paragraph display="Proportional" truncation="None">Liquid Tycoon is an independent electronic publication that provides information on selected publicly traded companies. Liquid Tycoon is not a registered investment advisor or broker-dealer. Liquid Tycoon affiliates, officers, directors and employees may buy and sell additional shares in any company mentioned herein and may profit in the event those shares rise in value. Please do your own Due Diligence before investing in any of the stocks mentioned above.</paragraph>
            <paragraph display="Proportional" truncation="None">M2 Communications disclaims all liability for information provided within M2 PressWIRE. Data prepared by named party/parties. Further information on M2 PressWIRE can be obtained at <elink type="webpage" reference="http://www.presswire.net">http://www.presswire.net</elink> on the world wide web. Inquiries to info@m2.com.</paragraph>
          </tailparagraphs>
          <contact>Liquid Tycoon | e-mail: info@LiquidTycoon.com | Tel: +1 214 556 6798 </contact>
          <logo source="http://logos.factiva.com.ezproxy.insead.edu" image="mtpwLogo.gif"></logo>
          <wordcount>679</wordcount>
        </article>
      </ppsarticle>
    </ppsarticleresultset>
  </ppsarticleresponse>
</ppsresponse>

Replies are listed 'Best First'.
Re: Parsing XML: Not an Array/Hash Reference Error
by toolic (Bishop) on Aug 14, 2009 at 15:14 UTC
    If I use XML::Twig, this code works if I have 1 'article' element, or 2 or 3 (I didn't try more than 3, but I believe it will work for 100). The output below shows the case with 2 articles (which is your XML example):
    use strict;
    use warnings;
    use XML::Twig;

    my $file_in = shift;
    my $twig= new XML::Twig( twig_handlers => { reference => \&reference } );
    $twig->parsefile($file_in);
    exit;

    sub reference {
        my ($twig, $refer) = @_;
        print $refer->text(), "\n";
    }

    __END__
    distdoc:archive/ArchiveDoc::Article/MTPW000020090731e57v004mr
    distdoc:archive/ArchiveDoc::Article/MTPW000020090731e57v004mr
      Hello Toolic,

      I am going through the XML::Twig documentation now. I think it's exactly what I need for my project.

      Your program is my first look at the parameter array. It's really cool what you did!

      Thanks, Leigh
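The original question also mentions wanting to correct for duplicate entries and tally results. As a rough sketch of how the XML::Twig handler approach might be extended for that, the handler below counts each reference in a hash instead of printing it. The inline XML, the `distdoc:example-N` values, and the variable names `%seen`, `@order`, and `$total` are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical cut-down input: three articles, two sharing one reference.
my $xml_text = <<'XML';
<ppsarticle>
  <article><reference>distdoc:example-1</reference></article>
  <article><reference>distdoc:example-1</reference></article>
  <article><reference>distdoc:example-2</reference></article>
</ppsarticle>
XML

my %seen;       # reference text => number of times it appeared
my @order;      # unique references, in first-seen order
my $total = 0;  # every <reference> element encountered

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each <reference> element, however many there are.
        reference => sub {
            my ($t, $elt) = @_;
            my $ref = $elt->text;
            $total++;
            push @order, $ref unless $seen{$ref}++;
        },
    },
);
$twig->parse($xml_text);    # parse() takes a string; parsefile() takes a file name

printf "%d unique of %d total\n", scalar @order, $total;
print "$_ (seen $seen{$_} time(s))\n" for @order;
```

Because the handler fires per element, the same code path runs whether the file holds one article or many, which is what sidesteps the hash-versus-array problem.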
Re: Parsing XML: Not an Array/Hash Reference Error
by SuicideJunkie (Vicar) on Aug 14, 2009 at 14:42 UTC
    1. use Data::Dumper; at the top
    2. Just before the line where the error appears, print Dumper $variable;
    This will let you see exactly what you have in hand. If necessary, back up to where you assign that variable and dumper the source's contents.

    Once you can see what you're working with, it is MUCH easier to navigate the reference tree.
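As a small illustration of this advice, the sketch below dumps the node that the `foreach` in the question tries to dereference, once for a one-article input and once for a two-article input. The `sample_xml` helper and the `distdoc:example-N` values are hypothetical stand-ins for the real search results; the hash path is the one from the original script. The point is to inspect the expression itself before the loop, since the loop variable is never assigned when the dereference fails.

```perl
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easier to compare

# Hypothetical helper: build a result set with $count <article> elements.
sub sample_xml {
    my ($count) = @_;
    my $articles = join '',
        map { "<article><reference>distdoc:example-$_</reference></article>" } 1 .. $count;
    return "<ppsresponse><ppsarticleresponse><ppsarticleresultset count=\"$count\">"
         . "<ppsarticle>$articles</ppsarticle>"
         . "</ppsarticleresultset></ppsarticleresponse></ppsresponse>";
}

my $xs = XML::Simple->new(KeyAttr => []);
my %kind;    # article count => reference type found at the 'article' key

for my $count (1, 2) {
    my $data = $xs->XMLin(sample_xml($count));

    # Take the node out of the loop header so it can be inspected first.
    my $node = $data->{ppsarticleresponse}{ppsarticleresultset}{ppsarticle}{article};

    $kind{$count} = ref $node;
    print "$count article(s): ", ref($node), " reference\n";
    print Dumper($node);
}
```

With one article the node should come back as a hash reference, and with two as an array reference, which matches the "not an array reference" error reported for single-result files.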

      Hello SJ,

      I should have mentioned in my original post that I was getting the error in the foreach loop. Do you have a suggestion for where I could call Dumper to diagnose the problem?

      If I call the node variable $e before the foreach loop it's undefined, but if I call it within, the program fails before Dumper can return a value.

      All the best, Leigh
Re: Parsing XML: Not an Array/Hash Reference Error
by Jenda (Abbot) on Aug 15, 2009 at 13:22 UTC
      Thanks Jenda,

      That must be the problem--the same repeating tag is represented as different reference types.

      Regards, Leigh
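One common way to make XML::Simple represent a repeating tag consistently is its `ForceArray` option, which folds the named elements into arrays even when only one is present. The sketch below is an assumption about how that could be applied to the structure in the question, not the fix anyone in the thread posted. The `references` helper and the inline test documents are hypothetical, and forcing `ppsarticle` as well as `article` is a guess at what the 100-plus-result files look like.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical helper: return every <reference> value found in the XML text.
sub references {
    my ($xml_text) = @_;

    my $data = XML::Simple->new(
        KeyAttr    => [],
        # Always fold these elements into array refs, even when there is only one.
        ForceArray => [ 'ppsarticle', 'article' ],
    )->XMLin($xml_text);

    my @refs;
    my $groups = $data->{ppsarticleresponse}{ppsarticleresultset}{ppsarticle} || [];
    for my $group (@$groups) {
        for my $article (@{ $group->{article} || [] }) {
            push @refs, $article->{reference};
        }
    }
    return @refs;
}

# Hypothetical inputs: one article, then two articles.
my $one = '<ppsresponse><ppsarticleresponse><ppsarticleresultset count="1"><ppsarticle>'
        . '<article><reference>distdoc:example-1</reference></article>'
        . '</ppsarticle></ppsarticleresultset></ppsarticleresponse></ppsresponse>';
my $two = '<ppsresponse><ppsarticleresponse><ppsarticleresultset count="2"><ppsarticle>'
        . '<article><reference>distdoc:example-1</reference></article>'
        . '<article><reference>distdoc:example-2</reference></article>'
        . '</ppsarticle></ppsarticleresultset></ppsarticleresponse></ppsresponse>';

my @from_one = references($one);
my @from_two = references($two);
print scalar(@from_one), " reference(s) from the one-article input\n";
print scalar(@from_two), " reference(s) from the two-article input\n";
```

Listing element names in `ForceArray` rather than setting it to 1 keeps leaf values such as `reference` as plain strings, so the loop body from the question can stay as it is.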