in reply to Perl & Simple::XML

Hello again! Everything works normally now. My code is:
#!/usr/bin/perl

# use module
use XML::Simple;
use Data::Dumper;

# create object
$xml = new XML::Simple (KeyAttr=>[]);

# read XML file
$data = $xml->XMLin("data.xml");

# dereference hash ref
#print Dumper($data);

print $data->{PMID}, "\n";
print $data->{Article}->{ArticleTitle}, "\n";

foreach $e (@{$data->{Article}->{AuthorList}->{Author}})
{
    $authors.= $e->{LastName}." ".$e->{Initials}.', ';
}

print $data->{Article}->{Journal}->{ISOAbbreviation}, " ";
print $data->{Article}->{Journal}->{JournalIssue}->{PubDate}->{Year}, ";";
print $data->{Article}->{Journal}->{JournalIssue}->{Volume}, ":";
print $data->{Article}->{Pagination}->{MedlinePgn}, ".";
print "\n";
It gets all the details I need. But there is a problem when I try to put two XML records together in one file, like:
<?xml version='1.0'?>
<PubmedArticle>
    <PMID>1766380</PMID>
    <Article PubModel="Print">
        <Journal>
            <JournalIssue CitedMedium="Print">
                <Volume>5</Volume>
                <Issue>9</Issue>
                <PubDate>
                    <Year>1991</Year>
                    <Month>Sep</Month>
                </PubDate>
            </JournalIssue>
            <ISOAbbreviation>Mol. Microbiol.</ISOAbbreviation>
        </Journal>
        <ArticleTitle>PhoP/PhoQ: macrophage-specific modulators of Salmonella virulence?</ArticleTitle>
        <Pagination>
            <MedlinePgn>2073-8</MedlinePgn>
        </Pagination>
        <AuthorList CompleteYN="Y">
            <Author ValidYN="Y">
                <LastName>Miller</LastName>
                <ForeName>S I</ForeName>
                <Initials>SI</Initials>
            </Author>
            <Author ValidYN="Y">
                <LastName>Tsirigos</LastName>
                <ForeName>K T</ForeName>
                <Initials>KT</Initials>
            </Author>
            <Author ValidYN="Y">
                <LastName>Dinous</LastName>
                <ForeName>A E</ForeName>
                <Initials>AE</Initials>
            </Author>
        </AuthorList>
    </Article>
</PubmedArticle>

<PubmedArticle>
    <MedlineCitation Owner="NLM" Status="MEDLINE">
        <PMID>16039843</PMID>
        <DateCreated>
            <Year>2005</Year>
            <Month>08</Month>
            <Day>01</Day>
        </DateCreated>
        <DateCompleted>
            <Year>2005</Year>
            <Month>12</Month>
            <Day>08</Day>
        </DateCompleted>
        <DateRevised>
            <Year>2006</Year>
            <Month>11</Month>
            <Day>15</Day>
        </DateRevised>
        <Article PubModel="Print">
            <Journal>
                <ISSN IssnType="Print">0959-440X</ISSN>
                <JournalIssue CitedMedium="Print">
                    <Volume>15</Volume>
                    <Issue>4</Issue>
                    <PubDate>
                        <Year>2005</Year>
                        <Month>Aug</Month>
                    </PubDate>
                </JournalIssue>
                <Title>Current opinion in structural biology</Title>
                <ISOAbbreviation>Curr. Opin. Struct. Biol.</ISOAbbreviation>
            </Journal>
            <ArticleTitle>TonB-dependent outer membrane transport: going for Baroque?</ArticleTitle>
            <Pagination>
                <MedlinePgn>394-400</MedlinePgn>
            </Pagination>
            <Abstract>
                <AbstractText>The import of essential organometallic micronutrients (such as iron-siderophores and vitamin B(12)) across the outer membrane of Gram-negative bacteria proceeds via TonB-dependent outer membrane transporters (TBDTs). The TBDT couples to the TonB protein, which is part of a multiprotein complex in the plasma (inner) membrane. Five crystal structures of TBDTs illustrate clearly the architecture of the protein in energy-independent substrate-free and substrate-bound states. In each of the TBDT structures, an N-terminal hatch (or plug or cork) domain occludes the lumen of a 22-stranded beta barrel. The manner by which substrate passes through the transporter (the "hatch-barrel problem") is currently unknown. Solution NMR and X-ray crystallographic structures of various TonB domains indicate a striking structural plasticity of this protein. Thermodynamic, biochemical and bacteriological studies of TonB and TBDTs indicate further that existing structures do not yet capture critical energy-dependent and in vivo conformations of the transport cycle. The reconciliation of structural and non-structural experimental data, and the unambiguous experimental elucidation of a detailed molecular mechanism of transport are current challenges for this field.</AbstractText>
            </Abstract>
            <Affiliation>Department of Molecular Physiology and Biological Physics, University of Virginia, PO Box 800736, Charlottesville, VA 22908-0736, USA. mwiener@virginia.edu</Affiliation>
            <AuthorList CompleteYN="Y">
                <Author ValidYN="Y">
                    <LastName>Wiener</LastName>
                    <ForeName>Michael C</ForeName>
                    <Initials>MC</Initials>
                </Author>
            </AuthorList>
            <Language>eng</Language>
            <GrantList CompleteYN="Y">
                <Grant>
                    <GrantID>DK 59999</GrantID>
                    <Acronym>DK</Acronym>
                    <Agency>NIDDK</Agency>
                </Grant>
            </GrantList>
            <PublicationTypeList>
                <PublicationType>Journal Article</PublicationType>
                <PublicationType>Research Support, N.I.H., Extramural</PublicationType>
                <PublicationType>Research Support, U.S. Gov't, P.H.S.</PublicationType>
                <PublicationType>Review</PublicationType>
            </PublicationTypeList>
        </Article>
        <MedlineJournalInfo>
            <Country>England</Country>
            <MedlineTA>Curr Opin Struct Biol</MedlineTA>
            <NlmUniqueID>9107784</NlmUniqueID>
        </MedlineJournalInfo>
        <ChemicalList>
            <Chemical>
                <RegistryNumber>0</RegistryNumber>
                <NameOfSubstance>Bacterial Outer Membrane Proteins</NameOfSubstance>
            </Chemical>
            <Chemical>
                <RegistryNumber>0</RegistryNumber>
                <NameOfSubstance>Bacterial Proteins</NameOfSubstance>
            </Chemical>
            <Chemical>
                <RegistryNumber>0</RegistryNumber>
                <NameOfSubstance>Membrane Proteins</NameOfSubstance>
            </Chemical>
            <Chemical>
                <RegistryNumber>0</RegistryNumber>
                <NameOfSubstance>Multiprotein Complexes</NameOfSubstance>
            </Chemical>
            <Chemical>
                <RegistryNumber>0</RegistryNumber>
                <NameOfSubstance>tonB protein, Bacteria</NameOfSubstance>
            </Chemical>
        </ChemicalList>
        <CitationSubset>IM</CitationSubset>
        <MeshHeadingList>
            <MeshHeading>
                <DescriptorName MajorTopicYN="Y">Bacterial Outer Membrane Proteins</DescriptorName>
                <QualifierName MajorTopicYN="N">chemistry</QualifierName>
                <QualifierName MajorTopicYN="N">metabolism</QualifierName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="Y">Bacterial Proteins</DescriptorName>
                <QualifierName MajorTopicYN="N">chemistry</QualifierName>
                <QualifierName MajorTopicYN="N">metabolism</QualifierName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Biological Transport</DescriptorName>
                <QualifierName MajorTopicYN="N">physiology</QualifierName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Crystallography, X-Ray</DescriptorName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="Y">Membrane Proteins</DescriptorName>
                <QualifierName MajorTopicYN="N">chemistry</QualifierName>
                <QualifierName MajorTopicYN="N">metabolism</QualifierName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Models, Molecular</DescriptorName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Multiprotein Complexes</DescriptorName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="Y">Protein Conformation</DescriptorName>
            </MeshHeading>
        </MeshHeadingList>
        <NumberOfReferences>38</NumberOfReferences>
    </MedlineCitation>
    <PubmedData>
        <History>
            <PubMedPubDate PubStatus="received">
                <Year>2005</Year>
                <Month>6</Month>
                <Day>7</Day>
            </PubMedPubDate>
            <PubMedPubDate PubStatus="revised">
                <Year>2005</Year>
                <Month>6</Month>
                <Day>18</Day>
            </PubMedPubDate>
            <PubMedPubDate PubStatus="accepted">
                <Year>2005</Year>
                <Month>7</Month>
                <Day>8</Day>
            </PubMedPubDate>
            <PubMedPubDate PubStatus="pubmed">
                <Year>2005</Year>
                <Month>7</Month>
                <Day>26</Day>
                <Hour>9</Hour>
                <Minute>0</Minute>
            </PubMedPubDate>
            <PubMedPubDate PubStatus="medline">
                <Year>2005</Year>
                <Month>12</Month>
                <Day>13</Day>
                <Hour>9</Hour>
                <Minute>0</Minute>
            </PubMedPubDate>
        </History>
        <PublicationStatus>ppublish</PublicationStatus>
        <ArticleIdList>
            <ArticleId IdType="pii">S0959-440X(05)00124-7</ArticleId>
            <ArticleId IdType="doi">10.1016/j.sbi.2005.07.001</ArticleId>
            <ArticleId IdType="pubmed">16039843</ArticleId>
        </ArticleIdList>
    </PubmedData>
</PubmedArticle>
I get the error: Only Comments, PIs and whitespace allowed at end of document [Ln: 40, Col: 1]
Line 40 is the line where the second <PubmedArticle> element begins. Do I have any mistake?

Re^2: Perl & Simple::XML--->SOS
by erroneousBollock (Curate) on Oct 06, 2007 at 08:55 UTC
    Do I have any mistake?
    I don't think you can have two root nodes in an XML document; a well-formed document has exactly one root element.

    -David
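A minimal sketch of the single-root point, assuming the two records are wrapped in one enclosing element. The wrapper name <PubmedArticleSet> and the trimmed-down records are illustrative assumptions, not the poster's actual file; `XMLin`, `KeyAttr` and `ForceArray` are standard XML::Simple names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Two trimmed-down records inside ONE root element.
# The wrapper name <PubmedArticleSet> is an assumption for this sketch.
my $xml_text = <<'XML';
<?xml version='1.0'?>
<PubmedArticleSet>
  <PubmedArticle>
    <PMID>1766380</PMID>
  </PubmedArticle>
  <PubmedArticle>
    <PMID>16039843</PMID>
  </PubmedArticle>
</PubmedArticleSet>
XML

# ForceArray keeps <PubmedArticle> as an array ref even when only one
# record is present, so the loop below works for any record count.
my $data = XMLin($xml_text, KeyAttr => [], ForceArray => ['PubmedArticle']);

# XMLin drops the root element by default, so the records sit directly
# under $data->{PubmedArticle}.
my @pmids = map { $_->{PMID} } @{ $data->{PubmedArticle} };
print "$_\n" for @pmids;
```

With a single root in place the parser no longer sees a second top-level element, so the "end of document" error should not be triggered.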

      No, thanks, it was a silly typo... :) It works OK now, but the problem with the authors remains.
      In particular, the above code (which I wrote based on your help) works fine when I have more than one author. But if I have only one, it says: Not an ARRAY reference at read_xml.pl line 19 (line 19 is the foreach loop).
      So I was wondering: is there a way of finding out whether I have one or more authors in my XML entry? Thank you all for your time!
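One way to approach this, sketched under the assumption that XML::Simple returns a hash ref for a single <Author> and an array ref for several (its behaviour when `ForceArray` is not set): check the value with `ref` before dereferencing. The helper names `author_list` and `format_authors` are hypothetical, and the data below is hand-written to stand in for what `XMLin` would return.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the authors as a plain list, whether the node is a hash ref
# (one <Author>) or an array ref (several <Author> elements).
sub author_list {
    my ($node) = @_;
    return ()      unless defined $node;
    return @$node  if ref $node eq 'ARRAY';
    return ($node);
}

# Build the "LastName Initials, LastName Initials" string.
sub format_authors {
    my ($node) = @_;
    return join ', ',
        map { "$_->{LastName} $_->{Initials}" } author_list($node);
}

# Hand-written stand-ins for $data->{Article}->{AuthorList}->{Author}.
my $one  = { LastName => 'Wiener', Initials => 'MC' };
my $many = [
    { LastName => 'Miller',   Initials => 'SI' },
    { LastName => 'Tsirigos', Initials => 'KT' },
];

print format_authors($one),  "\n";
print format_authors($many), "\n";
```

An alternative that avoids the check is to pass `ForceArray => ['Author']` when constructing the parser, so that Author is always an array ref and the original foreach loop works unchanged.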