in reply to Re^2: problems with Xpath
in thread problems with Xpath
Also changed your code to cycle through the LabelValue nodes and call getAttribute on them. The XML I used as nci.xml:

```xml
<NCI_PID_XML>
  <Ontology>
    <LabelType name="function" id="8">
      <LabelValueList>
        <LabelValue name="acyltransferase activity" id="8007" parent_idref="8000" GO="GO:0008415" />
        <LabelValue name="calcium- and calmodulin-responsive adenylate cyclase activity" id="8008" parent_idref="8000" GO="GO:0008294" />
        <LabelValue name="casein kinase I activity" id="11900" parent_idref="8000" GO="GO:0004681" />
        <LabelValue name="casein kinase activity" id="9634" parent_idref="8000" GO="GO:0004680" />
        <LabelValue name="function" id="75" parent_idref="75" />
        <LabelValue name="guanylate cyclase activity" id="8009" parent_idref="8000" GO="GO:0004383" />
        <LabelValue name="interleukin-12 receptor activity" id="8015" parent_idref="8000" GO="GO:0016517" />
        <LabelValue name="metalloendopeptidase activity" id="8010" parent_idref="8000" GO="GO:0004222" />
        <LabelValue name="molecular_function" id="8000" parent_idref="75" GO="GO:0003674" />
        <LabelValue name="potassium channel inhibitor activity" id="10264" parent_idref="8000" GO="GO:0019870" />
        <LabelValue name="protein serine/threonine phosphatase activity" id="8002" parent_idref="8000" GO="GO:0004722" />
        <LabelValue name="protein tyrosine phosphatase activity" id="8013" parent_idref="8000" GO="GO:0004725" />
        <LabelValue name="retinol isomerase activity" id="8012" parent_idref="8000" GO="GO:0050251" />
        <LabelValue name="serine protease" id="8003" parent_idref="8000" />
        <LabelValue name="specific transcriptional repressor activity" id="8019" parent_idref="8000" GO="GO:0016566" />
        <LabelValue name="telomeric DNA binding" id="8005" parent_idref="8000" GO="GO:0042162" />
        <LabelValue name="transcription factor activity" id="8018" parent_idref="8000" GO="GO:0003700" />
        <LabelValue name="transcription repressor activity" id="8017" parent_idref="8000" GO="GO:0016564" />
        <LabelValue name="tumor necrosis factor receptor activity" id="8004" parent_idref="8000" GO="GO:0005031" />
      </LabelValueList>
    </LabelType>
  </Ontology>
</NCI_PID_XML>
```

The code:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::XPath;

my $file = "nci.xml";
my $xp   = XML::XPath->new(filename => $file);

# One tab-separated row per LabelValue goes to nci.txt
open(my $info, '>', 'nci.txt') or die "Cannot open nci.txt: $!";

foreach my $concept ($xp->findnodes('/NCI_PID_XML/Ontology/LabelType')) {
    # Attributes of the enclosing LabelType
    my $parentid = $concept->getAttribute('id');
    my $type     = $concept->getAttribute('name');

    # This path is relative to the current LabelType node
    foreach my $LabelValue ($concept->findnodes('LabelValueList/LabelValue')) {
        my $id   = $LabelValue->getAttribute('id');
        my $name = $LabelValue->getAttribute('name');
        my $goid = $LabelValue->getAttribute('GO');
        $goid = '' unless defined $goid;    # some LabelValue nodes have no GO attribute

        print $info "$parentid\t$type\t$id\t$name\t$goid\n";
    }
}
close $info;
```

And on the nci.txt file I now have (fields are tab-separated; the last field is empty where there is no GO attribute):

```
8	function	8007	acyltransferase activity	GO:0008415
8	function	8008	calcium- and calmodulin-responsive adenylate cyclase activity	GO:0008294
8	function	11900	casein kinase I activity	GO:0004681
8	function	9634	casein kinase activity	GO:0004680
8	function	75	function	
8	function	8009	guanylate cyclase activity	GO:0004383
8	function	8015	interleukin-12 receptor activity	GO:0016517
8	function	8010	metalloendopeptidase activity	GO:0004222
8	function	8000	molecular_function	GO:0003674
8	function	10264	potassium channel inhibitor activity	GO:0019870
8	function	8002	protein serine/threonine phosphatase activity	GO:0004722
8	function	8013	protein tyrosine phosphatase activity	GO:0004725
8	function	8012	retinol isomerase activity	GO:0050251
8	function	8003	serine protease	
8	function	8019	specific transcriptional repressor activity	GO:0016566
8	function	8005	telomeric DNA binding	GO:0042162
8	function	8018	transcription factor activity	GO:0003700
8	function	8017	transcription repressor activity	GO:0016564
8	function	8004	tumor necrosis factor receptor activity	GO:0005031
```