Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks! I have an XML file which is very mixed (shown below):
<?xml version='1.0' encoding='ISO-8859-1' ?> <!DOCTYPE DEFTABLE SYSTEM "C:\Program Files\BMC Software\CONTROL-M EM +6.2.01\Data\resource\deftable.dtd"> <DEFTABLE > <SCHED_GROUP GROUP="BEPYRG-NA-HKEEP" OWNER="pyruat11" AUTHOR="u038451" CONFIRM="0" JOBNAME="BPYRG000-NA-HKEEP" MAXWAIT="0" MEMNAME="BPYRD007-PYRPACK" DATACENTER="D015-E1200-IB" TABLE_NAME="BEPYRG-NA-HKEEP" ADJUST_COND="0" APPLICATION="BEPYR_NA_PYRPACK" MULTY_AGENT="N" USED_BY_CODE="0" CHANGE_USERID="u147393" CREATION_DATE="20080731" > <OUTCOND NAME="PL-BPYRG000-NA-HKEEP-OK" SIGN="ADD" ODATE="ODAT"/> <AUTOEDIT EXP="%%PYR_INST=eqny" /> <AUTOEDIT EXP="%%ORDER_TABLENAME=BEPYR" /> <AUTOEDIT EXP="%%APPENV=devctm01" /> <TAG APR="1" AUG="1" DEC="1" FEB="1" JAN="1" JUL="1" JUN="1" MAR="1" MAY="1" NOV="1" OCT="1" SEP="1" RETRO="0" SHIFT="IGNOREJOB" MAXWAIT="05" SHIFTNUM="+00" TAG_NAME="CMTWTF__-TAG" WEEKDAYS="1,2,3,4,5" DAYS_AND_OR="OR" /> <TAG APR="1" AUG="1" DEC="1" FEB="1" JAN="1" JUL="1" JUN="1" MAR="1" MAY="1" NOV="1" OCT="1" SEP="1" DAYS="ALL" RETRO="0" SHIFT="IGNOREJOB" MAXWAIT="05" SHIFTNUM="+00" TAG_NAME="CMTWT___S-TAG" WEEKDAYS="1,2,3,4,0" DAYS_AND_OR="AND" /> <JOB APR="1" AUG="1" DEC="1" FEB="1" JAN="1" JUL="1" JUN="1" MAR="1" MAY="1" NOV="1" OCT="1" SEP="1" DAYS="ALL" OWNER="pyruat11" RETRO="0" SHIFT="IGNOREJOB" SYSDB="0" AUTHOR="u038451" CYCLIC="0" NODEID="BEPYR_NA_HKEEP" CMDLINE="Dummy Job" CONFIRM="0" JOBNAME="BPYRD000-PYRPACK-START-HOUSEKEEP" MAXDAYS="0" MAXRUNS="0" MAXWAIT="5" MEMNAME="BPYRD000-PYRPACK-HOUSEKEEP" AUTOARCH="0" CRITICAL="0" INTERVAL="00000M" MAXRERUN="0" PRIORITY="5" SHIFTNUM="+00" TASKTYPE="Dummy" TIMEFROM="0800" WEEKDAYS="1,2,3,4,5" IND_CYCLIC="START" APPLICATION="BEPYR_NA_PYRPACK" DAYS_AND_OR="OR" CHANGE_USERID="u147393" CREATION_DATE="20080731" CREATION_TIME="092311" CREATION_USER="u147393" TAG_RELATIONSHIP="OR" USE_INSTREAM_JCL="0" > <OUTCOND NAME="PL-BPYRD000-PYRPACK-START-HOUSEKEEP-OK" SIGN="ADD" +ODATE="ODAT" /> <TAG_NAMES TAG_NAME="CMTWT___S-TAG"/> </JOB> <JOB APR="1" 
AUG="1" DEC="1" FEB="1" JAN="1" JUL="1" JUN="1" MAR="1" MAY="1" NOV="1" OCT="1" SEP="1" DAYS="ALL" OWNER="pyruat11" RETRO="0" SHIFT="IGNOREJOB" SYSDB="0" AUTHOR="u038451" CYCLIC="0" NODEID="BEPYR_NA_HKEEP" CMDLINE="Dummy Job" CONFIRM="0" JOBNAME="BPYRD000-PYRPACK-START-HOUSEKEEP" MAXDAYS="0" MAXRUNS="0" MAXWAIT="5" MEMNAME="BPYRD000-PYRPACK-HOUSEKEEP" AUTOARCH="0" CRITICAL="0" INTERVAL="00000M" MAXRERUN="0" PRIORITY="5" SHIFTNUM="+00" TASKTYPE="Dummy" TIMEFROM="0800" WEEKDAYS="1,2,3,4,5" IND_CYCLIC="START" APPLICATION="BEPYR_NA_PYRPACK" DAYS_AND_OR="OR" CHANGE_USERID="u147393" CREATION_DATE="20080731" CREATION_TIME="092311" CREATION_USER="u147393" TAG_RELATIONSHIP="OR" USE_INSTREAM_JCL="0" > <OUTCOND NAME="PL-BPYRD000-PYRPACK-START-HOUSEKEEP-OK" SIGN="ADD" +ODATE="ODAT" /> <TAG_NAMES TAG_NAME="CMTWT___S-TAG"/> </JOB> </SCHED_GROUP> </DEFTABLE>
The JOB tag can repeat any number of times in the XML files I'm using. I am looking to extract the JOBNAME and WEEKDAYS from the XML and output them in the following format, or as XML:
Jobname = A Weekdays=1,2,3 Jobname = B Weekdays=1,3,5 etc
I have tried using XPath and XML::Twig, but they simply give me the data in the following format:
jobnameAWeekdays1,2,3JobnameBWeekdays1,3,5
Can someone please help me output this in the format I want? Thanks guys!

Replies are listed 'Best First'.
Re: XML extraction
by Bloodnok (Vicar) on Aug 05, 2008 at 10:39 UTC
    Hi,

    I can't comment on either Xpath or XML::Twig, but for something as simple as this, XML::Simple would/could do the job - XMLin() reads a valid XML file into a hash and returns the hash ref. e.g. ...

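    use strict;
    use warnings;
    use XML::Simple;

    # XMLin() dies on a parse error and, by default, drops the root element (DEFTABLE).
    # ForceArray keeps JOB as an array even if there is only one; KeyAttr => [] disables folding.
    my $hash = XMLin("file_name", ForceArray => ['JOB'], KeyAttr => []);
    foreach my $job (@{ $hash->{SCHED_GROUP}->{JOB} }) {
        print "$job->{JOBNAME}, $job->{WEEKDAYS}\n";
    }

    ... or some such. Obviously, this would require adjustment (to a nested for loop) should there be more than one SCHED_GROUP...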

    Disclaimer: I'm afraid the code is untested - our systems don't have XML::Simple installed as standard:-((
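The nested loop mentioned above, for files with more than one SCHED_GROUP, could be sketched roughly as follows. This is only an illustration: the sub name `jobs_from_xml` and the cut-down sample XML are made up for the example, and it assumes XML::Simple is installed.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical helper: return [JOBNAME, WEEKDAYS] pairs for every JOB
# in every SCHED_GROUP of the given XML string.
sub jobs_from_xml {
    my ($xml) = @_;
    # ForceArray keeps single SCHED_GROUP/JOB elements as one-element arrays;
    # KeyAttr => [] disables XML::Simple's default folding on name/key/id.
    my $data = XMLin($xml, ForceArray => [qw(SCHED_GROUP JOB)], KeyAttr => []);
    my @jobs;
    for my $group (@{ $data->{SCHED_GROUP} || [] }) {
        for my $job (@{ $group->{JOB} || [] }) {
            push @jobs, [ $job->{JOBNAME}, $job->{WEEKDAYS} ];
        }
    }
    return @jobs;
}

# Made-up sample in the shape of the OP's file (not the real data)
my $sample = <<'XML';
<DEFTABLE>
  <SCHED_GROUP GROUP="G1">
    <JOB JOBNAME="A" WEEKDAYS="1,2,3"/>
  </SCHED_GROUP>
  <SCHED_GROUP GROUP="G2">
    <JOB JOBNAME="B" WEEKDAYS="1,3,5"/>
  </SCHED_GROUP>
</DEFTABLE>
XML

print "Jobname = $_->[0] Weekdays=$_->[1]\n" for jobs_from_xml($sample);
```

XMLin() also accepts a file name instead of an XML string, so the same sub should work on the real file once its syntax errors are fixed.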

    HTH,

    At last, a user level that overstates my experience :-))
Re: XML extraction
by dHarry (Abbot) on Aug 05, 2008 at 09:49 UTC

    I recall this is possible with XML::Twig, i.e. retrieve individual values for attributes. Could you post some of your code so we can see what you have tried?

    PS: I don't want to start a metaphysical discussion on elements versus attributes, but you seem to go a bit over the top with using attributes. See, for example, When to use elements versus attributes.
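For reference, retrieving individual attribute values with XML::Twig, as suggested above, might look something like the sketch below. It is not the OP's code: the sub name `job_lines` and the sample XML are invented for the example, and the output format is a guess at what the OP asked for.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: return one "Jobname = X Weekdays=Y" line
# per JOB element found in the given XML string.
sub job_lines {
    my ($xml) = @_;
    my @lines;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called once for each JOB element as soon as it is fully parsed
            JOB => sub {
                my ($t, $job) = @_;
                push @lines, sprintf('Jobname = %s Weekdays=%s',
                    $job->att('JOBNAME'), $job->att('WEEKDAYS'));
                $t->purge;    # release the parts of the tree already handled
            },
        },
    );
    $twig->parse($xml);
    return @lines;
}

# Made-up sample in the shape of the OP's file (not the real data)
my $sample = <<'XML';
<DEFTABLE>
  <SCHED_GROUP GROUP="G1">
    <JOB JOBNAME="A" WEEKDAYS="1,2,3"/>
    <JOB JOBNAME="B" WEEKDAYS="1,3,5"/>
  </SCHED_GROUP>
</DEFTABLE>
XML

print "$_\n" for job_lines($sample);
```

To read from a file rather than a string, `$twig->parsefile($path)` can be used in place of `parse`. Printing each line with its own newline is what keeps the values from running together.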

Re: XML extraction
by toolic (Bishop) on Aug 05, 2008 at 14:52 UTC
    Just to provide some history/context for this thread, this is the 4th time this question has been posted over the last couple of days, all presumably by smunro16. Refer to XML::Twig Output Help for actual code posted by the OP. Monks have been busy considering, and janitors have been busy reaping/merging.

    Note of caution: the XML in this OP has a few syntax errors, according to XML::Twig ("+" at the beginnings of lines, and the DOCTYPE line was also problematic).
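A quick way to check XML for this kind of error before trying to extract anything is XML::Twig's `safe_parse`, which wraps the parse in an eval instead of dying. A minimal sketch, where the sub name `xml_error` and the one-line samples are invented for illustration:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: return '' if the XML string is well-formed,
# otherwise the parser's error message.
sub xml_error {
    my ($xml) = @_;
    my $twig = XML::Twig->new;
    # safe_parse returns a true value on success and leaves the error in $@ on failure
    return $twig->safe_parse($xml) ? '' : $@;
}

# A stray "+" before an attribute name, as in the OP's XML, is not well-formed
my $bad  = '<OUTCOND NAME="X" SIGN="ADD" +ODATE="ODAT" />';
my $good = '<OUTCOND NAME="X" SIGN="ADD" ODATE="ODAT" />';

for my $xml ($bad, $good) {
    my $err = xml_error($xml);
    print $err eq '' ? "OK\n" : "Parse error: $err\n";
}
```

For a file on disk, `safe_parsefile` works the same way.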