Hi everyone, I am rather new to Perl and was hoping you could help me. I need to parse a file that is in an XML-like format (not full XML syntax) and pull out data to run different types of jobs. The file looks like this:

<PROJECT_ID>12345</PROJECT_ID> <JOBID>101</JOBID> <TYPE1>add</TYPE1> <FILE1>/tmp/file_data_gros</FILE1> <JOBID>102</JOBID> <TYPE2>delete</TYPE2> <FILE2>/tmp/file_myvalues</FILE2>

The above file will always have a PROJECT_ID number and can have multiple JOBID fields defined in it. Each JOBID will always have a corresponding TYPE* and FILE* associated with it. What I am trying to write is something that can loop through multiple files like this. When each file is processed, the code would pull out the PROJECT_ID value and then, for each JOBID (in order), pull out the corresponding values for TYPE* and FILE*. So in the above example the code would pull out the PROJECT_ID value, get the value of the first JOBID seen in the file, then get the values for TYPE1 and FILE1, and output a string in the following format with the value of each field:

PROJECT_ID(value) JOBID(value) TYPE1(value) FILE1(value)

I would then do some processing with these values. The loop would then carry on to the next JOBID seen and output the values for its fields:

PROJECT_ID(value) JOBID(value) TYPE2(value) FILE2(value)

I would then do some processing with these values. The loop would then carry on to the next JOBID seen and do the same until there are no more to process within this file.

I am really struggling with how to go about doing this. I was thinking maybe I should read the whole file into a hash, but I am not too sure that this is the right approach. I have written this so far:

open FILE, "$file_to_process" or die;
my %hash;
while (my $line=<FILE1>) {
    chomp;
    (my $xmltag, $xmlvalue) = split /\<|\>/, $line;
    $hash{$xmltag} = $xmlvalue;
}

If anyone can help me with some code that would be able to do what I need I would greatly appreciate it. My attempt is not working at all :-(
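
One possible way to approach this, offered as a minimal sketch rather than a definitive solution: instead of filling a single hash (where keys such as JOBID would overwrite each other), read the whole file into a string and walk through it with a global regex match, collecting one record per JOBID in the order seen. The sub name `parse_jobs` and the record keys are made up for illustration. The sketch assumes that PROJECT_ID appears once per file and that every JOBID is immediately followed by its TYPEn and then its FILEn tag, as in the sample; it does not care whether the tags sit on one line or several.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the file contents as one string and returns
# a list of hashrefs, one per JOBID, in the order they appear in the text.
sub parse_jobs {
    my ($text) = @_;

    # Assumption: PROJECT_ID appears exactly once per file.
    my ($project_id) = $text =~ m{<PROJECT_ID>(.*?)</PROJECT_ID>}s
        or die "No PROJECT_ID found\n";

    my @jobs;
    # Assumption: each JOBID is followed by TYPEn and then FILEn.
    # The backreferences \2 and \4 require each closing tag to match
    # the opening tag that was captured.
    while ($text =~ m{
            <JOBID>(.*?)</JOBID>     \s*
            <(TYPE\d+)>(.*?)</\2>    \s*
            <(FILE\d+)>(.*?)</\4>
        }gsx) {
        push @jobs, {
            project_id => $project_id,
            jobid      => $1,
            type_tag   => $2,
            type       => $3,
            file_tag   => $4,
            file       => $5,
        };
    }
    return @jobs;
}

# Process every file named on the command line.
for my $path (@ARGV) {
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my $text = do { local $/; <$fh> };    # slurp the whole file
    close $fh;

    for my $job (parse_jobs($text)) {
        printf "PROJECT_ID(%s) JOBID(%s) %s(%s) %s(%s)\n",
            @{$job}{qw(project_id jobid type_tag type file_tag file)};
        # ... per-job processing would go here ...
    }
}
```

Run against the sample file above, this should print one line for JOBID 101 (with TYPE1 and FILE1) and one for JOBID 102 (with TYPE2 and FILE2). As for the posted attempt, a few things stand out: the loop reads from `FILE1` although the handle opened is `FILE`, a bare `chomp` works on `$_` rather than `$line`, and splitting on `<` or `>` returns an empty first field, so `$xmltag` never receives the tag name.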


In reply to Parsing an file that has xml like syntax by Anonymous Monk
