RuthlessRonin has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am relatively new to Perl, but I am trying to write a piece of code that will take an outputted .xml file from a previous program which contains several directories. I want the program to scan for the line that contains <Worksheet>, which is where I want it to start reading and recording the information from the first directory, and stop on </Workbook> of the directory 1 .xml file. Then it should scan the next directory (directory 2, let's call it) for its first line that contains <Worksheet> and read and record into the new output file until it reaches the line that contains </Workbook> in that directory, and keep looping and doing this for several directories. All of this should then end up in one output file. Can anyone help me out? Thanks a bunch.

Re: readings data, taking parts, outputting parts in a file
by Fletch (Bishop) on May 28, 2009 at 19:17 UTC
Re: readings data, taking parts, outputting parts in a file
by wjw (Priest) on May 29, 2009 at 00:08 UTC
    I agree wholeheartedly with the previous comment on how to ask a question. However, what you are asking here is sort of interesting to me, so I want to ask some questions about your question, so to speak.
    • The xml files are located in various directories of the program that generates them?
    • When you refer to directory, are you referring to a file named 'directory.xml', or a directory that contains some otherwise named xml file(s)?
    • Making a wild guess here: you want to take multiple xml files generated by a spreadsheet program and concatenate all the worksheets contained in each workbook of the original spreadsheet files into one workbook file?

    Let's assume that last is true. I think perhaps XML::Twig might help you do what you want. On the other hand, it should not be too tough to simply open each xml file, grab everything between <Worksheet> and </Workbook>, and plop it in a new file either. But that depends on what you want to do, which you really do need to clarify...
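As a rough illustration of the "open each file and grab the block" approach, here is a minimal sketch. It assumes the XML files are passed on the command line (one per directory) and that each marker sits on a line of its own; the sub name `copy_worksheets` and the file names in the usage comment are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy lines from $in to $out, starting at the first line that contains
# a <Worksheet ...> tag and stopping after the first line that contains
# </Workbook>. Returns the number of lines copied.
sub copy_worksheets {
    my ($in, $out) = @_;
    my ($copying, $count) = (0, 0);
    while (my $line = <$in>) {
        # \b keeps tags such as <WorksheetOptions> from starting the copy
        $copying = 1 if !$copying && $line =~ /<Worksheet\b/;
        next unless $copying;
        print {$out} $line;
        $count++;
        last if $line =~ m{</Workbook>};
    }
    return $count;
}

# Hypothetical usage (assumed layout, one file per directory):
#   perl merge.pl dir1/out.xml dir2/out.xml > combined.xml
for my $file (@ARGV) {
    open my $in, '<', $file or die "Cannot open $file: $!";
    copy_worksheets($in, \*STDOUT);
    close $in;
}
```

Note that simply concatenating these blocks will probably not give a well-formed workbook, since every block ends with its own </Workbook>. Whether that matters depends on what the combined file is for.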
    Best of luck.. :-)
    • ...the majority is always wrong, and always the last to know about it...
    • The Spice must flow...
    • ..by my will, and by will alone.. I set my mind in motion
Re: readings data, taking parts, outputting parts in a file
by arc_of_descent (Hermit) on May 29, 2009 at 07:39 UTC

    Read about the Range Operators, especially the part that explains how to use them in scalar context. This method can be applied to get at well-defined blocks in your text file.

    You should also take a look at XML::Simple, which helps with XML parsing. If your XML files are huge, I recommend XML::Parser instead, but that would require some more effort to learn. Also check out XML::Parser::Lite.

    Sample code for using the range operators.

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (<DATA>) {
        print if (/<marker>/ .. /<\/marker>/);
    }

    __DATA__
    In scalar context, ".." returns a boolean value. The operator is bistable,
    like a flip-flop, and emulates the line-range (comma) operator of sed,
    awk, and various editors.
    <marker>
    Each ".." operator maintains its own boolean state. It is false as long
    as its left operand is false.
    </marker>
    Once the left operand is true, the range operator stays true until the
    right operand is true, AFTER which the range operator becomes false again.
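The same flip-flop can be pointed at the markers from the original question. The sketch below also uses the operator's return value: while the range is active it yields a sequence number, and the last one has "E0" appended, which makes it possible to drop the closing `</Workbook>` line. The sub name `worksheet_lines` and the sample data are invented for illustration, and it assumes each marker is on its own line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the lines from the first <Worksheet ...> line up to, but not
# including, the closing </Workbook> line.
sub worksheet_lines {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # In scalar context ".." returns 1, 2, 3, ... while the range is
        # active; the final value has "E0" appended (for example "4E0").
        my $seq = /<Worksheet\b/ .. m{</Workbook>};
        next unless $seq;          # outside the range
        next if $seq =~ /E0$/;     # the closing </Workbook> line itself
        push @kept, $_;
    }
    return @kept;
}

# Made-up sample input standing in for one spreadsheet XML file
my @sample = map { "$_\n" }
    '<Workbook>', '<Styles/>', '<Worksheet ss:Name="Sheet1">',
    '<Row/>', '</Worksheet>', '</Workbook>';

print worksheet_lines(@sample);
```

One caveat: the `..` operator keeps its state between calls to the sub, so if a file is missing its `</Workbook>` line, the range would still be "open" when the next file is processed.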

    --
    Rohan