It's more generic in the sense that it doesn't assume that the root tag of the first document is foo. As you did not mention any constraint on the size of the documents, I did not assume any.
If there are constraints, either on each individual file or on the resulting file, you should mention them, because the solution would be different. Depending on the constraints in terms of speed and the potential size of the documents, the best solution could be regexp based (that could be made quite robust, provided your XML files do not include DTDs), XML::LibXML based (if individual files are not too big to be loaded in memory), XML::Parser based (rather easy if no DTD is used, a bit more complicated otherwise), or XML::Twig based (slower, but it can deal with arbitrarily sized documents, and it is quite easy to code, although a bit more complex than the examples I gave previously).
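For example, an XML::Twig based merge could look roughly like the sketch below. The 'records' root tag and 'record' child tag are just placeholders for whatever your documents actually use, and the merged document goes to STDOUT:

#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# placeholder tag names: adjust to your actual documents
my $root_tag  = 'records';
my $child_tag = 'record';

print qq{<?xml version="1.0" encoding="UTF-8"?>\n<$root_tag>\n};

for my $file (@ARGV) {
    my $twig = XML::Twig->new(
        twig_handlers => {
            # fires each time a child of the root has been fully parsed
            "$root_tag/$child_tag" => sub {
                my ( $t, $elt ) = @_;
                $elt->print;    # copy the element to the output
                print "\n";
                $t->purge;      # free everything parsed so far
            },
        },
    );
    $twig->parsefile($file);
}

print "</$root_tag>\n";

Because each child is written out and purged as soon as it is complete, the script never holds more than one record in memory, which is what lets it handle arbitrarily sized documents.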
But all those potential constraints would be part of the requirements for your code, so you would have to express them if you want a response that really fits your problem.
And no, I am not trying to confuse you to cover up the fact that my answer wasn't that smart ;--)
To clear things up a bit: all XML files to be concatenated have to have the same, known root tag (and the script will probably warn if it's wrong and skip that file).
Additionally, the result file will surely fit into memory (I'd be very surprised if a result file even approached 50MB), so no need to worry.
And yes, there is a DTD available, but sadly it doesn't match the required structure of the XML files. The developers "solved" this problem by not referencing the DTD.
OK, so either an XML::LibXML or an XML::Twig solution will work.
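With the constraints you listed (known root tag, result fits in memory), an XML::LibXML version could be as simple as the sketch below; the 'records' tag and the output filename are placeholders, and files whose root tag doesn't match are warned about and skipped, as you described:

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $root_tag = 'records';     # placeholder: your known root tag
my $out_file = 'merged.xml';  # placeholder output name

my $merged = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $root   = $merged->createElement($root_tag);
$merged->setDocumentElement($root);

for my $file (@ARGV) {
    my $doc = eval { XML::LibXML->load_xml( location => $file ) };
    unless ($doc) {
        warn "skipping $file: not well-formed ($@)";
        next;
    }
    my $in_root = $doc->documentElement;
    unless ( $in_root->nodeName eq $root_tag ) {
        warn "skipping $file: root is '" . $in_root->nodeName
           . "', expected '$root_tag'\n";
        next;
    }
    # import every child of the input root into the merged document
    $root->appendChild( $merged->importNode($_) ) for $in_root->childNodes;
}

$merged->toFile( $out_file, 1 );    # 1 = pretty-print

If the result ever did threaten to outgrow memory, the same skeleton carries over to XML::Twig with the streaming approach sketched above.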
Your comment "there is a DTD available, but sadly it doesn't match the required structure of the XML files. The developers "solved" this problem by not referencing the DTD" is both hilarious, sad, and even more sadly not surprising at all. A _LOT_ of XML in the Real World (tm) is of
tragically bad quality.