Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am using XML::Parser to parse an XML document. If the document is large, my program gives an error; otherwise it works fine. Is there any memory limit in XML::Parser?

Replies are listed 'Best First'.
Re: XML::Parser query
by tinman (Curate) on Apr 13, 2001 at 01:46 UTC

    Speaking from my personal experience, it's unlikely that it's a problem with XML::Parser per se.

    I finished parsing a 1.46 GB (yes, gigabytes :o) XML file using XML::Parser on Friday. It took close to 40 minutes, and mine is not the most efficient code out there, but my machine only has 128 MB of RAM and about 600 MB in combined page files. This is on Windows 2000, btw. So some more information on the type of error might help, because I figure that if XML::Parser can handle a gigabyte or so of data, then it's good for pretty much anything :o)

Re: XML::Parser query
by mirod (Canon) on Apr 13, 2001 at 14:48 UTC

    You will have to give us the error message. XML::Parser does not care about the size of the overall file, just about the size of the current stack of open elements. So the only way you can get it to use lots of memory is by having a very, very, VERY deep document.

    You can run top while the document is being parsed and see if that's the problem, but I doubt it very much.

    Big documents tend to come with a higher risk of real XML errors than small ones ;--(
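To get at the actual error message, something along these lines might help. It is only a sketch: the helper name xml_error and the sample strings are made up for illustration, and it relies on XML::Parser dying when the document is not well-formed, with ErrorContext asking for a couple of lines of context around the failure.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper (not part of XML::Parser): returns undef when the
# document is well-formed, otherwise the text of the parser's error.
sub xml_error {
    my ($xml) = @_;

    # ErrorContext => 2 asks for two lines of context around the error.
    my $parser = XML::Parser->new(ErrorContext => 2);

    # parse() dies on a well-formedness error, so trap it with eval.
    my $ok = eval { $parser->parse($xml); 1 };
    return $ok ? undef : $@;
}

# Illustrative input with a mismatched tag.
my $err = xml_error('<doc><item>unclosed</doc>');
print defined $err ? "Parse failed: $err\n" : "Well-formed\n";
```

For a file on disk, the same eval wrapper should work around parsefile instead of parse.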

Re: XML::Parser query
by ok (Beadle) on Apr 13, 2001 at 01:35 UTC
    1. What is the error?
    2. How much is "large?"
    3. How much memory are you working with?
      It seems the problem is not size: the content of a single element is causing two successive calls to the character handler, each with part of the complete string. Surprisingly, I am parsing approximately 3000 lines and it splits on only 1 line, which passes a wrong argument to the next program, and my program dies. Do you have any idea about that? Thanks for all your replies.

        This is documented behaviour of XML::Parser, so please read the docs. Because of its buffering strategy, a single run of character data can be delivered to the Char handler in several calls.

        See the review (XML::Parser) for a way to cope with it.
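One common way to cope, sketched here without claiming it is what the review does, is to append in the Char handler and only use the text in the End handler, where it is known to be complete. The helper name element_texts and the sample document are hypothetical, and the sketch assumes the element of interest is never nested inside itself.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns the complete text of every <$tag> element.
# Assumes <$tag> elements are not nested inside one another.
sub element_texts {
    my ($xml, $tag) = @_;
    my @texts;
    my $buffer = '';
    my $inside = 0;

    my $parser = XML::Parser->new(
        Handlers => {
            # Reset the buffer when the element of interest opens.
            Start => sub {
                my ($expat, $name) = @_;
                if ($name eq $tag) { $inside = 1; $buffer = ''; }
            },
            # May be called several times for one run of text,
            # so always append and never assign.
            Char => sub {
                my ($expat, $string) = @_;
                $buffer .= $string if $inside;
            },
            # Only here is the element's text known to be complete.
            End => sub {
                my ($expat, $name) = @_;
                if ($name eq $tag) { push @texts, $buffer; $inside = 0; }
            },
        },
    );

    $parser->parse($xml);
    return @texts;
}

# Illustrative input; the entity reference is a typical place for a split.
my @names = element_texts(
    '<list><name>Tom &amp; Jerry</name><name>Spike</name></list>', 'name');
print "$_\n" for @names;
```

Whatever gets passed to the next program should then come from the End handler, not from an individual Char call.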