in reply to Multithreaded xml writing using XML::Writer

No. No. Yes. Yes, by starting again and doing things differently.

Sharing an object ($writer), via closure, means that each thread gets its own copy of the object. Now if the object needs to remember any information internally, say about what it has already done, then each copy will not know what the other copy has done, and both will get very confused.
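A minimal sketch of that copying behaviour, using only the core threads module. The `$state` hash is a hypothetical stand-in for a stateful object such as `$writer`; it is not taken from the original code.

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for an object that remembers what it has done.
my $state = { elements_written => 0 };

# The closure captures $state, but the new thread works on its own clone.
my $thr = threads->create(sub {
    $state->{elements_written}++;
    return $state->{elements_written};
});

my $seen_in_thread = $thr->join;

print "thread saw:  $seen_in_thread\n";              # the thread's private copy
print "parent sees: $state->{elements_written}\n";   # the parent's copy is unchanged
```

The thread increments its own clone, so the parent still sees 0 after the join. An XML writer that tracks open elements internally would diverge in the same way.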

Also, lock() can only be used upon shared variables. And since you aren't using threads::shared, you should probably be getting warnings about the way you are using it. If you aren't, then it must mean you aren't using warnings either, and that is a very unwise decision if you intend to use threading.
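For comparison, a small sketch of lock() applied to a variable that really is shared. The counter and the number of workers are made up for illustration; the only calls used are from the core threads and threads::shared modules.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Declared :shared, so every thread sees the same variable.
my $count :shared = 0;

my @workers;
for my $id (1 .. 4) {
    push @workers, threads->create(sub {
        for (1 .. 1000) {
            lock($count);   # released automatically at the end of this block
            $count++;
        }
    });
}

$_->join for @workers;
print "count = $count\n";
```

Note that this only protects a plain scalar. It does not make an arbitrary object such as an XML::Writer instance safe to use from several threads.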


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

Re^2: Multithreaded xml writing using XML::Writer
by DreamT (Pilgrim) on May 03, 2010 at 11:47 UTC
    Ok, I will have a look at threads::shared.
    But other than that, is it a good thing to use XML::Writer in a multithread fashion?

      I wonder what you are trying to gain by using multiple threads for what is essentially an IO-bound operation.

      I wouldn't try to do IO on the same channel with more than one thread. If you want to create XML fragments and output them in parallel, I would create the fragments in separate threads and send them to one output thread via Thread::Queue.
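A rough sketch of that arrangement, assuming the fragments can be built as complete strings. The source names and the `<record>` element are invented for illustration, and an `undef` is used as an end-of-data marker.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $queue = Thread::Queue->new;

# The single output thread: the only one that ever prints.
my $writer_thread = threads->create(sub {
    my $written = 0;
    print "<records>\n";
    while (defined(my $fragment = $queue->dequeue)) {
        print $fragment;
        $written++;
    }
    print "</records>\n";
    return $written;
});

# Producer threads build complete fragments and hand them over.
my @producers;
for my $source (qw(a b c)) {
    push @producers, threads->create(sub {
        for my $n (1 .. 2) {
            $queue->enqueue(qq{  <record source="$source">$n</record>\n});
        }
    });
}

$_->join for @producers;
$queue->enqueue(undef);            # sentinel: tells the output thread to stop
my $written = $writer_thread->join;
```

Because each queued item is a whole fragment, the outputs of different producers cannot be interleaved mid-element, although the order of the fragments is not deterministic.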

        Well, in theory the data sources are relatively slow, so if I fetch the data from them in parallel, I can gain some time, I think.
        I will look into Thread::Queue.
      is it a good thing to use XML::Writer in a multithread fashion?

      The way you are doing it, no.

      Since both threads are writing to the same file, and they cannot safely both be writing at the same time, you would have to serialise them to prevent them intermingling their outputs. And that means there would be nothing gained by threading that part of the process.

      But, reading the subtext of your question, the gain you are hoping for is not in the writing of the XML output, but in the sourcing of the data written. That is to say, you imply that you are fetching data from two (or more) sources. As you do not show where or how you are sourcing that data, it is impossible to say whether there would be any advantage in using threading for that part of the process.

      If, for example, you are fetching the data from two different servers, there may be some gains to be had by overlapping the fetching of the data, and then feeding the data fetched back to a single thread for writing to a file as XML.
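One way that could look, as a sketch only. `fetch_rows` is a hypothetical placeholder that sleeps instead of querying a real server, and the source and element names are invented. XML::Writer is a CPAN module and is created and used only in the main thread, after the fetchers have been started.

```perl
use strict;
use warnings;
use threads;
use XML::Writer;   # CPAN module; only the main thread uses it

# Hypothetical stand-in for a slow query against one data source.
sub fetch_rows {
    my ($source) = @_;
    sleep 1;                                   # simulate a slow source
    return [ map { "$source-row-$_" } 1 .. 2 ];
}

# Start one fetcher per source so that the waits overlap.
my %fetchers;
for my $source (qw(db1 db2 db3)) {
    $fetchers{$source} = threads->create(\&fetch_rows, $source);
}

# Write to an in-memory handle here; a real file handle works the same way.
open my $fh, '>', \my $xml or die "Cannot open in-memory handle: $!";
my $writer = XML::Writer->new(OUTPUT => $fh, DATA_MODE => 1, DATA_INDENT => 2);

$writer->startTag('export');
for my $source (sort keys %fetchers) {
    my ($rows) = $fetchers{$source}->join;     # blocks until that fetcher is done
    $writer->dataElement('row', $_, source => $source) for @$rows;
}
$writer->endTag('export');
$writer->end;
close $fh;

print $xml;
```

The three simulated fetches run concurrently, so the total wait is close to that of one fetch rather than three. Each fetcher here returns all of its rows at once, which keeps them in memory until the join; whether that is acceptable depends on the data volumes involved.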

      But you'll need to describe the whole process from end to end to get good wisdom on that.


        The source data comes from 3 MySQL databases residing on the same server. There is already quite a heavy load on them, which means that access to them might be quite slow. And there's quite a lot of data to be exported. So, to gain some time, I want to fetch data from them in parallel, using multiple database handles. (Possibly I will end up with the same end time, since the data resides on the same server. What do you think?)

        My thought here was to dump the data to file as quickly as possible so that the server could keep as little data in memory as possible (of course it's possible that my current way of using XML::Writer isn't memory efficient, who knows). So yes, I want to fetch the data in parallel and write it to the output file.