in reply to Re: How to use threads to write worksheets of excel
in thread How to use threads to write worksheets of excel

it's about 60,000,000 data need write 3 sheets.
  • Comment on Re^2: How to use threads to write worksheets of excel

Replies are listed 'Best First'.
Re^3: How to use threads to write worksheets of excel
by BrowserUk (Patriarch) on Jul 01, 2015 at 04:48 UTC

    Excel::Writer::XLSX isn't thread-safe, so I'd tackle the problem this way:

    #! perl -slw use strict; use threads; use Thread::Queue; use Excel::Writer::XLSX;; sub worker { my( $Q, $region, $sql ) = @_; my $wb = Excel::Writer::XLSX->new("$region.xlsx"); $wb->set_properties( title => 'Node List', author => 'L_WC de +mo', comments => 'Node List' ); $wb->set_optimization(); my $fmt = $wb->add_format( bg_color => 44 ); $fmt->set_align('cent +er'); my $ws = $wb->add_worksheet( $region ); my $row = 0; while( $Q->dequeue ) { $ws->write( $row++, 0, $_ ); } $wb->close; } my @nodeList = ( ['AMS', 'a'], ['APJ', 'ap'], ['EMEA', 'e'] ); my @Qs = map Thread::Queue->new, 1 .. 3; my @threads = map threads->new( \&worker, $Qs[ $_ ], @{ $nodeList[ $_ +] } ), 0 .. 2; for( 1 .. 1e6 ) { my $sheet = int( rand 3 ); $Qs[ $sheet ]->enqueue( $nodeList[ $sheet ][1] . $_ ); } $Qs[ $_ ]->enqueue( undef ) for 0 .. 2; $_->join for @threads;

    That writes 1 million items split between 3 workbooks in around 1 minute. Once the 3 files are produced, it should be simple to merge the single sheets from the 3 zipfiles into one workbook by using Excel itself. (I don't have Excel, so I couldn't test that hypothesis.)


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
    I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!

      Nice++ BrowserUk. Closing Excel::Writer::XLSX is time consuming. Your demonstration does that in parallel which is beneficial for millions of rows.

        Probably the most important thing in my demo code for efficiency is: $wb->set_optimization();.

        With it, the xml gets written on the fly; without it, the whole thing gets accumulated in memory and written as a big spew on close


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!