in reply to Re: How to speed up/multi thread extract from txt files?
in thread How to speed up/multi thread extract from txt files?

I first load the hash with only the test names that I'm interested in extracting, then use the code below to do the actual extraction into a text file.
I ran a bare while loop that just reads through the file, and it takes about 25 secs; when I run the file through my code below, it takes about 52 secs.
Any other helpful advice? Thanks~
## extract all the param values from the stdf
sub get_paramValue {
    my ($stdf,$lot,$oper,$sum,%param_flag) = @_;
    my ($output);
    print "Running with stdf:$stdf.\n";
    &log("get_paramValue","Running with stdf:$stdf.");
    if(-e $stdf){
        ## create the output file name, similar to the stdf name but with .log ext
        $output = $outputdir.$lot."_".$oper."_".$sum.".log";
        open(OUT, ">$output") or &log("get_paramValue","Can't write to output: $output");
        print OUT "tname,idx,param_val\n";
        open(STDF, $stdf) or &log("get_paramValue","Die can't read from stdf:$stdf.");
        my (@tmp,$testname,$testFound,$paramVal,$unit_count);
        while(<STDF>){
            if(/3_prtnm_/){
                @tmp = split(/3_prtnm_/);
                $unit_count = &trim($tmp[1]);
            }
            elsif(/2_tname_/){
                @tmp = split(/2_tname_/);
                $testname = &trim($tmp[1]);
                if(exists $param_flag{$testname}){
                    $testFound = 1;
                }
            }
            elsif($testFound){
                if(/2_mrslt_/){
                    @tmp = split(/2_mrslt_/);
                    $paramVal = &trim($tmp[1]);
                    print OUT "$testname,$unit_count,$paramVal\n";
                    $testFound = 0;
                }
            }
        } ## END while
        close(STDF);
        close(OUT);
    } ## END IF
    return $output;
} ## end sub

Re^3: How to speed up/multi thread extract from txt files?
by weismat (Friar) on Jan 09, 2008 at 17:01 UTC
    I have not really understood your code, but sometimes it helps a lot to pass arguments by reference instead of by value.
    Passing by value means an additional copy of your data is made every time the function is called.
    Given your amount of data, passing by reference can speed things up considerably.
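
As a minimal sketch of the difference: in the posted sub, `%param_flag` is flattened into `@_` and copied into a fresh hash on every call. The sub names and the five-key hash below are invented for illustration; only the calling convention matters.

```perl
use strict;
use warnings;

# Hypothetical lookup table standing in for %param_flag.
my %param_flag = map { ("test_$_" => 1) } 1 .. 5;

# By value: the hash is flattened into @_ and copied into a new hash
# each time the sub is called.
sub count_by_value {
    my ($label, %flags) = @_;
    return scalar keys %flags;
}

# By reference: a single scalar (the reference) is passed; the hash
# itself is not copied.
sub count_by_ref {
    my ($label, $flags) = @_;
    return scalar keys %$flags;
}

my $n_value = count_by_value('demo', %param_flag);
my $n_ref   = count_by_ref('demo', \%param_flag);
print "by value: $n_value, by ref: $n_ref\n";
```

Note that the copy happens once per call, not once per input line, so how much this helps depends on how large the hash is and how often the sub is called.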
Re^3: How to speed up/multi thread extract from txt files? (Updated)
by BrowserUk (Patriarch) on Jan 09, 2008 at 17:47 UTC

    Ignore (most of) this!

    That's the trouble with running code in your head: you don't always notice scoping issues. An answer to the question at the end would still help, though.

    Did you tidy your code up for posting? I ask because there is a logic error in what you've posted that (I think) means that it cannot do what you are wanting it to do.

    if(/3_prtnm_/){
        @tmp = split(/3_prtnm_/);
        $unit_count = &trim($tmp[1]);
    }
    elsif(/2_tname_/){
        @tmp = split(/2_tname_/);
        $testname = &trim($tmp[1]);
        if(exists $param_flag{$testname}){
            $testFound = 1;
        }
    }
    elsif($testFound){
        if(/2_mrslt_/){
            @tmp = split(/2_mrslt_/);
            $paramVal = &trim($tmp[1]);
            print OUT "$testname,$unit_count,$paramVal\n";
            $testFound = 0;
        }
    }
    1. You will only ever print anything to the output file if $testFound is true.
    2. But $testFound is only ever set true inside another branch of the same if/else cascade.
    3. The same is also true of the values of both $testname and $unit_count.
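
The scoping point can be checked with a small sketch. Because the variables are declared outside the loop, a value set by one branch on one line is still there when a different branch runs on a later line. The sample records and the `vcc_min` key are invented, since the real input format is not shown in the thread, and regex captures are used here in place of the original split/trim calls.

```perl
use strict;
use warnings;

# Invented sample records; the real STDF text format is an assumption.
my @lines = (
    "3_prtnm_ 1",
    "2_tname_ vcc_min",
    "2_mrslt_ 1.25",
    "2_tname_ unwanted",
    "2_mrslt_ 9.99",
    "3_prtnm_ 2",
    "2_tname_ vcc_min",
    "2_mrslt_ 1.31",
);
my %param_flag = (vcc_min => 1);    # hypothetical test name of interest

# Declared outside the loop, so the values persist between iterations.
my ($unit_count, $testname, $testFound);
my @rows;

for (@lines) {
    if (/3_prtnm_\s*(\S+)/) {
        $unit_count = $1;
    }
    elsif (/2_tname_\s*(\S+)/) {
        $testname  = $1;
        $testFound = exists $param_flag{$testname};
    }
    elsif ($testFound && /2_mrslt_\s*(\S+)/) {
        push @rows, "$testname,$unit_count,$1";
        $testFound = 0;
    }
}
print "$_\n" for @rows;
```

One deliberate difference from the posted code: this sketch clears the flag when a non-matching test name is seen, whereas the original only ever sets it to 1 in that branch.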

    Also, how many keys are there in %param_flag and what do they look like?

    If you can clarify those, I'll try and adapt the logic of your subroutine to use the big string technique I mentioned above.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^3: How to speed up/multi thread extract from txt files?
by BrowserUk (Patriarch) on Jan 09, 2008 at 23:38 UTC

    In order to be able to help you further I would need to see a short example of the contents of the input file. And also the number of keys in %param_flag and what they look like.

