baxy77bax has asked for the wisdom of the Perl Monks concerning the following question:
I need help with this one.
I would like to speed up reading from a file. I know I can do that by first splitting the file into several smaller files and then forking the reading process. Is there any other, more elegant way to do this?
Why am I trying to do this? Well, my script takes a line from the file and processes it. The processing is what is slowing my procedure down, and because I'm while-looping through the file, my idea was to split the file and then fork the whole procedure for every piece of it. Putting the file in memory is not an option (the file is too big for my PC).
So what I'm asking for is a different point of view on this problem of mine. A different idea...
thnx
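This is just an example of what I do to speed up my work:

    use strict;
    use warnings;

    my $file  = 'original_file';
    my $parts = 4;

    # First pass: count the lines.
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    my $counter = 0;
    $counter++ while <$in>;

    # Let's say the line count divides evenly by $parts.
    my $piece = $counter / $parts;

    # Second pass: write each run of $piece lines to its own part file.
    seek($in, 0, 0) or die "Cannot rewind $file: $!";
    my $part_no         = 1;
    my $count_for_piece = 0;
    open(my $out, '>', "file_part_$part_no") or die "Cannot open part file: $!";
    while (my $line = <$in>) {
        if ($count_for_piece == $piece) {
            close($out);
            $part_no++;
            open($out, '>', "file_part_$part_no") or die "Cannot open part file: $!";
            $count_for_piece = 0;
        }
        print {$out} $line;
        $count_for_piece++;
    }
    close($out);
    close($in);

    # Fork one child per part, then wait for all of them.
    my @ch;
    for my $n (1 .. $parts) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;   # undef means fork failed
        if ($pid) {
            push @ch, $pid;                          # parent: remember the child
        }
        else {
            # child: read from file_part_$n and do some processing
            exit;
        }
    }
    waitpid($_, 0) for @ch;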
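One possible way to avoid physically splitting the file, sketched here under assumptions rather than as a tested solution: each forked child opens the same file, seeks to its own byte offset, skips ahead to the next line boundary, and handles only the lines that start inside its range. The helper names `byte_ranges` and `process_range`, the default of 4 workers, and the line counter standing in for the real processing are all hypothetical.

```perl
use strict;
use warnings;

# Split [0, $size) into $n contiguous byte ranges: ([start, end], ...).
sub byte_ranges {
    my ($size, $n) = @_;
    my $chunk = int($size / $n);
    my @ranges;
    for my $i (0 .. $n - 1) {
        my $start = $i * $chunk;
        my $end   = $i == $n - 1 ? $size : $start + $chunk;
        push @ranges, [ $start, $end ];
    }
    return @ranges;
}

# Call $cb->($line) for every line that *starts* inside [$start, $end).
sub process_range {
    my ($file, $start, $end, $cb) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    if ($start > 0) {
        # Step back one byte and discard up to the next newline, so we
        # land on the first line that begins at or after $start.
        seek($fh, $start - 1, 0) or die "Cannot seek in $file: $!";
        my $discard = <$fh>;
    }
    while (tell($fh) < $end) {
        my $line = <$fh>;
        last unless defined $line;
        $cb->($line);
    }
    close($fh);
}

# Usage: perl script.pl FILE [WORKERS]
if (@ARGV) {
    my $file    = $ARGV[0];
    my $workers = $ARGV[1] || 4;    # assumed default
    my $size    = -s $file;
    my @ch;
    for my $range (byte_ranges($size, $workers)) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid) { push @ch, $pid; next; }
        # Child: the real per-line work would replace this counter.
        my $lines = 0;
        process_range($file, @$range, sub { $lines++ });
        print "$$ handled $lines lines\n";
        exit 0;
    }
    waitpid($_, 0) for @ch;
}
```

Because a line belongs to whichever range contains its first byte, no line should be handled twice or skipped, and no temporary part files or line-counting pass are needed. The children still compete for the same disk, so this only helps when the per-line processing, not the reading, is the bottleneck.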
Update:
#read from file 1 and do some processing
I didn't really benchmark that, but what happens here is: the line is read, a number in it is identified with a regex, and that number is then looked up in an in-memory hash table. Then, according to a correlated value from that table, a quick statistical correction (FDR) is calculated for that value. So basically what I was thinking of when trying to speed things up is to divide my calculation and regex identification across several cores (the CPUs are at 100% when I do my parallelization as mentioned). I'll do some benchmarking later and post the results.
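A rough way to check where the time goes before parallelizing, offered only as a sketch: time one pass that just reads the file, then one pass that also runs the regex and the hash lookup. The table contents, the `/(\d+)/` pattern, and the names `lookup_line` and `time_pass` are assumptions for illustration; the actual FDR correction is not shown because its details are not given above.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical lookup table: number found on a line => correlated value.
my %table = (17 => 0.01, 42 => 0.20);

# Extract the first integer on the line and return its table entry,
# or undef when the line has no number or the number is not in the table.
sub lookup_line {
    my ($line, $table) = @_;
    my ($num) = $line =~ /(\d+)/;
    return undef unless defined $num;
    return $table->{$num};
}

# Time one pass over $file, optionally calling $work->($line) per line.
sub time_pass {
    my ($file, $work) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my $t0 = Time::HiRes::time();
    while (my $line = <$fh>) {
        $work->($line) if $work;
    }
    close($fh);
    return Time::HiRes::time() - $t0;
}

# Usage: perl script.pl FILE
if (@ARGV) {
    my $file      = $ARGV[0];
    my $read_only = time_pass($file);
    my $with_work = time_pass($file, sub { lookup_line($_[0], \%table) });
    printf "read only: %.3fs, read + processing: %.3fs\n",
        $read_only, $with_work;
}
```

The first pass may also pay for pulling the file into the operating system's cache, so running the script twice and comparing the second run gives a fairer picture. If the two timings are close, the reading dominates and forking across cores is unlikely to help much.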
Replies are listed 'Best First'.

Re: fork IO
by Corion (Patriarch) on Jun 08, 2009 at 08:51 UTC

Re: fork IO
by jethro (Monsignor) on Jun 08, 2009 at 08:45 UTC

Re: fork IO
by cdarke (Prior) on Jun 08, 2009 at 08:53 UTC

Re: fork IO
by BrowserUk (Patriarch) on Jun 08, 2009 at 13:19 UTC