arun10427 has asked for the wisdom of the Perl Monks concerning the following question:

Hey all, I need help with this problem:

1. I have a huge data file which consists of millions of records.
2. There is a chunk column which has chunks of counters in it.

Ex: Chunk Processor 1 1 1 2 1 2 1 3 1 0 1 2 f 1 f 2 f 4 f 4 s 6 s 3 s 2 s 1 f 3 g 2 g 6

I want to do some calculations on the processor column based on the chunk column. That is, for all the 1s I perform a function on the processor column, for all the fs I perform a function on the processor column, and so on. I am confused as to where to place the while loop: while (the data belongs to one chunk) { } ? Could you please help me with it? I would really appreciate it. :)

Replies are listed 'Best First'.
Re: Confusion on how to place while loop!
by desemondo (Hermit) on Jan 28, 2010 at 03:00 UTC
    Assuming your data file looks like this:

    Chunk Processor
    1     1
    1     2
    1     2
    1     3
    1     0
    1     2
    f     1
    f     2
    f     4
    f     4
    s     6
    s     3
    s     2
    s     1
    f     3
    g     2
    g     6

    ... this might be a suitable solution. (Assuming that all data for a particular chunk is grouped together in your data file, and assuming chunk values never repeat...)

    Update: If your data file is different to above, please reply or update your original post with further details - and please format your post using the writeup tips link toolic provided.



    use strict;
    use warnings;

    my $chunk_old;
    my @processor_nums;

    while (my $line = <DATA>) {
        my ($chunk, $processor_num) = $line =~ m{(\w+)\s+(\d+)};
        next if !defined $chunk;    # skip blank or malformed lines (e.g. a header)

        if (!defined $chunk_old) { $chunk_old = $chunk }

        if ($chunk eq $chunk_old) {
            # still on the same chunk, append processor number to list.
            push @processor_nums, $processor_num;
        }
        else {
            # we've got all processor numbers for this chunk, now calculate.
            my $result = do_calculation(\@processor_nums);
            print "$result\n";    # or do something else with it...

            # empty the list, as we're starting with the new chunk.
            @processor_nums = ();
            # save the processor number we've read but not yet used in calculation
            push @processor_nums, $processor_num;
            # update the chunk flag to the new chunk value
            $chunk_old = $chunk;
        }
    }

    # no later line triggers the calculation for the final chunk, so do it here.
    print do_calculation(\@processor_nums), "\n" if @processor_nums;

    sub do_calculation {
        my $processor_list_ref = shift;
        my $total = 0;
        for my $value (@$processor_list_ref) {
            # do calculation stuff here....
            $total = $total + $value;
        }
        return $total;    # whatever result(s) you get
    }

    __DATA__
    1 1
    1 2
    1 2
    1 3
    1 0
    1 2
    f 1
    f 2
    f 4
    f 4
    s 6
    s 3
    s 2
    s 1
    f 3
    g 2
    g 6

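    If rows for the same chunk are not always adjacent (the sample has an "f 3" row after the "s" rows), one possible alternative is to collect values per chunk in a hash. This is only a sketch: the helper name group_by_chunk and the hard-coded sample are made up for illustration, and summing stands in for whatever calculation is really needed. For millions of records it may be better to keep only a running result per chunk rather than every value.

```perl
use strict;
use warnings;

# Hypothetical helper: group processor values by chunk label.
# Returns (\@order, \%values_for); @order keeps first-seen chunk order.
sub group_by_chunk {
    my (@lines) = @_;
    my (%values_for, @order);
    for my $line (@lines) {
        # Lines that are not "label number" (e.g. the header) are skipped.
        my ($chunk, $num) = $line =~ m{^\s*(\w+)\s+(\d+)\s*$} or next;
        push @order, $chunk if !exists $values_for{$chunk};
        push @{ $values_for{$chunk} }, $num;
    }
    return (\@order, \%values_for);
}

# Assumed sample input, one record per line.
my @sample = ('Chunk Processor', '1 1', '1 2', 'f 1', 'f 2', 's 6', 'f 3', 'g 2');

my ($order, $values_for) = group_by_chunk(@sample);
for my $chunk (@$order) {
    my $total = 0;
    $total += $_ for @{ $values_for->{$chunk} };    # placeholder calculation
    print "$chunk: $total\n";
}
```

    With this layout the "f 3" row is added to the earlier "f" rows instead of starting a new group.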
Re: Confusion on how to place while loop!
by toolic (Bishop) on Jan 28, 2010 at 02:02 UTC
    I am confused as to the structure of your data file.

    Please edit your node, using code tags as described in Writeup Formatting Tips. The small sample of data you showed is all on one line. Does that represent a single line of your data file? Help us to help you.

    Show as small a sample of your data file as possible (maybe 10 lines, and 10 columns, or so).

    There are a few ways to read multiple lines in at a time for processing, but without knowing more details, it is hard to give more detailed advice.
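    As one illustration, if the sample really is a single whitespace-separated line, the tokens could be consumed two at a time. This is a sketch under that assumption only; the helper name pairs_from_line is hypothetical, and it assumes any "Chunk Processor" header has already been removed.

```perl
use strict;
use warnings;

# Hypothetical helper: split one whitespace-separated line into
# [chunk, processor] pairs. A trailing unpaired token is ignored.
sub pairs_from_line {
    my ($line) = @_;
    my @tokens = split ' ', $line;    # ' ' splits on runs of whitespace
    my @pairs;
    while (@tokens >= 2) {
        my ($chunk, $num) = splice @tokens, 0, 2;    # take the next two tokens
        push @pairs, [ $chunk, $num ];
    }
    return @pairs;
}

# Assumed sample input: a shortened version of the data from the question.
my @pairs = pairs_from_line('1 1 1 2 f 1 f 2 s 6');
print "$_->[0] => $_->[1]\n" for @pairs;
```

    Each pair can then be processed the same way as a line-per-record file would be.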