I have 49 very long ASCII files (hundreds of thousands of lines each). For each of these 49 files, I want to copy the first 1000 lines (1024, to be exact) to a new file (say, temp files 1 through 49). I then process those temp files, copy the second block of 1024 lines from each file to the 49 temp files, and so on until I reach the end of the long ASCII files (which all have exactly the same length).
I already have some working code, but the problem is that it is much too slow. In the following code (found on some man page), for your understanding: $span is a constant; @stations has length 49; $lower_bound and $upper_bound are both increased by 1024 on every pass of the while loop ($nr_samples = 1024). Furthermore, $length is the number of lines in each of the 49 huge ASCII files.
As you can see, for every line (the loop from $lower_bound to $upper_bound) the files have to be opened and closed, and both subs have to be called. I think there must be a faster way, so any advice would be highly appreciated :)
    while () {
        if ($cnt % $nr_samples == 0) {
            $lower_bound = $lower_bound + $nr_samples;
            $upper_bound = $upper_bound + $nr_samples;
        }
        $cnt++;
        last if $upper_bound > ($length - 1);

        foreach my $station (@stations) {
            open OUT, ">$span.$station.alpha.sac.data";
            print OUT "2 $nr_samples\n";

            for my $seeking ($lower_bound .. $upper_bound) {
                my $eval_file_2 = $suffix
                    ? sprintf "%s%s_%s_%s", $prefix, $suffix, $station, $span
                    : sprintf "%s_%s_%s", $prefix, $station, $span;

                open(FILE, "< $eval_file_2")
                    or die "Can't open $eval_file_2 for reading: $!\n";
                open(INDEX, "+>$eval_file_2.idx")
                    or die "Can't open $eval_file_2.idx for read/write: $!\n";

                build_index(*FILE, *INDEX);
                my $line = line_with_index(*FILE, *INDEX, $seeking);

                close FILE;
                close INDEX;

                chomp $line;
                my ($time, $value) = split(/\s+/, $line);
                printf OUT "%.3f %.10f\n", $time, $value;
            }
            close OUT;
        }
    }

The subroutines:

    sub build_index {
        my $data_file  = shift;
        my $index_file = shift;
        my $offset     = 0;

        while (<$data_file>) {
            print $index_file pack("N", $offset);
            $offset = tell($data_file);
        }
    }

    sub line_with_index {
        my $data_file   = shift;
        my $index_file  = shift;
        my $line_number = shift;

        my $size;      # size of an index entry
        my $i_offset;  # offset into the index of the entry
        my $entry;     # index entry
        my $d_offset;  # offset into the data file

        $size     = length(pack("N", 0));
        $i_offset = $size * ($line_number - 1);
        seek($index_file, $i_offset, 0) or return;
        read($index_file, $entry, $size);
        $d_offset = unpack("N", $entry);
        seek($data_file, $d_offset, 0);
        return scalar(<$data_file>);
    }
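Because the loop above calls build_index for every single requested line, each lookup re-reads the whole data file from the start. Since the blocks are consumed strictly in order, one possible alternative is to open each input file once and simply keep reading the next 1024 lines from the same handle, with no index at all. The following is only a minimal sketch of that idea, not tested against the real data: write_next_chunk and process_in_chunks are hypothetical names, the two name callbacks and the $process callback are assumptions standing in for the file-name logic and the "processing stuff", and the header line reports the number of lines actually written (so a short final block is written rather than skipped, unlike the loop above).

```perl
use strict;
use warnings;

# Read up to $nr_samples lines from an already-open handle and write them to
# $out_path as "time value" rows, preceded by a "2 <count>" header.
# Returns the number of data lines written (0 once the input is exhausted).
sub write_next_chunk {
    my ($in, $out_path, $nr_samples) = @_;

    my @rows;
    while (@rows < $nr_samples) {
        my $line = <$in>;
        last unless defined $line;
        chomp $line;
        # split ' ' also skips leading whitespace, unlike split /\s+/
        my ($time, $value) = split ' ', $line;
        push @rows, sprintf("%.3f %.10f\n", $time, $value);
    }
    return 0 unless @rows;

    open(my $out, '>', $out_path)
        or die "Can't open $out_path for writing: $!\n";
    print {$out} "2 ", scalar(@rows), "\n", @rows;
    close($out) or die "Can't close $out_path: $!\n";
    return scalar @rows;
}

# Open every input file once, then on each pass write one chunk per station
# and call $process. $in_name and $out_name are hypothetical callbacks that
# map a station name to its input and output path.
# Returns the number of passes that produced data.
sub process_in_chunks {
    my ($stations, $in_name, $out_name, $nr_samples, $process) = @_;

    my %in;
    for my $station (@$stations) {
        my $path = $in_name->($station);
        open(my $fh, '<', $path)
            or die "Can't open $path for reading: $!\n";
        $in{$station} = $fh;
    }

    my $passes = 0;
    while (1) {
        my $written = 0;
        for my $station (@$stations) {
            $written += write_next_chunk(
                $in{$station}, $out_name->($station), $nr_samples);
        }
        last unless $written;   # all inputs are exhausted
        $process->();           # placeholder for the per-block processing
        $passes++;
    }

    close($_) for values %in;
    return $passes;
}
```

With the variables from the question, a call might look like process_in_chunks(\@stations, sub { "${prefix}_$_[0]_$span" }, sub { "$span.$_[0].alpha.sac.data" }, $nr_samples, \&do_processing), where do_processing is again a hypothetical name. Each input file is then read exactly once from start to finish, and only 49 input handles stay open at a time.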
In reply to Accessing files at certain line number by Utrecht