My large text files look like this:

    @HWUSI-EAS1734_0032_FC620F7AAXX:5:1:18184:1176#CGATGT/1
    GGATTTCTCGTGGANACCATTTGTTGGTCAANNNNNNNNNNGTGTTNGNCTTCANNGNNATTGAAAATGNTCATTCGTGGCTATTTTCGCNNNNNATNNNN
    +HWUSI-EAS1734_0032_FC620F7AAXX:5:1:18184:1176#CGATGT/1
    gggfggggfgeeecB```^]gffgegadcgBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    @HWUSI-EAS1734_0032_FC620F7AAXX:5:1:1934:1185#CGATGT/1
    GTCATCCTTAATTANCGTATGTGCTCTTCCTNCNNNNNNNNGCTGCTANTTATTTCTNNGCAGCTTTGCTCTTATTAGTTACGAACATGCCNNNNTANNNN
    +HWUSI-EAS1734_0032_FC620F7AAXX:5:1:1934:1185#CGATGT/1
    acdad`^ddd^aa^B_\VZZfcfccaffBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    ..........

Every 4 lines form a block, and all the blocks are alike. All the even lines are the same length. I need to extract random lines from these files, so I build an index file for each large text file, with a script like this:

    if (-e "$ARGV[0].idx") {
        open (INDEXFQ1, "$ARGV[0].idx") or die $!;
    }
    else {
        open (INDEXFQ1, "+>$ARGV[1].idx") or die $!;
        build_index(*FQ1, *INDEXFQ1);
    }

The problem is that whenever I print lines with large line numbers, the output is defective. The print code is:

    print OQ10_1 line_with_index(*FQ1, *INDEXFQ1, $line);

There is no error message, but lines with large line numbers come out defective, like this:

    741:20058#ATCACG/1
    GTTCGTGAGAGCTCTAGGTTGTCGTCTCCCAGTCAACTATGGTCGCTGTAACGCGCTGACTT
    41:20058#ATCACG/1
    dgggg_ddadbaggedbXdd]^[UVYX]XR_BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

Can anybody help? Thank you!

Sorry, here are the two subs, build_index and line_with_index:

    sub build_index {
        my $data_file  = shift;
        my $index_file = shift;
        my $offset     = 0;

        while (<$data_file>) {
            print $index_file pack("N", $offset);
            $offset = tell($data_file);
        }
    }

    sub line_with_index {
        my $data_file   = shift;
        my $index_file  = shift;
        my $line_number = shift;

        my $size;
        my $i_offset;
        my $entry;
        my $d_offset;

        $size = length(pack("N", 0));
        $i_offset = $size * ($line_number-1);
        seek($index_file, $i_offset, 0) or return;
        read($index_file, $entry, $size);
        $d_offset = unpack("N", $entry);
        seek($data_file, $d_offset, 0);
        return scalar(<$data_file>);
    }
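One thing that may be worth checking: pack("N", ...) stores an unsigned 32-bit value, so any byte offset of 4 GiB or more cannot fit in an index entry. A wrapped offset would land at an arbitrary byte in the data file, which would be consistent with lines that start mid-header as shown above. The sketch below is only an illustration under that assumption, not a confirmed diagnosis. The names build_index64 and line_with_index64 are hypothetical, and the "Q>" format assumes a perl built with 64-bit integer support.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical 64-bit variants of build_index / line_with_index.
# Assumption: this perl supports 64-bit integers (required by "Q").
my $FMT  = "Q>";                      # unsigned 64-bit, big-endian
my $SIZE = length(pack($FMT, 0));     # 8 bytes per index entry

sub build_index64 {
    my ($data_fh, $index_fh) = @_;
    binmode($index_fh);               # the index holds raw bytes
    my $offset = 0;
    while (<$data_fh>) {
        # Record where the line just read started, then note the next start.
        print {$index_fh} pack($FMT, $offset);
        $offset = tell($data_fh);
    }
}

sub line_with_index64 {
    my ($data_fh, $index_fh, $line_number) = @_;
    seek($index_fh, $SIZE * ($line_number - 1), 0) or return;
    my $entry;
    my $got = read($index_fh, $entry, $SIZE);
    return unless defined $got && $got == $SIZE;   # past the end of the index
    seek($data_fh, unpack($FMT, $entry), 0) or return;
    return scalar(<$data_fh>);
}

# Small self-check on temporary files; real input would be the large files.
my ($data_fh)  = tempfile(UNLINK => 1);
my ($index_fh) = tempfile(UNLINK => 1);
print {$data_fh} "\@read1\nACGT\n+read1\nhhhh\n";
seek($data_fh, 0, 0);
build_index64($data_fh, $index_fh);
print line_with_index64($data_fh, $index_fh, 2);   # prints "ACGT\n"
```

If this is the cause, an existing .idx file written with "N" would have to be deleted and rebuilt, since its entries are 4 bytes wide and the large offsets in it are already lost.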
In reply to index for large text file by cafeblue