frednc_2014 has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to split a large file into an array of records using $/. My split.pl is the following:

    #!/usr/bin/perl -w
    $input_blast = $ARGV[0];
    $/ = "\nQuery=\s+";
    open (IN, "$input_blast");
    while (<IN>) {
        chomp;
        @blastblock = split(/Query=/, $_);
    }
    $total_number = 0;
    foreach $blastresult (@blastblock) {
        next if ($blastresult =~ /^BLASTN/);
        $total_number++;
        print "$total_number ----------------------$blastresult\n";
    }

Part of the input file is like:

    Database: NLngoRT_WT_PCRDB
    1 sequences; 481 total letters
    Query= M01133:26:000000000-A6UCG:1:1101:22656:1128 1:N:0:1+@M01133:26:000000000-A6UCG:1:1101:22656:1128 2:N:0:1
    Length=501
    ... (more content here, I deleted them so that this message is small)
    Query= M01133:26:000000000-A6UCG:1:1101:22656:1130 1:N:0:1+@M01133:26:000000000-A6UCG:1:1101:22656:1130 2:N:0:1
    Length=501
    ... (more content here, I deleted them so that this message is small)
    -------------
This is a huge file: it has 72320825 lines. I set the record separator to "Query= " and then split the file into an array. When I ran my split.pl on it, I got this error message: "Split loop at ../DIR_TEST/split.pl line 12, <IN> chunk 1." It did not produce an array. However, when I created a smaller file using "head -54725379 >small.txe", it worked fine. When I added one more line using "head -54725380 >small_plus_1.txe", I got the same error message as with the original file. I am not sure what causes this, but it seems to be related to the split function. Thanks.
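
One plausible reading of the symptoms: $/ is always a literal string, never a pattern, and "\s" inside double quotes is just the letter "s", so a separator written as "\nQuery=\s+" would not be expected to occur in the data. The "chunk 1" in the error message is consistent with the whole file arriving as a single record, which split then has to process in one go. The following is a minimal sketch of handling one record at a time instead. It assumes records are separated by a newline followed by "Query= " and that the file starts with a BLASTN header; count_blast_records is a hypothetical helper name, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the query records in a BLAST report by reading
# one record at a time, so the whole file is never held in a single string.
sub count_blast_records {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!";

    # Assumption: every record starts with "Query= " at the beginning of a line.
    # $/ is compared literally, so no regex metacharacters are used here.
    local $/ = "\nQuery= ";

    my $total = 0;
    while (my $record = <$in>) {
        chomp $record;                    # strips a trailing "\nQuery= " if present
        next if $record =~ /^BLASTN/;     # skip the header block before the first query
        $total++;
        # Per-record processing would go here, for example:
        # print "$total ----------------------$record\n";
    }
    close $in;
    return $total;
}

# Print the record count when a file name is given on the command line.
print count_blast_records($ARGV[0]), "\n" if @ARGV;
```

With this layout no call to split is needed, because each pass through the loop already holds exactly one record. It could be run as "perl count_records.pl report.txt", where both file names are placeholders.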

Re: input record separator and split
by toolic (Bishop) on May 28, 2014 at 19:44 UTC

Re: input record separator and split
by Laurent_R (Canon) on May 28, 2014 at 21:21 UTC

Re: input record separator and split
by taint (Chaplain) on May 29, 2014 at 02:07 UTC
by frednc_2014 (Initiate) on May 29, 2014 at 13:28 UTC

Re: input record separator and split
by Lotus1 (Vicar) on May 31, 2014 at 15:34 UTC