dannyjmh has asked for the wisdom of the Perl Monks concerning the following question:
Hey there, monks. I think I'm having a buffering issue. I need to read and parse big text files (which I created myself in earlier lines of the same code) and finally print the results to another file. At some point, after reading a file with 90855 lines, the script does not read a line of the next file completely. I counted the number of characters read up to the point where this happens (233467), and therefore tried to flush the buffer and sleep before reading the next line of the file. It doesn't work. Any suggestions, please? Thanks a lot. The relevant part of the code follows:
for my $o (0..1){
    if ($o==0){
        @files = reverse <*_SITES_3utr>;
    }else{
        @files = reverse <*_SITES_cds>;
    }
    undef(%pita_sites_nu);
    undef(%pita_tot_score);
    my($comp_p);
    undef(%allowed_wobbles); #undef(%site_nu);
    foreach $i(@files){
        my $buff=0;
        print "Analyzing $i\n";
        sleep(1);
        $program= $1 if $i=~ /(\w+)_SITES/;
        open(FIL, $i) or die "$!: $i\n";
        while(<FIL>){
            $buff += length($_);
            if ($buff >= 230000){$buff=0;sleep(1);select((select(FIL), $|=1)[0]);} #FLUSH THE BUFFER, NOT WORKING!!!
            undef($a);
            unless($.== 1){
                if ($o==0){
                    if (/^\d+\t(\S+)\t(\S+)\t(\d+)\t(\d+)\t(\S+)\t(\S+)\t(.*)/){
                        $mirna= $1; $target= $2; $start= $3; $end= $4; $site= $5; $comp_p= $6; $a= $7;
                        $j= "${mirna}_${target}_${start}_$end";
                        $site_nu{$j}= "$mirna\t$target\t$start\t$end\t$site\t$comp_p"; #Store each site in a hash
                    }else{die "$buff characters, in line $.:$_\n"} #DIES HERE!!!
                }else{
                    if (/^\d+\t(\S+)\t(\S+)\t(\d+)\t(\d+)\t(\S+)\t(.*)/){
                        $mirna= $1; $target= $2; $start= $3; $end= $4; $site= $5; $a= $6;
                        $j= "${mirna}_${target}_${start}_$end";
                        $site_nu{$j}= "$mirna\t$target\t$start\t$end\t$site"; #Store each site in a hash
                    }
                }
It dies at the "DIES HERE!!!" die, after reading 3413 characters of the second file. This happens because the regex doesn't match, since only half of the line is in $_. Help please! Thanks again.
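A minimal sketch of one way this symptom can arise, assuming (the post does not confirm this) that the input files were written earlier in the same script through a filehandle that had not been closed yet. Note that `$|` only controls flushing of output handles, so setting it on a handle opened for reading does not change what is on disk. The helper `count_complete_lines` and the sample line layout are hypothetical, made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $n tab-separated lines to a temp file, then
# read the file back either after or before closing the write handle.
# Returns the number of lines that were read with a trailing newline.
sub count_complete_lines {
    my ($n, $close_first) = @_;
    my ($out, $path) = tempfile(UNLINK => 1);

    # Made-up line layout, loosely modelled on the fields in the post.
    print {$out} join("\t", $_, "mirna$_", "target$_", 1, 20), "\n" for 1 .. $n;

    # Closing the write handle pushes any buffered output to disk.
    if ($close_first) {
        close($out) or die "close failed: $!";
    }

    open(my $in, '<', $path) or die "$!: $path\n";
    my $complete = 0;
    while (my $line = <$in>) {
        # A line cut off by unflushed output has no trailing newline.
        $complete++ if $line =~ /\n\z/;
    }
    close($in);
    close($out) unless $close_first;
    return $complete;
}

printf "writer closed before reading: %d complete lines\n",
    count_complete_lines(5000, 1);
printf "writer still open:            %d complete lines\n",
    count_complete_lines(5000, 0);
```

If the second count comes out lower than the first, the tail of the file was still sitting in the writer's buffer when the reader opened it. In that case, calling `close` on the write handle (or enabling autoflush on it) before reopening the file for reading is the thing to try.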
Re: buffering issue?
by moritz (Cardinal) on Mar 31, 2013 at 13:42 UTC
by dannyjmh (Novice) on Mar 31, 2013 at 17:52 UTC
by choroba (Cardinal) on Apr 02, 2013 at 09:34 UTC