supriyoch_2008 has asked for the wisdom of the Perl Monks concerning the following question:
Hi Perl Monks,
I am a beginner in Perl programming. I have written a Perl program that counts the number of bases in a DNA molecule from a text file, and I run it from the command prompt in Windows XP. The program works well with small text files containing DNA sequence data and gives correct results.
But when I tried to count the bases in a 299 MB text file, the program printed “Out of memory!” in cmd after nearly 2 minutes. One Perl monk suggested that I use Tie::File to solve this problem. I looked up the syntax of Tie::File on the internet but could not work out how to use it in my program, where the filename is read with the <STDIN> input operator and assigned to a scalar variable, as in $DNAfilename=<STDIN>;
I have given the Perl program below. May I ask for help from any Perl monk in sorting out this problem, either with Tie::File or with another method, so that I can analyze a 299 MB text file? My laptop has 2 GB of RAM and runs ActivePerl 5.10.1 Build 1007.
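A minimal sketch of how Tie::File could be combined with a filename held in a scalar. Tie::File is a core module that presents a file as an array of lines fetched from disk on demand. The helper name count_with_tie_file and the temporary sample file are assumptions made for illustration, not part of the posted program.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempfile);
use Tie::File;

# count_with_tie_file is a hypothetical helper name used for this sketch.
sub count_with_tie_file {
    my ($filename) = @_;

    # Tie the file to an array: each element is one line, read on demand.
    # O_RDONLY keeps the file from being created or modified.
    tie my @lines, 'Tie::File', $filename, mode => O_RDONLY
        or die "Cannot tie \"$filename\": $!\n";

    my %n = (A => 0, T => 0, G => 0, C => 0, other => 0, total => 0);
    for my $i (0 .. $#lines) {
        my $line = $lines[$i];          # a copy of one line only
        next unless defined $line;
        $line =~ s/\s+//g;              # remove whitespace
        # tr/// with an empty replacement counts matches without changing $line
        $n{A}     += ($line =~ tr/Aa//);
        $n{T}     += ($line =~ tr/Tt//);
        $n{G}     += ($line =~ tr/Gg//);
        $n{C}     += ($line =~ tr/Cc//);
        $n{other} += ($line =~ tr/ATGCatgc//c);   # /c counts everything else
        $n{total} += length $line;
    }
    untie @lines;
    return \%n;
}

# Demo on a small temporary file (made-up sample data).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "ATGC atgc\nNNAT\n";
close $tmp_fh or die "Cannot close temporary file: $!\n";

my $n = count_with_tie_file($tmp_name);
print "Number of bases: $n->{total}\n";
print "A=$n->{A}; T=$n->{T}; G=$n->{G}; C=$n->{C}; Errors(N)=$n->{other}\n";
```

With a name typed at the prompt, the call would be count_with_tie_file($DNAfilename) after the usual $DNAfilename=<STDIN>; chomp $DNAfilename;. As far as I know, Tie::File also records an offset for every line it has seen, so on a file with millions of lines its own bookkeeping is not free, and a plain read loop is likely to be both simpler and faster for a one-pass count.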
Can I use a statement like %mem=300 MW; in Perl? This kind of directive is used by some theoretical and physical chemists in specific C-based programs for quantitative structure-activity relationship (QSAR) studies of biomolecules to solve memory problems.
#!usr/bin/perl
print "\n\nPlease type the filename of the DNA sequence data: ";
$DNAfilename=<STDIN>;
chomp $DNAfilename;
unless ( open(DNAFILE, $DNAfilename) ) {
    print "Cannot open file \"$DNAfilename\"\n\n";
    exit;
}
while(@DNA= <DNAFILE>) {
    $DNA=join('',@DNA);
    close DNAFILE;
    # Remove whitespace
    $DNA=~ s/\s//g;
    # Remove whitespace
    $DNA=~ s/\s//g;# Line 15
    # Count number of bases
    $b=length($DNA);
    print "\nNumber of bases: $b.\nDoes the value tally with GenBank record? If yes,continue.";
    # Count number of each base and nonbase
    $A=0;$T=0;$G=0;$C=0;$e=0; # Line 20
    while($DNA=~ /A/ig){$A++}
    while($DNA=~ /T/ig){$T++}
    while($DNA=~ /G/ig){$G++}
    while($DNA=~ /C/ig){$C++}
    while($DNA=~ /[^ATGC]/ig){$e++} # Line 25
    print "\nA=$A; T=$T; G=$G; C=$C; Errors(N)=$e.\n.";
}
exit;
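The out-of-memory error most likely comes from @DNA=<DNAFILE>, which reads every line of the file into an array, followed by join, which builds a second full copy as one string. A minimal sketch of a line-by-line alternative follows, assuming the sequence file is broken into lines of modest length. The helper name count_bases and the in-memory sample are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_bases is a hypothetical helper name used for this sketch.
# It reads from an already-open filehandle one line at a time,
# so only a single line is held in memory at any moment.
sub count_bases {
    my ($fh) = @_;
    my %n = (A => 0, T => 0, G => 0, C => 0, other => 0, total => 0);
    while (my $line = <$fh>) {
        $line =~ s/\s+//g;              # remove whitespace, including the newline
        # tr/// with an empty replacement counts matches without changing $line
        $n{A}     += ($line =~ tr/Aa//);
        $n{T}     += ($line =~ tr/Tt//);
        $n{G}     += ($line =~ tr/Gg//);
        $n{C}     += ($line =~ tr/Cc//);
        $n{other} += ($line =~ tr/ATGCatgc//c);   # /c counts everything else
        $n{total} += length $line;
    }
    return \%n;
}

# Demo on an in-memory filehandle (made-up sample data).
my $sample = "ATGC atgc\nNNAT\n";
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!\n";
my $n = count_bases($fh);
close $fh;

print "Number of bases: $n->{total}\n";
print "A=$n->{A}; T=$n->{T}; G=$n->{G}; C=$n->{C}; Errors(N)=$n->{other}\n";
```

For a real file, the handle would come from open my $fh, '<', $DNAfilename after the name has been read from <STDIN> and chomped. If the whole sequence sits on one enormous line, reading by line would still load everything at once; reading fixed-size blocks with read would be one way around that.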
I have given the cmd output below.
Microsoft Windows XP Version 5.1.2600
(C) Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\user>cd d*
C:\Documents and Settings\user\Desktop>m.pl
Please type the filename of the DNA sequence data: manjur.txt
Out of memory!
C:\Documents and Settings\user\Desktop>
I am ever grateful to perl monks for their quick reply with suggestions.
Replies are listed 'Best First'.
Re: Seeking help for using Tie : : File in my perl program for counting bases using Active Perl 5.10.1 Build 1007
by GrandFather (Saint) on Jan 30, 2012 at 10:32 UTC
Re: Seeking help for using Tie : : File in my perl program for counting bases using Active Perl 5.10.1 Build 1007
by rovf (Priest) on Jan 30, 2012 at 08:57 UTC
Re: Seeking help for using Tie : : File in my perl program for counting bases using Active Perl 5.10.1 Build 1007
by Anonymous Monk on Jan 30, 2012 at 06:47 UTC