Hi, basically I have written a script that reads in a file full of entries and makes either a present or an absent call for each one, based on the constituents of the entry. If a particular entry is EVER called present, it is subsequently always output as present. To store the data I was using a DBM::Deep hash, with each key paired to a 2-element anonymous array (one element for the present count and one for the absent count). The script works fine with small numbers of entries. However, with large numbers of entries (approx. 50,000), once the counts reach around 3 or 4 they begin to reset to 1. I tried an alternative version that uses 2 separate hashes (1 for present counts and 1 for absent counts), but it's too slow to use. Can anyone suggest why the original version's counts reset, or an alternative way I could implement the code?

This is what it looks like:

#!/usr/bin/perl

use DBM::Deep;
use Getopt::Std;

# Check for pre-existing output files and die if they exist
if (-e "present"|| -e "absent") {
    die "Remove existing ouput files before running script!";
}

# Define command syntax for output to screen in case of user error
my $syntax = "\nCommand Syntax: \n\nihcrdb -i <input filename> -b <background>\n\n";

# Define hash for storage of command line arguments and define single-letter switches to accept
my %arghash = ();
getopts("i:b:", \%arghash);

# If all necessary arguments are not defined on command line, die with error message and syntax output to screen
unless (defined ($arghash{i} && $arghash{b})) {
    die "Insufficient commmand line arguments supplied! Quitting...\n$syntax";
}

# Define input file, output file and blast database = assign relevant arghash values to them
(my $input, my $background) = ($arghash{i}, $arghash{b});

# Define scalar variable to hold ref to Deep DB
my $db = new DBM::Deep "CRDB";

# Get hash from DB
my %pahash = %{$db->{hash}};

# Open input file or die
open (INPUT, $input) or die "Cannot open infile!$!";

# Enter while loop for file parse
while (<INPUT>) {

    # Skip header and Affy control lines
    next if (/^\s*$/) || (/^Gene/) || (/^AFFX/) || (/^2000/);

    # Split line on tabs, assign to array and chomp
    chomp (my @linearray = split "\t", $_);

    # Extract 3 required values
    my $name = shift @linearray;
    my $signal = shift @linearray;
    my $affycall = shift @linearray;

    # Increment Present count fot sequence if above bg and present else increment absent count
    if ($affycall eq "P" && $signal>$background) {
        $pahash{$name}->[0]++;
    }
    else {
        $pahash{$name}->[1]++;
    }
}

# Open present and absent output files
open (PRESENT, ">present");
open (ABSENT, ">absent");

# Print sequence name and number of calls to output files. Output as present if EVER called present
foreach my $key (sort keys %pahash) {
    if (defined $pahash{$key}->[0]) {
        print PRESENT "$key\t$pahash{$key}->[0] present calls";
        if (defined $pahash{$key}->[1]) {
            print PRESENT "\t$pahash{$key}->[1] absent calls\n";
        }
        else {print PRESENT "\n";}
    }
    else {
        print ABSENT "$key\t($pahash{$key}->[1] absent calls)\n";
    }
}

# Reassociate updated hash with stored DB
$db->{hash} = \%pahash;

# Close all files and exit
close INFILE;
close PRESENT;
close ABSENT;
exit;
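
For reference, the counting rule on its own (an entry is output as present if it is EVER called present) can be sketched with an ordinary in-memory hash and no DBM::Deep at all. This is only a rough sketch and not a fix for the reset problem: the tally_calls name, the sample lines, and the background value of 100 are all made up for illustration, and it assumes the whole table fits in memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): tally present and
# absent calls per entry name from tab-separated "name, signal, call" lines.
sub tally_calls {
    my ($lines, $background) = @_;
    my %counts;    # name => [present count, absent count]
    for my $line (@$lines) {
        chomp(my $copy = $line);
        # Skip blank lines, the header and Affy control lines
        next if $copy =~ /^\s*$/ || $copy =~ /^(?:Gene|AFFX|2000)/;
        my ($name, $signal, $call) = split /\t/, $copy;
        $counts{$name} ||= [0, 0];
        if ($call eq 'P' && $signal > $background) {
            $counts{$name}[0]++;    # present and above background
        }
        else {
            $counts{$name}[1]++;    # everything else counts as absent
        }
    }
    return \%counts;
}

# Made-up sample data, for illustration only
my @sample = (
    "probeA\t250\tP\n",
    "probeA\t50\tA\n",
    "probeB\t80\tP\n",
    "probeB\t20\tA\n",
);
my $counts = tally_calls(\@sample, 100);

# An entry is reported as present if it was EVER called present
for my $name (sort keys %$counts) {
    my ($present, $absent) = @{ $counts->{$name} };
    my $status = $present > 0 ? 'present' : 'absent';
    print "$name\t$status\t$present present calls\t$absent absent calls\n";
}
```

Starting both counts at 0 means the output step can test the present count directly rather than relying on defined.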

In reply to DBM::Deep Problems/Alternatives by travisbickle34
