I have a file that looks like this:
>34513
-------------------------------MVAIIFDMDGVLYRG
-----N-RAIPGVRELIEF-------LKE-R--------G------
>22476
------------------------------ALKAVLVDLNGTLHI-
--------AVPGAQEALKR---------------------------
>56832
------MARCERLRGA-----ALRDVLG--RAQGVLFDCDGVLWNG-
----E-RAVPGAPELLER-------LAR-------------------
>12543
---------------------------E--QFDILLLDLDGVVYVG-
----D-RLLPGARRALRR----------------------------G
>29078
---------------------------------AVLFDIDGVLVLS-
----W-RAIPGAAETVRQ-------LTH-R--------G--------
For now, I'm only interested in the 'headers' (that is, the lines starting with '>'). I would like to place each of these in a hash and increment a count as each one is read.
So, for example, the header '>34513' would have a count value of '1', the header '>12543' a count value of '4', and so forth.
This is what I've done so far.
#!/usr/local/bin/perl
use strict;
use English;
use Data::Dumper;
use UNIVERSAL qw(isa);
use FileHandle;
use Exception;
my $alignment = shift;
if (!$alignment || ! -e $alignment) {
    die new Exception("couldnt open names file $alignment $!");
}
warn "# Reading alignment data";
my $alignData = getAlignData($alignment);
warn "# Got data: ".scalar (keys %$alignData);
#################################################
sub getAlignData {
    my ($fIn) = @ARG;
    my $fh = new FileHandle($fIn)
        or die "";
    my $count = 0;
    my $hData = {};
    while (my $line = $fh->getline)
    {
        my @cols = split /\s+/, $line;
        # search only for lines with identifier
        my $field = $cols[0];
        my $test = substr($field, 0, 1);
        if("$test" eq ">")
        {
            $count++;
            my $hEntry = {
                'identifier' => $cols[0],
                'line' => $count,
            };
            my ($record) = sort ($hEntry->{identifier});
            $hData->{$record} = $hEntry;
        }
    }
    foreach my $k ( keys %{$hData} )
    {
        printf "%s -> %s\n", $k, $hData->{$k};
    }
    return $hData;
}
However, when I try to print out the hash, I get the following:
>34513 -> HASH(0x87a3a40)
>22476 -> HASH(0x87a3980)
>56832 -> HASH(0x8762380)
>12543 -> HASH(0x87a3940)
>29078 -> HASH(0x8892b30)
Can anyone please tell me what I may be doing wrong? Thanks in advance.
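For what it's worth, the HASH(0x...) strings are what Perl prints when a hash reference is used as a string: each value stored in $hData is a reference to the inner hash built in $hEntry, not the count itself. With the structure as posted, printing $hData->{$k}{line} instead of $hData->{$k} should show the number, and print Dumper($hData) (Data::Dumper is already loaded) would show the whole nested structure. Below is a minimal sketch of an alternative, assuming the goal is only to map each header to its position among the headers. The sub name count_headers and the abbreviated in-memory sample data are hypothetical, used here just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map each '>' header line to its ordinal
# position among the headers read from the given filehandle.
sub count_headers {
    my ($fh) = @_;
    my %position;
    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /^>/;      # keep only header lines
        $position{$line} = ++$count;    # store the number itself, not a hash ref
    }
    return \%position;
}

# Abbreviated sample data modelled on the file shown above (assumption:
# the sequence lines are shortened, only the headers matter here).
my $sample = join "\n",
    '>34513', '-----MVAIIFDMDGVLYRG',
    '>22476', '-----ALKAVLVDLNGTLHI-', '';
open my $in, '<', \$sample or die "cannot open in-memory file: $!";
my $headers = count_headers($in);
close $in;

# Sort by the stored count so the output follows file order,
# since plain 'keys' returns hash keys in no particular order.
for my $id (sort { $headers->{$a} <=> $headers->{$b} } keys %$headers) {
    printf "%s -> %d\n", $id, $headers->{$id};
}
```

With the two sample headers this prints '>34513 -> 1' followed by '>22476 -> 2'. To run it on a real file, open the file for reading and pass that filehandle to count_headers in place of the in-memory one.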