Apologies - I was trying to explain as clearly as possible, but have obviously failed! ;)
Anyway, this goes back to a previous writeup on finding microsatellites in genome files. Having solved the problem of finding them, I now need to count them and sort them into categories, viz.
- Which genome file they were found in ($b).
- How long the repeating motif is, e.g. 3 units ($c).
- What the repeating motif is, e.g. ATT ($d).
- How many repeating motifs there are, e.g. (ATT)6 ($e).
There are various outputs I need, but the first would be a file for each genome showing each unique motif ($d) and how many of them were found for a particular length, e.g.
units | A | AT | GT | ATT | ...etc.
11    | 1 | 0  | 2  | 3   | ...
12    | 0 | 1  | 1  | 4   | ...
This shows that I've found one case of an A that is 11 units long, no ATs of the same number of units, but two GTs of 11 units, etc.
Does this make any sense?
Are you all set now? Sounds like kvale's explanation of how to dereference your hashes and arrays in a loop was what you were looking for.
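For anyone following along, here is a minimal sketch of that kind of loop. It is not kvale's code; the record layout, the file name 'genome1.fa', and the helper table_for are all made up for illustration. I'm assuming each hit is stored as (genome file, motif length, motif, repeat count), i.e. your ($b, $c, $d, $e), and that the "units" column in your table is the repeat count. It needs Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# Hypothetical records: [genome file, motif length, motif, repeat count],
# standing in for ($b, $c, $d, $e) from the original post.
my @hits = (
    [ 'genome1.fa', 1, 'A',   11 ],
    [ 'genome1.fa', 2, 'GT',  11 ],
    [ 'genome1.fa', 2, 'GT',  11 ],
    [ 'genome1.fa', 2, 'AT',  12 ],
    [ 'genome1.fa', 3, 'ATT', 12 ],
);

# Tally into a hash of hashes: $count{genome}{units}{motif}.
# %motifs remembers each motif's length so columns can be ordered.
my ( %count, %motifs );
for my $hit (@hits) {
    my ( $genome, $len, $motif, $units ) = @$hit;
    $count{$genome}{$units}{$motif}++;
    $motifs{$genome}{$motif} = $len;
}

# Build the pipe-delimited table lines for one genome (hypothetical helper).
sub table_for {
    my ($genome) = @_;

    # Columns: motifs sorted by length, then alphabetically.
    my @cols = sort {
        $motifs{$genome}{$a} <=> $motifs{$genome}{$b} or $a cmp $b
    } keys %{ $motifs{$genome} };

    my @lines = ( join '|', 'units', @cols );

    # Rows: one per repeat count, with 0 where a motif was not seen.
    for my $units ( sort { $a <=> $b } keys %{ $count{$genome} } ) {
        push @lines, join '|', $units,
            map { $count{$genome}{$units}{$_} // 0 } @cols;
    }
    return @lines;
}

print "$_\n" for table_for('genome1.fa');
```

To get one file per genome you could loop over keys %count and print each table to its own filehandle instead of STDOUT. One thing to watch: $a and $b are the variables sort uses for its comparisons, so declaring my $b for the genome file can break sort blocks in the same scope. The sketch uses longer names for that reason.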
Kvale's explanation was indeed the sort of thing I was looking for. I still find it a bit fiddly to make work, so I am also looking into formatting the output of Data::Dumper to give me the sort of appearance that I'm after.
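In case it helps anyone else, this is roughly the sort of Data::Dumper tweaking I mean. It is only a sketch: the %count hash and its contents are made up, and I'm assuming a units => motif => count layout. The three settings are standard Data::Dumper configuration variables.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical counts, keyed as units => motif => count.
my %count = (
    11 => { A  => 1, GT  => 2 },
    12 => { AT => 1, ATT => 4 },
);

# Package-level settings provided by Data::Dumper.
$Data::Dumper::Sortkeys = 1;   # emit hash keys in sorted order
$Data::Dumper::Indent   = 1;   # fixed, compact indentation
$Data::Dumper::Terse    = 1;   # omit the leading "$VAR1 = "

my $dump = Dumper( \%count );
print $dump;
```

This tidies the dump up, but it still prints a nested Perl structure rather than a table, so it is probably more useful for checking the counts than for the final output.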