in reply to Data munging
Code wrapped for clarity:
perl -anle"++$h{$F[0]}[0];$h{$F[0]}[1]+=$F[1]} {print qq[$_\t$h{$_}[0]\t],$h{$_}[1]/$h{$_}[0] for keys %h" munge.txt
4	3	42
1	3	195.333333333333
3	3	27.6666666666667
2	1	20
Re^2: Data munging
by umasuresh (Hermit) on Jan 22, 2010 at 17:01 UTC

Hi Monks,

Thanks very much for all the valuable suggestions. I have another challenge at hand and would greatly appreciate any input. I have to compare a query text file against a reference text file and do the following:

1. Match the keys common to both files; each key has multiple values representing fragments of that key.
2. For each key, compare the start and end columns (column1 & column2) of each fragment.
3. For each key, if fragments overlap between the query and the reference, keep a count of the overlaps.
4. Report the query sequence with the overlap count inserted in column3, retaining all the other columns.

Note: The reference and query files are really large, running to >= 300,000 lines.

This post is really long, so please bear with me! My questions are below:

1. I feel that the code could be much simpler than what I have!
2. Again, memory is a concern for large files!

Here are the reference and query text files:
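The four steps listed in the post above can be sketched roughly as follows. This is only a minimal sketch and not any poster's code: the sample files are not reproduced in this thread, so the layout (whitespace-separated columns, with column 0 as the key, column 1 as the start and column 2 as the end) is an assumption, as is treating fragments as closed intervals. The sub names overlaps, load_reference and annotate are hypothetical. Only the reference is held in memory; the query is streamed one line at a time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed layout (not shown in the thread): whitespace-separated columns,
# column 0 = key, column 1 = start, column 2 = end, then any other columns.

# Two closed intervals overlap when each one starts before the other ends.
sub overlaps {
    my ($s1, $e1, $s2, $e2) = @_;
    return $s1 <= $e2 && $s2 <= $e1;
}

# Read the reference once into a hash: key => list of [start, end] pairs.
sub load_reference {
    my ($fh) = @_;
    my %ref;
    while (my $line = <$fh>) {
        my ($key, $start, $end) = split ' ', $line;
        next unless defined $end;    # skip blank or short lines
        push @{ $ref{$key} }, [$start, $end];
    }
    return \%ref;
}

# Return the query line with the overlap count inserted as column 3.
sub annotate {
    my ($ref, $line) = @_;
    my ($key, $start, $end, @rest) = split ' ', $line;
    # grep in scalar context yields the number of matching fragments.
    my $count = grep { overlaps($start, $end, @$_) } @{ $ref->{$key} || [] };
    return join "\t", $key, $start, $end, $count, @rest;
}

# Stream the query so that only the reference is held in memory.
if (@ARGV == 2) {
    my ($ref_file, $query_file) = @ARGV;
    open my $rfh, '<', $ref_file or die "Cannot open $ref_file: $!";
    my $ref = load_reference($rfh);
    close $rfh;
    open my $qfh, '<', $query_file or die "Cannot open $query_file: $!";
    while (my $line = <$qfh>) {
        next unless $line =~ /\S/;    # ignore blank lines
        print annotate($ref, $line), "\n";
    }
    close $qfh;
}
```

Saved under a hypothetical name such as overlap.pl, it would be run as perl overlap.pl reference.txt query.txt. Each query fragment is checked against every reference fragment for the same key, which is simple but slow for keys with many fragments; sorting the reference fragments per key would be the next step if that becomes a problem.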
by BrowserUk (Patriarch) on Jan 22, 2010 at 18:19 UTC
Why is the code you've posted so completely broken? Did you think that you could fool us by sticking use strict; use warnings; at the top and declaring some of the variables with my, and get us to solve the problems that you've failed to solve?

Tip: If you used strict and warnings from the get-go, you'd find it far easier to get your code right as you go, rather than painting yourself into a corner of undeclared and uninitialised variables.

I have a 25-line solution for what I interpret your description to mean. It produces this output from your samples:
But I cannot check whether that is correct because your code doesn't produce anything for me to compare it against :(

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
by umasuresh (Hermit) on Jan 22, 2010 at 20:41 UTC
by BrowserUk (Patriarch) on Jan 23, 2010 at 01:50 UTC
by umasuresh (Hermit) on Jan 25, 2010 at 18:46 UTC
by umasuresh (Hermit) on Jan 22, 2010 at 19:49 UTC
Your comment about me trying to fool you is unacceptable to me. Please turn off use strict at the beginning and check whether you get an answer! Everyone can make mistakes when copying and pasting, even if we check before pasting!