It ran for over 10 minutes, so I killed the process. I think the reason is it's writing data line as a hash name and the data line can have 300.000 characters. I changed it so it only reads the index of the ID and then just adds 1 to it when I need the data. Now it's done in a couple of seconds. Thanks for the tips with the index.
I think the reason is it's writing data line as a hash name and the data line can have 300.000 characters.
No, it's not.
At least, if your description of the file is accurate it isn't.
This bit of the code: $Library_Index{<$Library>} = tell(ARGV), reads the IDs and constructs the hash.
And this bit: scalar <$Library> reads and discards the long data lines.
However, now I think I see the problem with your version of the code.
This bit: until eof(); iterates until the file has been read, except that you forgot to put the filehandle $Library in the parens. So the program will never end, because it is testing the end-of-file condition of a different file, which will never be true.
Change the line to:
$Library_Index{<$Library>} = tell(ARGV), scalar <$Library> until eof($Library);
And see how long it takes.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Oops, replied in the wrong place.
I tried the new code and it works really fast. Problem is there is an error with tell and it's all -1. It would be nice if I could just have ENST04000413399 as an ID, but it does not matter that much. From Dumper
$VAR32564 = -1;
$VAR32565 = '>ENST04000413399
';
I tried the new code and it works really fast. Problem is there is an error with tell and it's all -1. From Dumper
$VAR32564 = -1;
$VAR32565 = '>ENST00400413799
';
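A note on the -1 values: tell returns -1 when it is asked about a handle that is not open, which fits the one-liner calling tell(ARGV) while the file is actually read through $Library. The sketch below is one possible way to build the index, not necessarily what was intended in the thread. It assumes the layout described above (a '>ID' line followed by exactly one data line), and the sub names build_index and fetch_record are made up for illustration. It also chomps the ID and strips the leading '>' so the hash key is just the bare ID.

```perl
use strict;
use warnings;

# Build a hash mapping each ID to the byte offset of its data line.
# Assumption: every record is a '>ID' line followed by one data line.
sub build_index {
    my ($path) = @_;
    open my $Library, '<', $path or die "Cannot open $path: $!";
    my %Library_Index;
    until ( eof($Library) ) {
        my $id = <$Library>;                    # header line, e.g. ">ENST..."
        chomp $id;                              # drop the trailing newline
        $id =~ s/^>//;                          # drop the leading '>'
        $Library_Index{$id} = tell($Library);   # offset where the data line starts
        my $skipped = <$Library>;               # read and discard the long data line
    }
    close $Library;
    return \%Library_Index;
}

# Fetch one data line by seeking straight to its recorded offset.
sub fetch_record {
    my ( $path, $index, $id ) = @_;
    return unless exists $index->{$id};
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek( $fh, $index->{$id}, 0 ) or die "Cannot seek in $path: $!";
    my $data = <$fh>;
    close $fh;
    chomp $data if defined $data;
    return $data;
}
```

Only the short ID lines end up as hash keys, so memory use stays small even when the data lines are hundreds of thousands of characters long. Each lookup then costs one seek and one line read.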