Hey, thanks for the explanation. I have a few more questions, since I am dealing with huge datasets.

1. In your opinion, is there any faster way (in terms of time) of reading the file (1,000,000 rows x 1,000,000 columns) and comparing the rows? Also, apart from the id values, every value in the matrix is either 1 or 2. If I convert the matrix into a set of binary values, will that help speed up the program?

2. If I use some other data structure, like a hash of hashes, will that improve the time taken to read the file and do the computation?

3. A summary of my code: I read the data line by line (using a while loop), transpose it, and write it to another file, because I think comparing the data row-wise will be faster than comparing it column-wise. I then open the transposed file, use your code to compare the rows, and write the results to another file. I am looking for the fastest way of doing all these operations, as the computation currently takes more than 10 minutes for 10,000 rows x 3000 columns on my computer with 8 GB of RAM.

4. Is there any way to vectorize the code, so that I compare (or take the difference between) two rows all at once rather than element by element, the way the apply command works in R?

5. If I compare the rows in windows of 100 elements, e.g. elements 1-100, then 2-101, and so on, will that help?
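
To make questions 1 and 4 more concrete, here is a minimal sketch of what I mean by the binary conversion. It assumes that "comparing" two rows means counting the positions where they differ, which may not be exactly what the real code needs, and that both rows have the same length. The names pack_row and count_differences are made up for this illustration and are not from your code.

```perl
use strict;
use warnings;

# Hypothetical helper: pack a row of 1/2 values into a bit string
# (1 becomes bit 0, 2 becomes bit 1). Assumes every value is 1 or 2.
sub pack_row {
    my @values = @_;
    return pack 'b*', join '', map { $_ - 1 } @values;
}

# Hypothetical helper: count the positions where two packed rows differ.
# The string XOR sets a bit wherever the rows disagree, and
# unpack '%32b*' returns the number of set bits.
sub count_differences {
    my ($x, $y) = @_;
    return unpack '%32b*', $x ^ $y;
}

my $row_a = pack_row(qw(1 2 2 1 1 2 1 2 2 1));
my $row_b = pack_row(qw(1 2 1 1 2 2 1 2 1 1));

# The rows differ at positions 3, 5 and 9, so this prints 3.
print count_differences($row_a, $row_b), "\n";
```

If something like this is valid, each row would take one bit per value instead of one scalar, and the comparison would be a single XOR over the whole row instead of a loop over its elements.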

Thanks a lot for the help.


In reply to Re^4: Extract the odd and even columns seperately in the hash of arrays or some other data structure apart from arrays by snape
in thread Extract the odd and even columns seperately in the hash of arrays or some other data structure apart from arrays by snape
