in reply to Extract the odd and even columns seperately in the hash of arrays or some other data structure apart from arrays

You will get better advice if you provide more information about your data and what you want to do with it. See also perldsc.

Update: here is some example code to read the whole file into a data structure (HoHoA):

use warnings;
use strict;
use Data::Dumper;

my %data;
while (<DATA>) {
    my ($id, @cols) = split;
    for my $i (0 .. $#cols) {
        my $type = ($i % 2) ? 'odd' : 'even';
        push @{ $data{$id}{$type} }, $cols[$i];
    }
}
print Dumper(\%data);

__DATA__
a 1 2 3 4 5 6
b 9 8 7 6 5 4

prints:

$VAR1 = {
          'a' => {
                   'even' => [
                               '1',
                               '3',
                               '5'
                             ],
                   'odd' => [
                              '2',
                              '4',
                              '6'
                            ]
                 },
          'b' => {
                   'even' => [
                               '9',
                               '7',
                               '5'
                             ],
                   'odd' => [
                              '8',
                              '6',
                              '4'
                            ]
                 }
        };
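
To get values back out of a nested structure like this, something along the following lines may help. It is only a minimal sketch: the %data literal just repeats the Dumper output, and the sub name summarize is made up for illustration.

```perl
use strict;
use warnings;

# Same shape as the Dumper output: id => { even|odd => [ values ] }
my %data = (
    a => { even => [ 1, 3, 5 ], odd => [ 2, 4, 6 ] },
    b => { even => [ 9, 7, 5 ], odd => [ 8, 6, 4 ] },
);

# Hypothetical helper: one summary line per id/type pair, in sorted key order.
sub summarize {
    my ($href) = @_;
    my @lines;
    for my $id (sort keys %$href) {
        for my $type (sort keys %{ $href->{$id} }) {
            my @vals = @{ $href->{$id}{$type} };   # dereference the inner array
            push @lines, "$id $type: @vals";
        }
    }
    return @lines;
}

print "$_\n" for summarize(\%data);

# A single element: the third 'even' value stored for id 'b'
print "b/even/2 = $data{b}{even}[2]\n";
```

The sort calls are there only to make the order predictable, since hash keys come back in no particular order.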

Re^2: Extract the odd and even rows seperately in the hash of arrays or some other data structure apart from arrays
by snape (Pilgrim) on Jan 24, 2010 at 03:48 UTC

    Answers to your questions are as follows:

    a. How many even columns do you have?

    b. How many odd columns do you have?

    c. Are you just comparing items on the same line? If so, there may be no need to read all lines of your large input file into memory at once.

    d. Or, do you need to compare an item on one line to items on all other lines?

    e. What are you using the 1st column for?

    Ans a-e: There are more than 1 million rows and columns. The first row holds the entity names, so I want to remove it. Then I want to compare each row with all the other rows and find the positions where they match. I am not reading the entire file at once; I use a filehandle and a while loop to read it one line at a time.

    f. Are you comparing strings or numbers? Ans f: I am comparing numbers, or taking the difference between the two.

    g. Show a small sample of your input (fewer than 10 lines, fewer than 10 columns). Creating a small sample for yourself will make it easier for you to debug your own code, so the extra effort should pay off.

    Iid A1 A2 A3 A4 A5 A6 A7 A8
    12  1  2  1  2  1  1  1  1
    12  2  1  2  2  1  1  1  1
    15  2  1  2  2  1  1  1  1
    15  2  1  2  1  1  2  1  1
    16  2  1  2  1  1  2  1  1
    16  2  1  2  2  1  1  1  1
    19  2  1  2  1  1  2  1  1
    19  1  2  1  2  1  1  1  1
    116 1  2  2  2  1  1  1  1
    116 2  1  2  1  1  2  1  1

    You will see that two rows share each name, so I would like to see the matches between rows at the same column position, i.e. the first row of 12 with both rows of 15, 16, 19 and 116. Similarly, the second row of 12 with both rows of 15, 16, 19 and 116. Thanks a lot for sharing your views.

      Please give us an "expected output".
        Iid A1 A2 A3 A4 A5 A6 A7 A8
        12  1  2  1  2  1  1  1  1
        12  2  1  2  2  1  1  1  1
        15  2  1  2  2  1  1  1  1
        15  2  1  2  1  1  2  1  1
        16  2  1  2  1  1  2  1  1
        16  2  1  2  2  1  1  1  1
        19  2  1  2  1  1  2  1  1
        19  1  2  1  2  1  1  1  1
        116 1  2  2  2  1  1  1  1
        116 2  1  2  1  1  2  1  1

        The expected output for the above data is as follows:

        Match between first row of 12 and first row of 15 is A4 to A8

        Match between first row of 12 and second row of 15 is at A5 and A7 to A8

        Similarly for all other pairs of rows, except rows that share the same Id number.
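
One possible way to produce that kind of report is sketched below. It is not a definitive solution: the helper names matching_positions and describe_runs are invented for illustration, only the first four data rows of the sample are used, and an in-memory filehandle stands in for the real input file. Rows are compared as numbers, column by column, and consecutive matching columns are collapsed into "A7 to A8" style runs.

```perl
use strict;
use warnings;

# Hypothetical helper: indices where two equal-length rows hold the same number.
sub matching_positions {
    my ($x, $y) = @_;
    return grep { $x->[$_] == $y->[$_] } 0 .. $#$x;
}

# Hypothetical helper: collapse ascending indices into runs of column labels.
sub describe_runs {
    my ($labels, @idx) = @_;
    return 'none' unless @idx;
    my @runs;
    my ($start, $prev) = ($idx[0], $idx[0]);
    # The trailing undef flushes the last run.
    for my $i (@idx[1 .. $#idx], undef) {
        if (defined $i && $i == $prev + 1) {
            $prev = $i;                          # still inside the current run
            next;
        }
        push @runs, $start == $prev
            ? $labels->[$start]
            : "$labels->[$start] to $labels->[$prev]";
        ($start, $prev) = ($i, $i) if defined $i;
    }
    return join ', ', @runs;
}

# Assumption: a small in-memory sample instead of the real file.
my $sample = <<'END';
Iid A1 A2 A3 A4 A5 A6 A7 A8
12 1 2 1 2 1 1 1 1
12 2 1 2 2 1 1 1 1
15 2 1 2 2 1 1 1 1
15 2 1 2 1 1 2 1 1
END

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";

# The first line holds the entity names; keep them as column labels.
my $header = <$fh>;
my (undef, @labels) = split ' ', $header;

my (@rows, %seen);        # each row: [ id, nth row for that id, [ values ] ]
while (my $line = <$fh>) {
    my ($id, @vals) = split ' ', $line;
    next unless defined $id;                     # skip blank lines
    push @rows, [ $id, ++$seen{$id}, \@vals ];
}
close $fh;

for my $i (0 .. $#rows - 1) {
    for my $j ($i + 1 .. $#rows) {
        my ($p, $q) = @rows[$i, $j];
        next if $p->[0] eq $q->[0];              # skip rows that share an Id
        my @idx = matching_positions($p->[2], $q->[2]);
        printf "row %d of %s vs row %d of %s: %s\n",
            $p->[1], $p->[0], $q->[1], $q->[0], describe_runs(\@labels, @idx);
    }
}
```

For the first row of 12 against the second row of 15 this reports A5, A7 to A8, in line with the expected output above. Note that the nested loop compares every pair of rows, so the work grows with the square of the number of rows; for a file as large as the one described, this approach would likely need rethinking.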