snape has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have some questions about file operations and hashes of arrays:

1. I would like to know how to do a translation/substitution operation on data stored in a hash of arrays data structure.

2. I have a file with 4 columns, where the first column acts as the id (which can be repeated) and the other three columns hold integer values.

DATA:
id col1 col2 col3
12 100 12 196
12 120 15 190
13 90 190 200
13 70 20 20
13 101 340 25
14 100 123 19
15 80 389 39

I have read the entire file into a hash of arrays. I would like to know how I can sum each row on its own, without adding in the values from other rows that have the same id. My output should look like:

Output:
id Sum
12 308
12 325
13 480
13 110
13 466
14 242
15 508

3. Is there any faster way to delete a column from the file than reading it line by line and removing the column with splice (or shift, for the first column)?

Replies are listed 'Best First'.
Re: Hash of Arrays and File Operations
by toolic (Bishop) on Jan 28, 2010 at 16:22 UTC
    I would like to know how I can sum each row on its own, without adding in the values from other rows that have the same id.
        use strict;
        use warnings;
        use List::Util qw(sum);

        my %data;
        while (<DATA>) {
            next if /^id/;                  # skip the header line
            my ($id, @cols) = split;        # first field is the id, the rest are values
            push @{ $data{$id} }, \@cols;   # keep each row as its own array ref
        }

        for my $id (sort keys %data) {
            for my $aref (@{ $data{$id} }) {
                print "$id ", sum(@{$aref}), "\n";   # one sum per row
            }
        }

        __DATA__
        id col1 col2 col3
        12 100 12 196
        12 120 15 190
        13 90 190 200
        13 70 20 20
        13 101 340 25
        14 100 123 19
        15 80 389 39
    prints:
        12 308
        12 325
        13 480
        13 110
        13 466
        14 242
        15 508

    Update:

    1. I would like to know how to do a translation/substitution operation on data stored in a hash of arrays data structure.
    In general, you will loop through your data structure using for, but, depending on exactly what you need to modify, you might be able to use map. Of course, if possible, it would be best to modify the data before you load it into your data structure in the first place.
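
A minimal sketch of the loop approach, assuming the same structure the code above builds (each id maps to an array of row array refs). The sample values and the tr/12/01/ translation are only illustrations:

```perl
use strict;
use warnings;

# Hypothetical sample data in the shape used in this thread:
# id => [ [col1, col2, col3], ... ]
my %data = (
    12 => [ [ qw(100 12 196) ], [ qw(120 15 190) ] ],
    14 => [ [ qw(100 123 19) ] ],
);

# Walk every row of every id and translate in place.
# $_ is aliased to each array element, so tr/// changes the stored value.
for my $id ( sort keys %data ) {
    for my $row ( @{ $data{$id} } ) {
        tr/12/01/ for @{$row};
    }
}

# Print the translated rows, one per line.
for my $id ( sort keys %data ) {
    print "$id @{$_}\n" for @{ $data{$id} };
}
```

With this sample data the script should print "12 000 01 096", "12 010 05 090" and "14 000 013 09". A substitution with s/// can be applied in the same position as the tr/// statement.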

      I am trying to use map, but it doesn't seem to work. I tried:

      @cols[1 .. $#cols] = map(tr /[1,2]/[0,1]/, @cols)

      Also, do you think it would be a good idea to (i) split to get the values as $id and @cols, then (ii) join the columns into a single scalar so that I can translate it, and then (iii) split again to get the array back so that I can store it in the hash of arrays? For example:

          my %data;
          while (<DATA>) {
              next if /^id/;
              my ($id, @cols) = split;       ## first split
              my $cols = join('',@cols);     ## Join
              @cols = split('',$cols);       ## and then split
              push @{ $data{$id} }, \@cols;
          }

      I am having second thoughts about this method, as it may be slower or may not be good programming practice. Please advise.

        @cols[1 .. $#cols] = map(tr /[1,2]/[0,1]/, @cols)
        I'm no tr expert, but isn't tr /[1,2]/[0,1]/ more simply written as tr/12/01/?

        The following will replace all 1's with 0's, and all 2's with 1's, in all elements of the @cols array. It modifies the array in-place.

        map { tr/12/01/ } @cols;

        Update: I concur with AnonyMonk's hint that it is better written as:

        tr/12/01/ for @cols;
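
A small sketch of why the map version earlier in this thread did not give the expected array: tr/// returns the number of characters it translated, not the translated string, so map collects counts. The sample values are hypothetical:

```perl
use strict;
use warnings;

my @cols = qw(100 12 196);

# tr/// returns how many characters it translated, not the new string,
# so collecting its result with map yields a list of counts.
my @copy   = @cols;
my @counts = map { tr/12/01/ } @copy;   # @copy is also modified as a side effect

# Statement-modifier form: same in-place change, no unused return list.
tr/12/01/ for @cols;

print "counts: @counts\n";
print "cols:   @cols\n";
```

Here @counts should hold 1, 2 and 1, while @cols becomes 000, 01 and 096. On Perl 5.14 or later, the /r flag (tr/12/01/r) returns the translated copy instead of the count, which would be the form to use if map is really wanted.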
Re: Hash of Arrays and File Operations
by umasuresh (Hermit) on Jan 28, 2010 at 16:19 UTC
    I can try the first question.
    1. I would like to know how to do a translation/substitution operation on data stored in a hash of arrays data structure.
    Follow this link for using tr/// ?node_id=820042
    Thanks to marto for pointing out the shortcut link: Node: configurable tr///?