
Is there any way I can create the hash table starting from a particular line in the CSV file?

Read past the lines you want to skip before entering your while loop.
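A minimal sketch of that idea, assuming the number of lines to skip is known up front. The helper name csv_to_hash, the skip count, and the sample data are made up for illustration; an in-memory filehandle stands in for the real CSV file.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash from CSV lines, ignoring the first
# $skip lines. It takes a filehandle so it can be tried without a file.
sub csv_to_hash {
    my ($fh, $skip) = @_;

    # Read and discard the unwanted lines before entering the main loop.
    for (1 .. $skip) {
        defined(my $discard = <$fh>) or last;    # stop early at end of file
    }

    my %hash;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;                      # strip newline and trailing whitespace
        next unless length $line;                # ignore blank lines
        my ($key, @fields) = split /,/, $line;
        $hash{$key} = \@fields;
    }
    return \%hash;
}

# Sample data (an assumption): one junk line, a header, then two records.
my $csv = "junk line\nColumn_1,Column_2\nName11,In\nName21,Out\n";
open my $fh, '<', \$csv or die "Cannot open in-memory file: $!";
my $result = csv_to_hash($fh, 2);                # skip the junk line and the header
close $fh;

print "$_ => @{ $result->{$_} }\n" for sort keys %$result;
```

With a real file, the same helper would be called with a handle opened on that file instead of the in-memory string.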

(Ps. If you can't work out how to do that for yourself, you might consider employing someone to write this code for you.)


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^4: csv to hash table
by waytoperl (Beadle) on Nov 19, 2013 at 09:10 UTC

    Update! I got working code that reads the CSV file and creates a hash table. Surprisingly, when I print and check the hash size, the script displays 13 and prints 13 keys and 13 values. I should have at least 150 keys/values in the hash table. Could anyone take a look and suggest a fix?

    Column_1,Column_2
    Name11,In
    Name21,Out
    Name31,In
    Name41,In

    use Data::Dumper;

    my $infile = 'file.csv';

    sub mainCSV {
        open (my $infile1, '<', "$infile") or die "Unable to open $infile: $!";
        my %hash = ();
        while (my $line = <$infile1>) {
            chomp $line;
            $line =~ s/\s*\z//;
            my @array = split /,/, $line;
            my $key = shift @array;
            $hash{$key} = \@array;
        }
        my $size = scalar keys %hash;   # explicit scalar context
        print "Hash size: $size\n";     # prints Hash size: xx
        print Dumper(\%hash);
    }

      Maybe you have duplicate keys?

      ...
      my $key = shift @array;
      if( $hash{ $key }) {
          warn "Duplicate key '$key'";
      };
      $hash{ $key } = \@array;
      ...

      We really need to see the output. I see a potential problem here in that you are stashing “a reference to @array” ... while that is a single variable. Hence, all of the references will be to the same block of storage, namely, @array. In Fortran parlance, all of the references are EQUIVALENCEd. They will all be seen to contain the last contents of @array, and a change to any one will be reflected in every other, because “you are actually looking at one block of storage, albeit through several mirrors.”

      I think that you need to be sure that each hash-bucket contains a unique arrayref, and that the values get pushed onto that. Perl’s “auto-vivification” feature comes in handy, with something like this:

      use strict;
      use warnings;
      use Data::Dumper;

      my @arry = ( "key", 1, 2, 3 );
      my $hash;
      my $key = shift @arry;

      # NEXT STATEMENT 'AUTOMAGICALLY' CREATES A HASH-ENTRY FOR $key
      # AND CAUSES IT TO CONTAIN AN EMPTY ARRAY IF IT
      # DOES NOT ALREADY EXIST:
      push @{ $hash->{$key} }, $_ foreach @arry;

      print Data::Dumper->Dump([ \@arry, $hash], ["arry", "hash"] );

      gives ...

      $arry = [
          1,
          2,
          3
      ];
      $hash = {
          'key' => [
              1,
              2,
              3
          ]
      };

      The foreach clause is shorthand for an equivalent loop.   $_ contains the value within each iteration.   Notice how this loop is non-destructive to the content of @arry, iterating through its values without disturbing them.   The magic works now, because we are making copies of each value and pushing those onto a new arrayref (created on-demand) within the hash-bucket for $key.   Each hash-bucket, and each of the values therein, is distinct.

      In the statement-of-interest, @{ ... } is part but not all of the magic.   Here, we are telling Perl that the value within the hash-bucket should be interpreted as / initialized to an arrayref.   Perl will automatically create a hash-entry (of course) on demand, because that is what hashes do, but here we’re declaring its type and immediately using it.   We can “auto-vivify” hashrefs, too, so that a line of code something like this ... actually Just Works™:

      $hash->{"hickory"}{"dickory"}{"dock"} = "clock";
      gives...
      $hash = {
          'hickory' => {
              'dickory' => {
                  'dock' => 'clock'
              }
          }
      };

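        "while that is a single variable"

        Not really, as the variable @array is declared with my inside the loop. Each iteration therefore gets a fresh @array, and each reference stored in the hash points to a different array.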
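
A small sketch that checks this behaviour. The two input lines are made-up sample data; the loop body mirrors the original code.

```perl
use strict;
use warnings;

my %hash;
for my $line ("a,1,2", "b,3,4") {
    my @array = split /,/, $line;   # lexical declared inside the loop body
    my $key   = shift @array;
    $hash{$key} = \@array;          # reference to this iteration's array
}

# Each key keeps its own values; nothing was overwritten by the later pass.
print "a => @{ $hash{a} }\n";
print "b => @{ $hash{b} }\n";

# References compare numerically by address, so this shows two separate arrays.
print "distinct arrays\n" if $hash{a} != $hash{b};
```

If @array were declared once outside the loop instead, both keys would share one array, which is the situation the earlier reply describes.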
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        Thanks for the input. Basically, the hash will group the values of duplicate keys. I replaced my code with the new code and commented out the warn message for duplicate keys. When I run it, I get the error "Global symbol "$hash" requires explicit package name at ./file_name.pl line 69." I don't have a hash ref named $hash pointing to the hash; to my understanding, without defining a hash ref we can't refer to the hash table or write data into it. Please correct me.

        sub mainCSV {
            # Open the CSV input file
            open (my $infile_CSV1, '<', "$infile_CSV") or die "Unable to open $infile_CSV: $!\n";
            my %hash = ();
            my $hash_ref = \%hash;
            while (my $line = <$infile_CSV1>) {
                chomp $line;
                $line =~ s/\s*\z//;
                my @array_CSV = split /,/, $line;
                my $key_CSV = shift @array_CSV;
                push @{ $hash->{$key_CSV} }, $_ foreach @array_CSV;                        # ++++ Error ++++
                print Data::Dumper->Dump([ \@array_CSV, $hash], ["array_CSV", "hash"] );   # ++++ Error ++++
                #if ($hash{$key_CSV})
                #{
                #    warn "Duplicate key '$key_CSV'";
                #};
                #$hash{$key_CSV} = \@array_CSV;
            }
            # Explicit scalar context
            my $size = scalar keys %hash;
            # Open the output file and save hash in $outfile_RX_CSV
            #open (my $outfile2, '>', "$outfile_CSV") or die "Unable to open $outfile_CSV: $!\n";
            #print $outfile2 Dumper(\%hash);
            #close $outfile2;
            #print "Stored $size list of pins in $outfile_CSV file.\n";
            # Return a reference to the hash.
            #return %hash;
        }
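
The error message comes from use strict: the code declares %hash and $hash_ref, but the two marked lines use $hash, which is a different and undeclared scalar variable. A minimal sketch of one way to resolve it, assuming the goal is to collect the values of duplicate keys under one key; the sample rows are made up.

```perl
use strict;
use warnings;

my %hash;
my $hash_ref = \%hash;

# Made-up sample rows; Name11 appears twice on purpose.
my @rows = ( "Name11,In", "Name21,Out", "Name11,Out" );

for my $line (@rows) {
    my ($key, @fields) = split /,/, $line;

    # Index the declared hash directly. Writing $hash->{...} here would
    # refer to an undeclared scalar $hash and fail to compile under strict.
    push @{ $hash{$key} }, @fields;

    # Equivalent access through the declared reference:
    # push @{ $hash_ref->{$key} }, @fields;
}

print "$_ => @{ $hash{$_} }\n" for sort keys %hash;
```

Either %hash or $hash_ref can be used, but not both names mixed with a third one; the Data::Dumper call would likewise take \%hash or $hash_ref.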