in reply to error joining hashref of an arrayref

I'm not quite sure what is going on here. Showing a bit more code would be helpful as to the overall objective.
foreach my $string ( @word ) { if ( $uword{ $string }[ 0 ] == 1 ) { push @{ $uword{ $string } }, $line; next; } $uword{ $string }[ 0 ] = 1; push @{ $uword{ $string } }, $line; } #this [0] is apparently not needed #maybe I'm missing something, but what do you think #that this does? #simplify to just... foreach my $string (@word) { push ( @{$uword{$string}}, $line ); }
Now,
@{ $uword{ $seen{ $key }[ 1 ] } # there is no closing curly for the @ # its rather strange looking but # { $seen{ $key }[ 1 ] } is the key for the hash
You've got some pretty complex syntax and maybe if you explained a bit more, some major simplifications could be made.

You are adding an unnecessary complication with this [0] =1 or not stuff. It is not necessary.

push @{$hash{$key}}, $value;
is all you need. If $hash{$key} doesn't exist prior to the push, the key will be created automatically.

Replies are listed 'Best First'.
Re^2: error joining hashref of an arrayref
by ag4ve (Monk) on Oct 30, 2010 at 16:37 UTC

    yes, i really do need to do something about this code one way or the other. it looks quite bad which means that i'm never going to want to maintain it.

    here's what i have for input

    the name of something something else cool the title given does not exactly match

    and for output, i want something like this

    something,something very nice,the name of something,something else coo +l

    where the cells are:search word,column returned from db,line(s) of input the word came from a search word may return more than one line in sql and that's fine, i'd like my csv to have the same date data for col 1, 3, and 4 and have the different result. i don't need duplicate data however, which was the point of the loops in question.

    below is my code. i've left in my original comments and added some:

    #!/usr/bin/perl ##### WHAT # # What names return per search ########## use strict; #use warnings; use DBI; my $searcher = "owner.manager, owner.owner, owner.manown"; my $dbh = DBI->connect('DBI:mysql:db;host=localhost', 'user', 'pass') or die "Database connection: $!"; open( FILE, "< $ARGV[0]" ); my %uword = (); my %seen = (); my $count = 0; my @data; my $key; my $i = 0; while ( <FILE> ) { my $line = $_; chomp ($line); my @word = split / /, $line; $count = 0; while ( $word[ $count ] ) { $word[ $count ] =~ tr/^[\-a-zA-Z]//; $word[ $count ] =~ s/\'/\\\'/g; $count++; } # deduplicate each word, but if it is a duplicate, # i still want to know what lines the word was on. foreach my $string ( @word ) { if ( $uword{ $string }[ 0 ] == 1 ) { push @{ $uword{ $string } }, $line; next; } $uword{ $string }[ 0 ] = 1; push @{ $uword{ $string } }, $line; } } # for every unique word, do a search... for my $key ( keys %uword ) { my ( $imo, $owner, $manown, $manager ); my $select = qq/SELECT $searcher /; my $from = qq/FROM owner, spd /; my $where = qq/WHERE MATCH( $searcher ) AGAINST('+ +$key' IN BOOLEAN MODE) /; my $join = qq/AND owner.number = spd.number/; my $query = $select . $from . $where . $join; print "SQL: $query\n"; my $sth = $dbh->prepare( $query ); $sth->execute; $sth->bind_columns( \$manager, \$owner, \$manown ); # since i don't know or care what field my matches came # from, take the results and put them into an array. while ( $sth->fetch ) { if ( defined( $owner ) ) { $data[$count] = $owner; $count++; } if ( defined( $manown ) ) { $data[$count] = $manown; $count++; } if ( defined( $manager ) ) { $data[$count] = $manager; $count++ } } # same general deduplication algorithm as before. # @data holds full names of data. foreach my $string ( @data ) { # dedupe data and sanity chec +k. next if !defined ($string); # should never be true. next if $seen{ $string }[ 0 ] == 1; # check %seen hash + / array for dupe $seen{ $string }[ 0 ] = 1; # define hash of array a +nd assign check to it. $seen{ $string }[ 1 ] = $key; # add word from line to +hash for reference. } } $dbh->disconnect; # this is here because i was up real late thinking # "wtf does what here".... # $key - (below) is the deduped sql string # $seen { $key }[ 1 ] - is the word searched in sql # $uword { '$seen{ $key }[ 1 ]' } - should be the line(s) of test the + search string came from for my $key ( keys %seen ) { print "$seen{ $key }[ 1 ], $key,"; print join(',', @{ $uword{ $seen{ $key }[ 1 ] }[ 1 .. $#{ @uwo +rd{ $seen{ $key } } } ] } ); print "\n"; }

      I see other problems in your code:

      • You don't use placeholders in the SQL statement.
      • You don't use prepare_cached().
      • You try to implement your own quoting.
      • The quoting code can be circumvented (just make $word[$count] false, e.g. by inserting " 0" into a line in the input file), which leads to a SQL injection vulnerability (all words following the 0 are not modified at all). Don't loop over the first few elements that are true, loop over all elements (foreach (@word) instead of while ($word[$count])).
      • You don't reset $count to 0 before the for loop, it still contains the number of non-false "words" at the beginning of the last line read. So you start filling @data at some unknown offset, and all elements of @data up to that offset are undef.
        next if !defined ($string); # should never be true.
        even shows that you are partly aware of that problem. Change next to die "should not happen" and see your script die.
      • You mess with $count and arrays where a simple push would be sufficient. Don't make your life harder then necessary by trying to write C in Perl. Get rid of $count in the for loop, use push @data instead.

      See also Re^2: Massive Memory Leak and Re: Counting rows Sqlite for the first three points.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        your first three points on security are absolutely correct. however, after reading on prepare_cached(), i don't think that's my solution. maybe i'm wrong, but it just doesn't seem much more secure to me than prepare(). and i think that doing further checking on my input and a sanity check on the final query, that i be much better. in fact, since you brought it up, i found this great node here that has tons of pointers:

        http://www.perlmonks.org/?node_id=756058

        and this discussion on sql injection

        http://unixwiz.net/techtips/sql-injection.html

        however, i think the first thing i'm going to be reading is the 'perlsec' doc mentioned at the top of that pm node. as for my code, i finally figured out how to make it more readable and usable.

        since i'm wanting items based on each word (which lines a word is in and which strings from sql a word returns) i use %data{ $word } and put all associated data on that.

        it's late, so this has issues that i need fixing, but my general idea is:

        while ( my $fields = $sth->fetchrow_arrayref ) { foreach my $field (@fields) { definition( $line, $word, $field ); } } } $dbh->disconnect; for my $word ( keys %data ) { while( my ($field, $type) = each %{ $data }{ $word } ) { print "$word,$field" if( $type eq 'field' ); while( my ($line, $type) = each %{ $data }{ $word } ) +{ print ",$line" if( $type eq 'line' ); } } } sub definition { my ($line, $word, $field) = @_; if( defined( $field ) { if( !defined( $data{ $word }{ $field } ) ) { $data{ $word }{ $field} = field; } $data{ $word }{ $line } = line; } }
      Still not sure what you are doing here. But I tried to make some simplifications in the parsing part. I didn't really look in any detail at the DB part. I think you just need a HoA (Hash of Array). Each word would have a list of line numbers that it appeared on. A "unique" word would have just one line number in its array. Past that I'm not sure what this app is trying to do.
      #!/usr/bin/perl use strict; use warnings; use DBI; my $searcher = "owner.manager, owner.owner, owner.manown"; my $dbh = DBI->connect('DBI:mysql:db;host=localhost', 'user', 'pass') or die "Database connection: $!"; open( FILE, '<', $ARGV[0] ) || die "unable to open $ARGV[0]!"; my $line_num =0; my %words = (); #hash of array, list of line numbers per #each word while ( my $line = <FILE> ) { $line_num++; foreach my $word (split (/\s+/, $line)) { $word =~ tr/^[a-zA-Z-]//; $word =~ s|(\'.*)\b||g; #maybe ?? not sure: what's->what? push (@{$words{$word}}, $line_num); } } # for every word, do a search... # a "unique word" would be one who's array of line numbers has # exactly 1 number # I don't think that is what you wanted? # I think what was meant is do one search per each word? foreach my $word (%words) { # @{words{$word}} is array of line numbers that word appeared # upon # line number appears twice if word appeared twice on that # line ... do something in DB with $word.... } # $uword { '$seen{ $word }[ 1 ]' } - should be the line(s) of test th +e search string came from #no, that would be @{words{$word}}