bpthatsme has asked for the wisdom of the Perl Monks concerning the following question:

Good Afternoon, evening, morning, and all other times of day Monks!

This is my first post here and I come to you with an issue regarding the use of a hash of arrays.

I have searched around significantly and used all of the debugging available in my arsenal to attempt to solve this issue, but remain stumped.

That said, I come to you humbly hoping for answers. The code below is part of a script that tears apart HTML which has already been converted to CSV format (that conversion is done successfully earlier in the script via regex). However, I continue to receive errors regarding uninitialized values, beginning with the first @values line in the for loop. I know this means that I have not defined the value, but I am unsure how to do that while populating an array. Any guidance will be vastly appreciated!

Best!

bp

PS- My apologies if this is a double post, I wanted to make sure the post was written under my username as opposed to an anonymous user!

my $stats = $htmlcontent;
@rows = split(/\n/,$stats);

#prints the cleaned up data after dumping it to an array
#foreach (@rows) {
#    print "$_\n";
#}

#Make sure 'good' data is pulled
if ( $rows[0] !~ /Function Name/ ) {
    $np->nagios_exit("UNKNOWN", "Can't find csv header!\n");
    exit $ERRORS{"UNKNOWN"}
}

#get number of rows after data cleanup
$rowcount = scalar(grep {defined $_} @rows);

#foreach (@values) {
#    print "$_\n";
#}
#die;

my @fields = ();
@fields = split(/\,/,$rows[0]);
@values = ();
my %stats = ();

for ( my $i = 1; $i <= $rowcount; $i++ ) {
    @values = split(/\,/,$rows[$i]);
    #print "this is row [$i] : $rows[$i]\n";
    if ( !defined($stats{$values[0]}) ) {
        $stats{$values[0]} = {};
    }
    if ( !defined($stats{$values[0]}{$values[1]}) ) {
        $stats{$values[0]}{$values[1]} = {};
    }
    for ( my $x = 2,; $x < $#values; $x++ ) {
        # $stats{pxname}{svname}{valuename}
        $stats{$values[0]}{$values[1]}{$fields[$x]} = $values[$x];
    }
}

Re: Problem populating Hash (I think?)
by kcott (Archbishop) on Feb 09, 2012 at 02:23 UTC

    If you want to exclude undefined values:

    @values = grep { defined } split(/\,/,$rows[$i]);

    If you want to set undefined values to some default value (e.g. an empty string):

    @values = map { defined($_) ? $_ : '' } split(/\,/,$rows[$i]);

    -- Ken

Re: Problem populating Hash (I think?)
by GrandFather (Saint) on Feb 09, 2012 at 02:47 UTC

    How about you mutate that into some stand alone code with a sample of the data you are trying to process so we can reproduce your issue?

    While you are doing that, there are a number of areas where it would be to your advantage to make your code less like C and more like Perl, and to clean up a few other style issues:

    • Use Perl for loops where you can: for my $row (@rows) { my @values = split /\,/, $row;
    • declare and initialise variables at the same time: my @fields = split /\,/, $rows[0];
    • Filter your array directly: @rows = grep {defined} @rows;.

    Actually the last point may be related to your undef issue. If some of the first $rowcount elements in @rows are undef you will get warnings. You will also get warnings if there are too few values in a row.
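One way to see where such a warning can come from is to run the original loop bounds against a tiny data set. The sketch below is only an illustration: the two rows are made up, and the `$SIG{__WARN__}` handler is there purely to collect warnings for inspection. It assumes `$rowcount` equals the number of elements in `@rows`, as it does when every row is defined.

```perl
use strict;
use warnings;

# Hypothetical cleaned-up data: one header row and one data row.
my @rows     = ('name,calls', 'some.function,0');
my $rowcount = scalar @rows;    # 2

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

for ( my $i = 1; $i <= $rowcount; $i++ ) {    # "<=" runs one index too far
    # On the last pass $i is 2, and $rows[2] does not exist (undef).
    my @values = split /,/, $rows[$i];
}

print scalar(@warnings), " warning(s) collected\n";
print $warnings[0] if @warnings;
```

With `$i < $rowcount`, or a `for my $row (@rows)` loop, the out-of-range read goes away. The inner loop in the question has the opposite problem: `$x < $#values` stops one short, because `$#values` is already the last valid index.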

    True laziness is hard work
Re: Problem populating Hash (I think?)
by planetscape (Chancellor) on Feb 09, 2012 at 06:08 UTC

    Parsing an HTML file is actually a fairly difficult problem.

    There are many ways to go wrong and only a few ways to go right.

    What Marshall said. I myself have had good luck using HTML::TreeBuilder.

    HTH,

    planetscape
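As a rough sketch of the parser-based route, the following pulls the table cells out with HTML::TreeBuilder instead of regexes. It assumes the module is installed from CPAN (it is not part of core Perl), and the HTML is a tidied version of the sample posted in this thread, with the header cells wrapped in a `<tr>`; the variable names are illustrative only.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;    # CPAN module, assumed to be installed

# Tidied, hypothetical version of the status page from this thread.
my $html = <<'HTML';
<body>
<h2>Profile</h2>
<table cellspacing='10'>
<tr><th>Function Name</th><th>Calls</th><th>Total Time</th><th>Avg. Time</th></tr>
<tr><td>some.function</td><td>0</td><td>0</td><td>0</td></tr>
</table>
</body>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Build one array ref of cell texts per table row.
my @table_rows;
for my $tr ( $tree->look_down( _tag => 'tr' ) ) {
    my @cells = $tr->look_down( _tag => qr/^t[dh]$/ );
    push @table_rows, [ map { $_->as_text } @cells ];
}
$tree->delete;    # release the parse tree explicitly

# Print each row as a comma-separated line.
print join( ',', @$_ ), "\n" for @table_rows;
```

The first element of `@table_rows` then holds the header names and the rest hold data rows, so the regex cleanup step is not needed.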
Re: Problem populating Hash (I think?)
by Marshall (Canon) on Feb 09, 2012 at 05:30 UTC
    Parsing an HTML file is actually a fairly difficult problem.
    There are many ways to go wrong and only a few ways to go right.

    I would suggest HTML::Parser.

    Can you post a short example, showing the data, of what you are trying to accomplish?

    Update:
    Although Perl does allow a 'C' style 'for' loop:
    for (my $x=0; $x<@array; $x++){...} this is almost always the wrong idea, with the exception of numeric array processing.
    "almost always" doesn't mean "never", it just means that this type of loop is seldom used.

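To make the contrast concrete, here is a small sketch of the two Perl-style alternatives to the three-part loop. The array contents are made up for illustration.

```perl
use strict;
use warnings;

my @array = qw(alpha beta gamma);    # hypothetical data
my @seen;

# Perl-style: iterate over the elements directly, no index bookkeeping.
for my $item (@array) {
    push @seen, $item;
}

# When the index itself is needed, a range still avoids the three-part loop
# and cannot run past the end of the array.
for my $i ( 0 .. $#array ) {
    push @seen, "$i=$array[$i]";
}

print "@seen\n";    # prints: alpha beta gamma 0=alpha 1=beta 2=gamma
```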
Re: Problem populating Hash (I think?)
by jwkrahn (Abbot) on Feb 09, 2012 at 07:00 UTC
    @rows = split(/\n/,$stats); ... $rowcount = scalar(grep {defined $_} @rows);

    There is no point in checking for undefined values in @rows because split will only return strings, not undef.
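A related detail worth knowing: although split does not hand back undef here, by default it drops trailing empty fields, which is one way a row can end up with fewer values than the header has columns. A minimal sketch, using a made-up row:

```perl
use strict;
use warnings;

# Hypothetical row whose last two columns are empty.
my $line = 'some.function,0,,';

my @default = split /,/, $line;        # trailing empty fields are dropped
my @keep    = split /,/, $line, -1;    # a negative limit keeps them

print scalar(@default), "\n";          # prints 2
print scalar(@keep),    "\n";          # prints 4

# Every element returned is a defined string, possibly empty.
my $undef_count = grep { !defined } @keep;
print "$undef_count\n";                # prints 0
```

Passing `-1` as the third argument keeps the columns aligned with the header, at the cost of storing empty strings.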

      I had some success utilizing NetWallah's suggestion:

      my (undef,undef,@fields) = split(/\,/,shift @rows);
      my %stats = ();
      for my $r(@rows){
          my ($pxname,$svname,@values) = split /\,/,$r;
          for my $f(@fields){
              $stats{$pxname}{$svname}{$f} = shift @values;
          }
      }

      I will explain a bit of the scope for those interested. This is designed to be a plugin that will rip apart a Proxy status page for use as a Nagios plugin.

      At this time I am working through the last bit of pulling various keys from the associative array in order to allow for CLI metric availability on demand. This is in conjunction with the standards utilized with Nagios' API.

      <p>Sample HTML to use is as follows:</p>
      <body>
      <p><h2>Profile</h2>
      <table cellspacing='10'>
      <th>Function Name</th><th>Calls</th><th>Total Time</th><th>Avg. Time</th>
      <tr><td>some.function</td><td>0</td><td>0</td><td>0</td></tr>
      </table>
      </body>
      <p><h2>Cache Status</h2>
      0 items cached, 0 cache hits, 0 cache misses (0.0%)

      I was able to clean this up into a CSV-style file that meets our purposes, using the following series of regular expressions. This is not clean, but I am also new to Perl, and the final product meets my requirements.

      $htmlcontent =~ s/\t//g;          #remove tabs
      $htmlcontent =~ s/<[^>]*>/,/g;    #convert html tags into commas
      $htmlcontent =~ s/,,/,/g;         #Convert double commas to single
      $htmlcontent =~ s/,\n,/\n/g;      #remove leading and trailing commas around newlines
      $htmlcontent =~ s/,Function Name,Calls,Total Time,Avg\. Time/Function Name,Calls,Total Time,Avg\. Time/g; #Clean up column headers
      $htmlcontent =~ s/,Cache Status,\n//;    #nuke the cache status line
      $htmlcontent =~ s/Profile\n//;           #nuke the profile line
      $htmlcontent =~ s/\n+/\n/g;              #delete blank lines
      $htmlcontent =~ s/,\n//g;                #nuke lines with a comma followed by a new line
      $htmlcontent =~ s/^(?:.*\n){0,1}//;      #remove first line
      $htmlcontent =~ s/,$//g;                 #nuke that pesky last line

      The code segment in question, in its current state, appears as such:

      my $stats = $htmlcontent;
      @rows = split(/\n/,$stats);

      #Debug Line - prints the cleaned up data after dumping it to an array
      #foreach (@rows) {
      #    print "ROW DATA : $_\n";
      #}

      #Make sure 'good' data is pulled
      if ( $rows[0] !~ m/Function Name/ ) {
          $np->nagios_exit("UNKNOWN", "Can't find csv header!\n");
          exit $ERRORS{"UNKNOWN"}
      }

      my (undef,undef,@fields) = split(/\,/,shift @rows);
      my %stats = ();
      for my $r(@rows){
          my ($pxname,$svname,@values) = split /\,/,$r;
          for my $f(@fields){
              $stats{$pxname}{$svname}{$f} = shift @values;
          }
      }

      At this time I am working through accessing the various function names in the associative array by assigning a variable to the 'Function' column, and then using a regex match in order to "search" for it. Any suggestions are of course welcome as I am still a perl newbie.
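One possible way to do that lookup is to filter the top-level keys of the hash with grep and a compiled pattern. The sketch below is an assumption about the shape of the data, not the plugin's actual contents: the `%stats` entries and the `$wanted` pattern are hypothetical, with the three-level layout taken from the loop above.

```perl
use strict;
use warnings;

# Hypothetical data in the three-level shape the loop above builds.
my %stats = (
    'some.function'  => { 0 => { 'Total Time' => 0,  'Avg. Time' => 0 } },
    'other.function' => { 3 => { 'Total Time' => 42, 'Avg. Time' => 14 } },
);

# Hypothetical search pattern, e.g. built from a command-line argument.
my $wanted = qr/^other\./;

# Top-level keys are the function names; keep those that match.
my @matches = sort grep { /$wanted/ } keys %stats;

for my $name (@matches) {
    for my $second ( sort keys %{ $stats{$name} } ) {
        my $metrics = $stats{$name}{$second};
        print "$name $second total=$metrics->{'Total Time'}\n";
    }
}
# prints: other.function 3 total=42
```

When the exact name is known, `exists $stats{$name}` is a cheaper check than a regex scan over all keys.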

      Thanks again to everyone who chimed in with the fantastic suggestions here. I look forward to being around more often as I work through more plugins.

      Best!

      bp

        Please note -- the top line of the sample HTML code was included in error, please disregard it if you are duplicating this locally.

Re: Problem populating Hash (I think?)
by NetWallah (Canon) on Feb 09, 2012 at 02:27 UTC
    Replace "my @fields" onwards with this code:
    my (undef,undef,@fields) = split(/\,/,shift @rows);
    my %stats = ();
    for my $r(@rows){
        my ($pxname,$svname,@values) = split /\,/,$r;
        for my $f(@fields){
            $stats{$pxname}{$svname}{$f} = shift @values;
        }
    }
    Note: This modifies both @rows, and @values - I'm assuming you don't need these after the info is extracted.
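For anyone who wants to try this approach in isolation, here is a self-contained variation. It is a sketch under assumptions: the rows are invented data in a pxname,svname,... layout, and a hash slice stands in for the inner loop, so `@values` is left unmodified (the header is still shifted off `@rows`).

```perl
use strict;
use warnings;

# Hypothetical CSV rows; the first row is the header.
my @rows = (
    'pxname,svname,qcur,scur',
    'www,FRONTEND,0,12',
    'www,web01,1,5',
);

# Drop the two key columns from the header; the rest are value names.
my ( undef, undef, @fields ) = split /,/, shift @rows;

my %stats;
for my $row (@rows) {
    my ( $pxname, $svname, @values ) = split /,/, $row;

    # Hash slice: pairs each field name with the value in the same column.
    # The nested hashes are created automatically (autovivification).
    @{ $stats{$pxname}{$svname} }{@fields} = @values;
}

print "$stats{www}{web01}{scur}\n";    # prints 5
```

Because the nested hashes autovivify, the explicit `= {}` initialisations in the original question are not needed in either version.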

                “PHP is a minor evil perpetrated and created by incompetent amateurs, whereas Perl is a great and insidious evil perpetrated by skilled but perverted professionals.”
            ― Jon Ribbens