ag4ve has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to split values that I'm getting using Web::Scraper. However, I seem to be overwriting my hash? I'm not sure how this can be. Following is my code:

#!/usr/bin/perl
use strict;
use warnings;
use Web::Scraper;
use Data::Dumper::Simple;

my( $infile ) = $ARGV[ 0 ] =~ m/^([\ A-Z0-9_.-]+)$/ig;

my $page = scraper {
    process '//*/div[@id="Results"]/table/tr/td', 'table[]' => scraper +{
        process '//span', 'name' => '@id', 'attr' => '@title';
    };
    process '//*//table[@id="Documents"]/tr', 'docs[]' => scraper {
        process '//tr', 'attr' => '@title';
    };
};

open(FILE, "< $infile" );
my $content = do { local $/; <FILE> };

my $res = $page->scrape( $content )
    or die "Can't define content to parser $!";

# print Dumper( $res );

my %values;
for my $data ( @{$res->{ table } } ) {
    next unless $data->{ name } and $data->{ attr };
    foreach my $line (split /\n/, $data->{ attr } ) {
        %values = split /:/, $line;
    }
    print "$data->{ name }\t $data->{ attr }\n\n" ;
}

print Dumper( %values );

#print $content;

the output that i'm looking to work with is:

Name: value
Name2: value2
etc

Also, this is only my second option. The author of the Web::Scraper package gives an example of adding functionality in a callback sub, but I don't understand how to return the data. His example is:

my $scraper = scraper {
    process 'a[rel~="tag"]', 'tags[]' => sub {
        my $uri = URI->new($_->attr('href'));
        my $label = (grep length, split '/', $uri->path)[-1];
        $label =~ s/\+/%20/g;
        uri_unescape($label);
    };
};

Replies are listed 'Best First'.
Re: loop split into a hash
by JavaFan (Canon) on Dec 12, 2010 at 00:16 UTC
    However, I seem to be overwriting my hash? I'm not sure how this can be.
    Here:
    %values = split /:/, $line;
    In each iteration, you're completely overwriting your hash. Perhaps you want something like:
    my ($key, $val) = split /:/, $line, 2;
    $values{$key} = $val;
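
A minimal, self-contained sketch of that fix, assuming each line looks like "Name: value". The sample string is made up and only stands in for $data->{ attr }; the check for a missing colon and the whitespace trimming are additions for illustration, not part of the suggestion above.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for $data->{attr} from the scraper.
my $attr = "Name: value\nName2: value2\nTime: 12:30";

my %values;
for my $line ( split /\n/, $attr ) {
    # A limit of 2 keeps any further colons inside the value.
    my ( $key, $val ) = split /:/, $line, 2;
    next unless defined $val;    # skip lines without a colon (an addition)
    $val =~ s/^\s+|\s+$//g;      # trim surrounding whitespace (an addition)
    $values{$key} = $val;        # add one pair; the rest of the hash survives
}

print "$_ => $values{$_}\n" for sort keys %values;
```

Assigning a list to %values replaces the whole hash, whereas assigning to $values{$key} touches a single entry, which is why the pairs accumulate here.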
Re: loop split into a hash
by BrowserUk (Patriarch) on Dec 12, 2010 at 00:43 UTC

    Replace %values = split /:/, $line; with:

    $values{ $_->[0] } = $_->[1] for [ split /:/, $line ];
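
For illustration, here is how that statement behaves on some made-up lines. The "for" modifier loops over a single anonymous array, so $_ is a reference to the split fields. The sample data is an assumption, and its last line is chosen to show that, with no limit given to split, anything after a second colon is not stored.

```perl
use strict;
use warnings;

# Made-up sample lines; the last one has a colon inside its value.
my @lines = ( 'Name:value', 'Name2:value2', 'Time:12:30' );

my %values;
for my $line (@lines) {
    # The anonymous array is the only element looped over, so $_ is
    # that array reference: element 0 is the key, element 1 the value.
    $values{ $_->[0] } = $_->[1] for [ split /:/, $line ];
}

print "$_ => $values{$_}\n" for sort keys %values;
```

With this data the Time entry ends up as "12" rather than "12:30"; passing a limit of 2 to split would keep the full value.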

Re: loop split into a hash
by Anonyrnous Monk (Hermit) on Dec 12, 2010 at 00:41 UTC

    If you want to collect the key-value pairs of all $lines in one hash, you could concatenate them into one long string and then do the split and hash initialisation once, outside of your loops. Something like

    my $all_lines = '';
    for my $data ( @{$res->{ table } } ) {
        next unless $data->{ name } and $data->{ attr };
        foreach my $line (split /\n/, $data->{ attr } ) {
            $all_lines .= "$line:";
        }
        print "$data->{ name }\t $data->{ attr }\n\n" ;
    }
    chop $all_lines; # remove trailing ":"
    my %values = split /:/, $all_lines;

    This isn't fancy, but gets the job done (and is faster, too, than splitting/assigning lines individually).
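
A runnable sketch of the same idea on made-up data, since the scraper output is not shown in the thread. The @table array is a hypothetical stand-in for @{ $res->{ table } }. It assumes every line contains exactly one colon; a line with an extra colon, or none at all, would shift every key/value pair that follows it.

```perl
use strict;
use warnings;

# Made-up stand-in for the scraped records ($res->{table} in the thread).
my @table = (
    { name => 'a', attr => "Name:value\nName2:value2" },
    { name => 'b', attr => "Name3:value3" },
);

my $all_lines = '';
for my $data (@table) {
    next unless $data->{name} and $data->{attr};
    # Append each line plus a ":" separator to one long string.
    $all_lines .= "$_:" for split /\n/, $data->{attr};
}
chop $all_lines;    # remove the trailing ":"

# One split yields key, value, key, value, ... for the whole hash.
my %values = split /:/, $all_lines;

print "$_ => $values{$_}\n" for sort keys %values;
```

If the values might contain colons themselves, splitting each line separately with a limit of 2 is the safer choice.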