Expansion has asked for the wisdom of the Perl Monks concerning the following question:

I want to download some Yahoo Groups content (files, photos, messages, member list), and I've found these scripts:
http://freshmeat.net/projects/grabyahoogroup/
http://sourceforge.net/project/showfiles.php?group_id=62034
(same scripts posted in two places)
I've downloaded ActivePerl and the needed modules from CPAN (nothing fancy, they're very easy to find) and managed to install them, but when I run the script I get an error after it tells me that I've successfully logged in:

"Use of uninitialized value $cells in pattern match (m//) at yahoogroups_files.pl line 244, line 2."

I'm guessing that Yahoo changed the layout of the page or something, but I'm not able to update the script myself (I'm a newbie when it comes to Perl and to understanding the way Yahoo generates its pages; I only know some basic C++). I want to mention that I'm not lazy: I'll try to fix it myself, but I need your help. Hints, advice, anything.
PS: I've contacted the author, but he isn't willing to update the scripts.
Regards, Nick

Re: PERL scripts-Yahoo Groups Download-i get an error
by Anonymous Monk on Mar 18, 2009 at 19:15 UTC
    It's a warning. splain:
    Use of uninitialized value $cells in pattern match (m//) at yahoogroups_files.pl line 244, line 2 (#1)
        (W uninitialized) An undefined value was used as if it were already
        defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
        To suppress this warning assign a defined value to your variables.

        To help you figure out what was undefined, perl will try to tell you the
        name of the variable (if any) that was undefined. In some cases it cannot
        do this, so it also tells you what operation you used the undefined value
        in. Note, however, that perl optimizes your program and the operation
        displayed in the warning may not necessarily appear literally in your
        program. For example, "that $foo" is usually optimized into "that "
        . $foo, and the warning will refer to the concatenation (.) operator,
        even though there is no . in your program.
      The actual part of the code that causes problems looks like this:
      my ($cells) = $content =~ /<!-- start content include -->\s+(.+?)\s+<!-- end content include -->/s;
      while ($cells =~ /<tr>.+?<span class="title">\s+<a href="(.+?)">(.+?)<\/a>\s+<\/span>.+?<\/tr>/sg)
      The $cells variable enters the while loop uninitialized.

      Why does this happen? The $content variable contains the HTML of the Yahoo login page (I've checked by writing its contents to an HTML file).

      Also, what does this mean?

      "/<!-- start content include -->\s+(.+?)\s+<!-- end content include -->/s"

      I can understand only simple m// patterns, so I can't figure out what string it searches for.

      Please help.

        $cells is uninitialized because the "find stuff between these two comments" match is failing.
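
        A minimal sketch of that failure mode, assuming two made-up page bodies ($good and $bad are illustrative stand-ins, not real Yahoo markup, and extract_cells is a hypothetical helper): when a match in list context fails it returns an empty list, so the variable on the left is left undefined.

```perl
use strict;
use warnings;

# Illustrative stand-ins for the page body; not real Yahoo markup.
my $good = "<!-- start content include -->\n<tr>row</tr>\n<!-- end content include -->";
my $bad  = "<html><body>Sign in to Yahoo!</body></html>";

# Same pattern as in the script, precompiled so it can be reused.
my $re = qr/<!-- start content include -->\s+(.+?)\s+<!-- end content include -->/s;

# Hypothetical helper: returns the captured text, or undef when the match fails.
sub extract_cells {
    my ($content) = @_;
    # In list context a failed match returns an empty list, so $cells stays undef.
    my ($cells) = $content =~ $re;
    return $cells;
}

for my $content ($good, $bad) {
    my $cells = extract_cells($content);
    print defined $cells ? "matched: $cells\n" : "no match: \$cells is undef\n";
}
```

        The second page has no marker comments, so it takes the "no match" branch; using that undefined $cells in a later pattern match is what produces the warning.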

        The pattern looks for the shortest string it can find between the start and end comments for the content include, discarding the whitespace on either side. Here's how it breaks down:

        \s+   # one or more spaces - don't capture these
        (     # start capturing
        .+?   # one or more anythings, shortest match to what follows
        )     # end capture
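
        The same breakdown can be written as a runnable pattern with the /x modifier, which lets the comments live inside the regex. This is only a sketch: the sample $content is made up, and quotemeta is used here because /x would otherwise ignore the literal spaces inside the marker comments.

```perl
use strict;
use warnings;

# quotemeta escapes the spaces so they still match literally under /x.
my $start = quotemeta '<!-- start content include -->';
my $end   = quotemeta '<!-- end content include -->';

my $re = qr/
    $start   # the literal opening comment
    \s+      # one or more whitespace characters, not captured
    (        # start capturing
      .+?    # one or more of anything, as few as possible
    )        # end capture
    \s+      # trailing whitespace, not captured
    $end     # the literal closing comment
/sx;

# Made-up sample input, not real Yahoo markup.
my $content = "junk <!-- start content include -->\n  <tr>one</tr>\n<tr>two</tr>\n  <!-- end content include --> more";

my ($cells) = $content =~ $re;
print defined $cells ? "captured:\n$cells\n" : "no match\n";
```

        The /s modifier is what allows . to match newlines, so the capture can span several lines of HTML.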
        Check the text of the page in $content; my guess is that the Groups folks have reformatted their pages, and the comments that this match is looking for have changed.
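
        One way to make the script fail more clearly while you investigate is to test whether the first match succeeded before looping. The following is a sketch only: extract_links is a hypothetical helper, not part of the original script, and $page is invented sample markup. The two patterns are the ones quoted earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [href, title] pairs, or an empty list
# when the marker comments are missing from the page.
sub extract_links {
    my ($content) = @_;
    my ($cells) = $content =~ /<!-- start content include -->\s+(.+?)\s+<!-- end content include -->/s;
    unless (defined $cells) {
        warn "content markers not found; page layout may have changed\n";
        return;
    }
    my @links;
    while ($cells =~ /<tr>.+?<span class="title">\s+<a href="(.+?)">(.+?)<\/a>\s+<\/span>.+?<\/tr>/sg) {
        push @links, [$1, $2];
    }
    return @links;
}

# Invented sample markup shaped to fit the patterns; not real Yahoo output.
my $page = join "\n",
    '<!-- start content include -->',
    '<tr><td><span class="title">',
    '<a href="/files/a.txt">a.txt</a>',
    '</span></td></tr>',
    '<!-- end content include -->';

for my $link (extract_links($page)) {
    print "$link->[1] => $link->[0]\n";
}
```

        If the warning fires on the real page, compare the saved HTML against the two marker comments to see what changed.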