canguro has asked for the wisdom of the Perl Monks concerning the following question:

Hi, Everyone,

I'm trying to learn Perl on my own and am making some progress, but I'm still a newbie. I figure I can speed up my learning process by trying out some simple programs. I have a pet project that attempts to retrieve some stock data and strip the tags from the HTML. (By the way, this is for my own practice, not for commercial use of any kind.) I am using the following code but getting the 'Use of uninitialized value in substitution (s///)' message from the title. I am probably missing something rather obvious. Can someone straighten me out? Here is the code:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use LWP::Simple;
    my $site="http://chart.yahoo.com/d?a=3&b=1&c=2003&d=3&e=11&f=2003&g=d&s=crio";
    my $content=get $site;
    my $message=$content;
    $message=s/<[>]*>//g;
    print $message;

Your help is very much appreciated. Many thanks.

2003-04-22 edit ybiC: <code> tags

Replies are listed 'Best First'.
Re: Use of uninitialized value in substitution (s///)
by Limbic~Region (Chancellor) on Apr 22, 2003 at 23:33 UTC
    canguro,
    Welcome to the monastery! I think you want a negated character class which is:
        $message=s/<[^>]*>//g;
    And the rest could be done like this untested code:
        #!/usr/bin/perl
        use warnings;
        use strict;
        use LWP::Simple;
        my $site = "http://chart.yahoo.com/d?a=3&b=1&c=2003&d=3&e=11&f=2003&g=d&s=crio";
        my $message = get("$site");
        if ($message) {
            $message =~ s/<[^>]*>//g;
            print $message;
        } else {
            print "Error, get did not work\n";
        }
    But you really shouldn't be trying to parse HTML on your own unless this is truly just practice to sharpen your Perl skills.

    Check out The CPAN for a myriad of modules concerning HTML and parsing. You are also going to want to use <code> </code> tags around your code for formatting purposes.

    Cheers - L~R

Re: Use of uninitialized value in substitution (s///)
by Abigail-II (Bishop) on Apr 22, 2003 at 23:52 UTC
    The line
    $message = s/<[^>]*>//g;

    does a substitution on $_, and assigns the result to $message. But $_ is undefined, so you get the error. What you want is =~ instead of =.
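
    To make the difference concrete, here is a minimal sketch of both forms side by side. The variable names ($html, $assigned, $topic) are made up for illustration, and $_ is deliberately given a value so the sketch runs without the warning; in the original program $_ was undefined, which is what triggered the message.

```perl
use strict;
use warnings;

my $html = '<b>bold</b> text';

# Plain "=": s/// operates on $_ and its return value is assigned.
# $_ is set here only so the demonstration runs without the warning.
my ($assigned, $topic);
{
    local $_ = $html;
    $assigned = s/<[^>]*>//g;   # number of substitutions made, not the text
    $topic    = $_;             # $_ is the string that actually got modified
}

# Binding operator "=~": s/// operates on $message itself.
my $message = $html;
$message =~ s/<[^>]*>//g;

print "assigned: $assigned\n";
print "topic:    $topic\n";
print "message:  $message\n";
```

    With "=", the variable on the left ends up holding the substitution count, while the stripped text lives only in $_. With "=~", the variable itself is modified in place.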

    But this is a pretty naive, and wrong, way of deleting HTML tags. Some cases where your program goes wrong:

        <IMG SRC = "foo.gif" ALT = "A > B">
        <!-- <BR> -->
        A < B or B > A
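
    As a rough check, the three inputs above can be fed through the regex from this thread. This is only a sketch: strip_tags_naive is a hypothetical helper name, not something from a module.

```perl
use strict;
use warnings;

# Naive tag stripper from the thread: delete everything from "<"
# up to the next ">". Hypothetical helper, for illustration only.
sub strip_tags_naive {
    my ($html) = @_;
    $html =~ s/<[^>]*>//g;
    return $html;
}

my @cases = (
    '<IMG SRC = "foo.gif" ALT = "A > B">',   # ">" inside an attribute value
    '<!-- <BR> -->',                         # tag inside a comment
    'A < B or B > A',                        # plain text, no tags at all
);

# Brackets around the result make leading and trailing spaces visible.
for my $case (@cases) {
    printf "%-40s => [%s]\n", $case, strip_tags_naive($case);
}
```

    In each case the match ends at the first ">" it meets: part of the ALT attribute is left behind, the tail of the comment leaks into the output, and ordinary text between the two comparison signs is deleted.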

    Abigail

Re: Use of uninitialized value in substitution (s///)
by Enlil (Parson) on Apr 22, 2003 at 23:51 UTC
    The reason for the error is as Mad Hatter stated:
    $message = s/<[>]*>//g;
    The reason is that the above code is short for the following:
    $message = $_ =~ s/<[>]*>//g;
    and you do not have anything in $_

    That said, I would follow Limbic~Region's advice and check out CPAN modules for parsing out HTML.

    update: There were my's in front of $message that were dropped. Also, as Limbic~Region already mentioned, you probably wanted the negated character class.

    -enlil

Re: Use of uninitialized value in substitution (s///)
by The Mad Hatter (Priest) on Apr 22, 2003 at 23:40 UTC
    I don't know if this is causing the error, but in your second to last line, that equals should be =~.

Re: Use of uninitialized value in substitution (s///)
by vek (Prior) on Apr 23, 2003 at 03:37 UTC

Re: Use of uninitialized value in substitution (s///)
by canguro (Novice) on Apr 23, 2003 at 19:34 UTC
    To all you Monks who replied,

    Thank you so much for your incisive and interesting ideas. They were all very helpful, and hopefully well-absorbed into my feeble brain. I am truly impressed by the level of technical knowledge in this forum!

    Again, thanks!