perl-diddler has asked for the wisdom of the Perl Monks concerning the following question:

I'm having a problem in the middle of one of my large programs, which is in the midst of a refactoring. It would be difficult to extract into a separate test case, as it involves nested classes, but I want to show the output and see if anyone has seen symptoms like this. I'm pretty sure I know the program flow, and I don't see how it can be doing what it is doing. Output:
>Parseable::new, pckg= 'main{}'
>URL::Fetchable::new          ## >=entering function...
>URL::Cacheable::new
>URL::new, pkg='main'
(SET) storage_path="cache"
storage_path=cache
<URL::new                     #finally starts to exit...
host_mask="(?-ims:([a-z...",  ##In Cacheable::new..prints
charset="UTF-8",              ##parms being init'ed
## this next one is the prob:
(code): P "pkg_host_mask=%s", $pkg_host_mask;
(output): pkg_host_mask=(?-ims:([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$)
(code): $pkg_host_base = ($p->host =~ m{($pkg_host_mask)}) [0];
(output (death)): Use of uninitialized value $URL::Cacheable::pkg_host_mask in regexp compilation at ./crawl.pl line 385.
 at ./crawl.pl line 385.
### I print it out on the line before and it has the value shown -- a nice "regex"...
### then the next line a crash claiming it wasn't set?? HOW?
### FWIW, I also have printed out 'host' and it is its expected sitename value.
### Rest of error is just unwinding the call stack, which pretty much mirrors the ">" entry lines above.
    URL::Cacheable::host_base('main=HASH(0x1be4338)') called at ./crawl.pl line 397
    URL::Cacheable::host_path('main=HASH(0x1be4338)') called at ./crawl.pl line 407
    URL::Cacheable::save_path('main=HASH(0x1be4338)') called at ./crawl.pl line 440
    URL::Cacheable::localized_link('main=HASH(0x1be4338)') called at ./crawl.pl line 490
    URL::Cacheable::new_URL_Cacheable('main=HASH(0x1be4338)', 'HASH(0x1be7b30)') called at ./crawl.pl line 666
    URL::Fetchable::new_URL_Fetchable('main=HASH(0x1be4338)', 'HASH(0x1be7b30)') called at ./crawl.pl line 1235
    Parseable::new_Parseable('main=HASH(0x1be4338)', 'HASH(0x1be7b30)') called at ./crawl.pl line 1389
So, how could I be messing this up? I print it out immediately before. I've restructured the routines around it (the ones that lead into it). I even broke the setting of the host base out into its own routine (it was combined in another, "host_path", from which it is now called). Argh, it might be some help if I toss in the few routines and comments so you can get an idea of what is going on:
# host_base set from 'host' when host_mask is set; Example:
# if host_base==www.first.host & hostmask gets us "first.host"
# i.e. hostmask would be only looking at the last 2 domainname-parts
# in future matches, a host would first have the mask applied
# (masking off a 3rd level sub-domain part), then match against
# first_host .. resulting in a match if
# that's subtracted from future relative saved paths.

our ($pkg_host_mask, $pkg_host_base);
our $items_from_cache=0;
our $Cache_Control;

sub host_mask (;$) { my $p = shift;
  return $pkg_host_mask if $pkg_host_mask;
  $pkg_host_mask = $_[0] if @_;
  $pkg_host_mask
}

sub host_base(;$) { my $p = shift;
  my $reset=1 if @_ && EhV shift, reset;
  return $pkg_host_base if $pkg_host_base && ! $reset;
  if (@_) {
    $pkg_host_base=$_[0];
  } elsif ($p->host && $pkg_host_mask) {
    P "pkg_host_mask=%s, host=%s", $pkg_host_mask:
    $pkg_host_base = ($p->host =~ m{($pkg_host_mask)}) [0];
  }
  return $pkg_host_base;
}

# returns empty-string if same site, or sitename+dir_sep if not;
# caches result in {host_path}
sub host_path { my $p = $_[0] or return undef;
  my ($host_base, $host_mask) = ($p->host_base, $p->host_mask);
  $host_base && $host_mask && $p->host or die P
    "Can't produce host_path w/o host_base( host(%s) & mask(%s))",
    $p->host, $host_mask.
  ( ($p->host =~ m{$host_mask/})[0] eq $host_base ) ? '' : $p->host;
}
host_path is called from this level's "new"; it in turn calls $p->host_base (above), which is where the error happens.

I have tossed in a ton of debugging... some new, some from when I first developed this almost 3 years ago...I knew I'd need a good debug structure up front, so that was designed in...

Ideas? Flamage? laughter? ( :-( *sigh*))... How could I print a value in 1 line and be null in the RE?

Re: strange prob--print RE& then use-says not set??
by kcott (Archbishop) on May 15, 2013 at 05:20 UTC

    G'day perl-diddler,

    Your problem may be related to which package is current. The error shows "$URL::Cacheable::pkg_host_mask" but your code only uses "$pkg_host_mask" and the current package is not indicated.
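
To illustrate the point about the current package, here is a minimal sketch (not the poster's code). It assumes the same short name is declared with `our` in two packages; `set_mask` and `state_of` are hypothetical helpers invented for the illustration. Setting the variable through one package leaves the other package's variable of the same name untouched.

```perl
use strict;
use warnings;

{
    package URL::Cacheable;
    our $pkg_host_mask;    # short name for $URL::Cacheable::pkg_host_mask

    # Hypothetical setter, used only for this illustration.
    sub set_mask { my ($class, $mask) = @_; $pkg_host_mask = $mask }
}

{
    package main;
    our $pkg_host_mask;    # a different variable: $main::pkg_host_mask
}

URL::Cacheable->set_mask(qr/example/);

# Hypothetical helper: report whether a scalar value is defined.
sub state_of { defined $_[0] ? 'set' : 'unset' }

printf "URL::Cacheable: %s\n", state_of($URL::Cacheable::pkg_host_mask);
printf "main:           %s\n", state_of($main::pkg_host_mask);
```

If code that reads `$pkg_host_mask` is compiled under a different `package` statement than the code that sets it, the two sides see different variables, which is one way a value can look set in one place and unset in another.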

    What's "P()"? It looks like some sort of wrapper around printf but that's just a guess.

    Where does the following statement occur?

    $pkg_host_base = ($p->host =~ m{($pkg_host_mask)}) [0];

    Unrelated to your current problem, but you have a redundant level of parentheses in that code. I foresee a future SoPW like "Why is this matching twice?".

    $ perl -Mstrict -Mwarnings -E '
        my $host = q{a.b.c};
        my $re = qr{([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$};

        my $base = ($host =~ $re)[0];       # How I might have written it
        say $base;

        $base = ($host =~ m{($re)}) [0];    # How you wrote it
        say $base;

        $base = ($host =~ m{($re)}) [1];    # Fodder for another SoPW
        say $base;
    '
    b.c
    b.c
    b.c

    The error has "at ./crawl.pl line 385". Have you posted that line? If so, which is it?

    That may be enough for you to resolve your issue. If not, please provide the requested information.

    -- Ken

      I can glean the package we are in from the tracing output in the first code-section where it is showing program output.

      First a func in main is called, then package URL::Fetchable's new, then URL::Cacheable's new. (They used to all be called new, but I didn't trust that the right ones would get called, so they pretty much all have per-package names; the debug output still says 'new', as that is their essential function.) Then it calls URL::new, though the pointer passed to it was blessed in main, which is why it says package main there (i.e. that's the package the object is being blessed into at this point).

      Each of those classes is derived from the one above it (except Parseable::new, which is part of main now). Once inside URL::new we see the setting of the storage path (a param in the argument hash that is passed down), then we see URL::new exit (the <URL::new). So right above it in the call chain is URL::Cacheable, where this 'action' takes place.

      The output from P (like printf, but does a bit more; it's on CPAN) shows pkg_host... (I put the (code:) and (output:) tags in the output; the ### lines are also added as commentary.) Then comes the code pkg_host_base = ($p->host =~ m{(..mask)})[0].

      By which I gather you mean that because I had the () inside the mask expression, I likely wouldn't want them on the outside as well. Yes, you are right. I guess I'll just have to require that the host_mask include its own set of parens. I didn't want to rely on that (even though the end audience may only end up being me).

      I fixed that and still get the error... though I would note that the double parens would have prevented a match because the $ was inside the outer set of parens.

      *Sigh* (this code used to download several objects before crashing -- (by causing a segfault in Perl), but in recoding/factoring... whole pile of code thrown up in the air, and comes down in different places...oi).

      Anyway, that's statement 385, where those nested parens were... but as you probably guessed, that didn't fix it.

      After that is just the unwind code from the error: it does a traceback, so you can see there as well that host_base was part of URL::Cacheable, then some unwinding in Cacheable where it was parsing its params, then back out the top.

      So it is in package Cacheable... as to why it prints out the full varname-- I'm guessing it's because it is a package variable (declared with 'our')...
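
One way to check that guess is to provoke the warning on purpose and look at the name Perl reports. The sketch below is not taken from crawl.pl: the package and variable names mirror the thread, the host string is made up, and the mask is deliberately left undefined. It captures warnings with a `$SIG{__WARN__}` handler and prints them; the exact wording may differ between Perl versions.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go straight to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

{
    package URL::Cacheable;
    our $pkg_host_mask;    # deliberately left undefined for the demonstration

    # Interpolating an undefined scalar into a pattern triggers the warning.
    my $matched = 'www.first.host' =~ m{($pkg_host_mask)};
}

# Show what Perl reported, including the variable name it chose.
print @warnings;
```

Comparing the printed name with the one in the real error message shows which variable Perl believes was undefined at that point.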

      I think I answered all your questions. Unfortunately, I'm not much further along than I was before, 'cept for preventing a future SoPW... ;-) So thanks for that, but my head is still hurting over the original problem... *owie*.

        Just a bit more fodder... I changed sub host_mask to print its value, if it was already set, as it returns it. And instead of referencing the package var directly in host_path, I am going through sub host_mask:
        sub host_mask (;$) { my $p = shift;
          if ($pkg_host_mask) {
            P "(hostmask pre-set to %s)", $pkg_host_mask;
            return $pkg_host_mask;
          }
          $pkg_host_mask = $_[0] if @_;
          $pkg_host_mask
        }
        Now the output seems to even more clearly show weirdness:
        >Parseable::new, pckg= 'main{}'
        >URL::Fetchable::new
        >URL::Cacheable::new
        >URL::new, pkg='main'
        (SET) storage_path="cache"
        storage_path=cache
        <URL::new
        (hostmask pre-set to (?-ims:([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$))
        (hostmask pre-set to (?-ims:([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$))
        host_mask="(?-ims:([a-z...",
        charset="UTF-8",
        (hostmask pre-set to (?-ims:([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$))
        (hostmask pre-set to (?-ims:([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$))
        $p->host_mask=(?-ims:([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$), host=
        (hostmask pre-set to (?-ims:([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$))
        Use of uninitialized value $host_mask in regexp compilation at ./crawl.pl line 400.
         at ./crawl.pl line 400.
            URL::Cacheable::host_path('main=HASH(0x168f338)') called at ./crawl.pl line 409
        ...
        I mean, how much more 'confidence' do I need to know host_mask really is set!!... yet still... the regexp compilation tosses its cookies....!#$%~@#!*)&()+!
        Oops, sorry, my fingers must have slipped..
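
When a warning points at one operand of a match, it can help to check both operands separately right before the match, since the trace above also prints nothing after `host=`. The following is only a sketch of that idea, not a diagnosis of crawl.pl: `checked_host_base` is a hypothetical helper, the host string is made up, and the mask is the one shown in the output.

```perl
use strict;
use warnings;
use Carp qw(confess);

# Hypothetical helper: match $host against $mask, dying with a stack trace
# if either operand is undefined, so the failing side is named explicitly.
sub checked_host_base {
    my ($host, $mask) = @_;
    defined $host or confess "host is undefined";
    defined $mask or confess "host mask is undefined";

    # List assignment picks up the first capture group (undef if no match).
    # This assumes the mask carries its own set of capturing parens.
    my ($base) = $host =~ $mask;
    return $base;
}

my $mask = qr{([a-zA-Z][-a-zA-Z0-9]*\.[a-zA-Z]+)$};
print checked_host_base('www.first.host', $mask), "\n";    # prints first.host
```

Calling such a check in place of the bare match would show whether it is the host, the mask, or neither that is undefined at the moment of the match.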