Wiggins has asked for the wisdom of the Perl Monks concerning the following question:

As I fumble along the path of Perl, I keep finding that my stride exceeds my knowledge. Today it is 'tailing' files whose handles are in a hash.

I have long used the paradigm of:

open INP, "<xyzzy";
while (! $fileRotated){
    while(<INP>){
        # ... read lines
    }
    sleep 1;
    seek (INP,0,1); #clear EOF
    #check inodes and length
}
That works fine for a single known file. But now the task is to 'tail' files in an arbitrary number of directories, all in the same tailing loop.
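
For reference, the single-file pattern might be fleshed out roughly like this. It is only a sketch: the helper name read_new_lines and the temporary file are made up for illustration, and the rotation check (inodes and length) is left out.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines appended to $fh since the last call,
# then clear the EOF condition so that later reads can see new data.
sub read_new_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = readline($fh))) {
        push @lines, $line;
    }
    seek($fh, 0, 1);    # seek to the current position: resets EOF without moving
    return @lines;
}

# Demo on a throwaway file standing in for the real log.
my ($out, $path) = tempfile(UNLINK => 1);
$out->autoflush(1);
open(my $in, '<', $path) or die "Cannot open $path: $!";

print {$out} "first\n";
my @batch1 = read_new_lines($in);

print {$out} "second\n", "third\n";
my @batch2 = read_new_lines($in);

print "batch 1 has ", scalar(@batch1), " line(s)\n";
print "batch 2 has ", scalar(@batch2), " line(s)\n";
```

In a real tailing loop the two calls would be separated by a sleep, as in the snippet above.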

For this task, my open() function is in a readdir() loop, and the handles are being put into a hash keyed by the directory name.

while (readdir $dirH) {
    next if ($_ eq '.'|| $_ eq '..');
    next if (! m/\@(?:\d{1,3}\.)\d{1,3}/); #ip of src of records
    if ( ! defined $Accts{$_}){ #new directory
        # open the new file and put handle in hash
        #open $Accts{$_}, "<$_/all/events.log";
        open $Accts{$_}, "<$Cpath/$dateDir/$_/all/events.log";
    }
}
'$Accts{$_}' takes the place of INP. That seems to be running without visible problems. But when I use these file handles in a read statement, the result is not as expected.
foreach $key (keys %Accts){
    # seek($teamAccts{$key}, 0, 1); # reset end-of-file error
    my $safekey=$key;
    $safekey =~ s/ /_/g; # no embedded spaces in tokens
    while (<$Accts{$key}> ) { # one of the sub files
        my $L=$_; # $_ by itself gave same result
        my $msgL="$DTG $safekey $L"; # $L should have \n already
        $rsltStr .= $msgL;
    }
}
I expected to see lines of text ( timestamp dirName textline). The while(<FH>) should be returning lines of the files. Instead I see:
1400765377 msgfrom@10.0.1.2 GLOB(0x1f11bf8)
WTFO? "GLOB(...)"

-----Update-2-----

Found the problem! It now works!!

It was in the path value passed to the open. I did not fully construct the superior path (prefix) to the value returned from the readdir.

As a result I attempted to open an incomplete path and failed.
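
A failed open like this is silent unless its return value is checked. A minimal sketch of such a check follows; the helper name open_log and the directory name are placeholders, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: open a log for reading, or report why that failed.
sub open_log {
    my ($path) = @_;
    if (open(my $fh, '<', $path)) {
        return $fh;
    }
    warn "Cannot open '$path': $!\n";
    return;
}

my %handles;
for my $dir ('no-such-dir') {    # placeholder name, assumed not to exist
    my $fh = open_log("$dir/all/events.log") or next;
    $handles{$dir} = $fh;        # only handles that really opened get stored
}
print scalar(keys %handles), " handle(s) opened\n";
```

With a check like this, an incomplete path shows up as a warning containing $! at the moment of the open, instead of as strange read results later.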

-----Update-1-----

I have tried the readline() approach on a file that has data present:

foreach $key (keys %teamAccts){
    # plog "Key=$key";
    seek($teamAccts{$key}, 0, 1); # reset end-of-file error
    my $safekey=$key;
    $safekey =~ s/ /_/g; # no embedded spaces in tokens
    #while (<$teamAccts{$key}> ) { # one of the sub files
    while ( readline($teamAccts{$key}) ){
        my $L=$_;
        plog "readline=<$L>";
        my $msgL="$DTG $safekey $L"; # $line should have \n already
        $eventStr .= $msgL;
        #addEvent ($key, $L);
    }
}
Nothing in the while loop executes. 'plog' is a logging subroutine.

I suspected that the 'seek' might not be working to clear the EOF, so I used a file that has 10 lines of text. None were processed.

I dumped the hash after opening the file, with this result:

--scanning MSpt
scanCS: opening <Chris - Kali 1@10.0.1.2>
$VAR1 = 'Chris - Kali 1@10.0.1.2';
$VAR2 = \*{'::$teamAccts{...}'};
--finding
+++finding MSpt
Key=Chris - Kali 1@10.0.1.2
--scanning
$VAR2 is a reference, to a typeglob, of my hash entry??
Is this of any help?

It is always better to have seen your target for yourself, rather than depend upon someone else's description.

Replies are listed 'Best First'.
Re: FileHandles in a Hash (<> ambiguity)
by LanX (Saint) on May 28, 2014 at 16:07 UTC
    Without testing I think you fell into a special trap of '<>'...

    The <> operator has DWIM magic to act either as glob or readline.

    So it tries to use your hash like a list of filename patterns to be expanded like glob does.

    As others indicated, just use readline to avoid the ambiguity, or copy the hash element to a simple scalar.
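
Both workarounds can be sketched as follows. The hash name %Accts is taken from the question; the key 'demo' and the temporary file are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Throwaway file standing in for an events.log.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\n", "beta\n";
close($tmp) or die "Cannot close $path: $!";

my %Accts;
open($Accts{demo}, '<', $path) or die "Cannot open $path: $!";

# Option 1: call readline explicitly on the hash element.
my $first = readline($Accts{demo});

# Option 2: copy the handle into a simple scalar; <$fh> is then a readline.
my $fh     = $Accts{demo};
my $second = <$fh>;

print "readline gave:    $first";
print "scalar copy gave: $second";
```

Both reads go through the same underlying handle, so the second one continues where the first left off.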

    HTH! :)

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    update

    see I/O Operators:

    If what's within the angle brackets is neither a filehandle nor a simple scalar variable containing a filehandle name, typeglob, or typeglob reference, it is interpreted as a filename pattern to be globbed, and either a list of filenames or the next filename in the list is returned, depending on context.
      For my own understanding I try to reformulate what is happening:

      When you do open $teamAccts{$key}, ... the hash element $teamAccts{$key} ends up holding a reference to a GLOB.

      When you put that into angle brackets it should (from my point of view) be interpreted as a filehandle, but for whatever reason it is not. Instead it is treated as a glob pattern: it is first stringified (that gives the GLOB(...) value) and the string is then "globbed". Because no file in the current directory matches that pattern, the pattern itself is returned.

      This behaviour of glob is probably meant to mimic the broken glob behaviour of shells:

      perl -e 'print glob("hubba")'    # prints "hubba" even if you don't have a file "hubba"
      whereas:
      perl -e 'print glob("hubba*")'   # prints nothing, provided no file matches
      So this is why <$teamAccts{$key}> returns GLOB(...). For my taste this is not simply a wart but a hunchback with a wart on top of perl...
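
The effect is easy to reproduce. The sketch below uses a made-up hash %h and a temporary file; it assumes no file in the current directory happens to be named like the stringified glob.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "real line\n";
close($tmp) or die "Cannot close $path: $!";

my %h;
open($h{k}, '<', $path) or die "Cannot open $path: $!";

# Parsed as glob("$h{k}"): the handle is stringified and treated as a pattern.
my ($globbed) = <$h{k}>;

# Parsed as a read from the handle stored in the hash element.
my $line = readline($h{k});

print "angle brackets gave: $globbed\n";
print "readline gave:       $line";
```

The first value is the GLOB(0x...) string seen in the question; only the second comes from the file.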

      You have tried to combine, in a natural way, two of perl's features (hashes and the angle operator) and were bitten by a total lack of orthogonality. Really nothing to boast about...

        Well, Perl started as a scripting language; that's why it mimics glob.

        I rarely use <> for globbing, and I wouldn't mind if this feature were disabled. And I agree that this DWIM logic is somewhat surprising.

        But everybody is free to overload <> to do so.

        Your second criticism, about glob(), is somewhat cryptic to me: if you say touch hubba, you don't want to touch "nothing" just because hubba doesn't exist yet.

        Perl's success in its early years came from its compatibility with established tools.

        Otherwise we wouldn't be here discussing Perl...

        Cheers Rolf

        ( addicted to the Perl Programming Language)

Re: FileHandles in a Hash
by poj (Abbot) on May 28, 2014 at 15:52 UTC
      What also works is
      while (my $line = $Accts{$key}->getline){ ...
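
A self-contained sketch of the method-call form, with a made-up key and a temporary file standing in for the log. Loading IO::File explicitly is a precaution; recent Perls load the handle methods on demand.

```perl
use strict;
use warnings;
use IO::File;    # makes getline available as a method on filehandles
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "one\n", "two\n";
close($tmp) or die "Cannot close $path: $!";

my %Accts;
open($Accts{demo}, '<', $path) or die "Cannot open $path: $!";

my @seen;
# The method call is unambiguous: there is no glob interpretation to fall into.
while (defined(my $line = $Accts{demo}->getline)) {
    push @seen, $line;
}
print "read ", scalar(@seen), " line(s)\n";
```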
Re: FileHandles in a Hash
by jellisii2 (Hermit) on May 28, 2014 at 15:42 UTC
Re: FileHandles in a Hash
by Laurent_R (Canon) on May 28, 2014 at 18:55 UTC
    Just so you know about another trap with hashes of file handles: if you want to write to a file handle stored in a hash, you usually have to enclose the hash element in curly braces. For example:
    print {$FH_hash{$foo}} $line_out;
    I tried to see if enclosing the file handle within curlies for reading with the diamond <> operator might work:
    while (<{$FH_hash{$foo}}>) { #...
    but that does not help. Using the readline function is really the right solution.
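
Put together, writing through the curly-brace block and reading back with readline might look like this. The names %FH_hash and $foo follow the snippet above; the key value and the temporary file are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Reserve a throwaway file name, then reopen it through the hash element.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "Cannot close $path: $!";

my %FH_hash;
my $foo = 'demo';

open($FH_hash{$foo}, '>', $path) or die "Cannot open $path for writing: $!";
print {$FH_hash{$foo}} "written via hash element\n";    # block form is required here
close($FH_hash{$foo}) or die "Cannot close $path: $!";

open($FH_hash{$foo}, '<', $path) or die "Cannot open $path for reading: $!";
my $back = readline($FH_hash{$foo});                    # readline, not <...>
close($FH_hash{$foo}) or die "Cannot close $path: $!";

print "read back: $back";
```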
Re: FileHandles in a Hash
by locked_user sundialsvc4 (Abbot) on May 28, 2014 at 17:38 UTC

    I agree with the sentiment that you should use File::Find to locate the targets first. Then, I think that you should open them, process them, and close them one by one. Generally it is not a good idea to have too many files open at one time: you are quite likely to run into environment-specific limits, and who knows what those limits might be. So, it’s best simply not to go there.

    I suggest that you simply push the names onto an ordinary array in stage one (finding files), then pop them off again in stage two (processing). And please note that I advise keeping these two steps cleanly separated. In some OS environments, the act of opening a file can mess up a search, regardless of the programming language that is used. So, it’s best simply not to go there, either.
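
A rough sketch of that two-stage approach, under stated assumptions: the directory layout is a temporary stand-in built just for the demo, the account names are invented, and counting lines is a placeholder for the real processing. Note that this reads each file once and does not by itself keep tailing.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a small throwaway tree shaped like <root>/<account>/all/events.log.
my $root = tempdir(CLEANUP => 1);
for my $acct ('a@10.0.1.2', 'b@10.0.1.3') {    # invented account names
    make_path("$root/$acct/all");
    open(my $out, '>', "$root/$acct/all/events.log")
        or die "Cannot create log for $acct: $!";
    print {$out} "event from $acct\n";
    close($out) or die "Cannot close log for $acct: $!";
}

# Stage one: only collect names, open nothing.
my @targets;
find(sub { push @targets, $File::Find::name if -f && $_ eq 'events.log' }, $root);

# Stage two: open, process, and close each file in turn.
my $total = 0;
for my $file (sort @targets) {
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    while (defined(my $line = readline($in))) {
        $total++;    # placeholder for real processing
    }
    close($in) or die "Cannot close $file: $!";
}
print "found ", scalar(@targets), " file(s), $total line(s)\n";
```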