in reply to reading from a file after a seek isn't working for me

I changed

read STDOUT, $_, 8192;

to

my $rv = read STDOUT, $_, 8192;
die "read: $!\n" if !defined($rv);

Got

before=0
after=0
read: Bad file descriptor

There are problems with opening STDOUT without closing it first.

I changed

open(STDOUT, '+>', "/tmp/stdout.log") or die $!;

to

close(STDOUT);
open(STDOUT, '+>', "/tmp/stdout.log") or die $!;

Got

before=0
after=0
stdout=hello world
at end=12

Re^2: reading from a file after a seek isn't working for me
by jakobi (Pilgrim) on Oct 21, 2009 at 20:40 UTC
    @ikegami: Fellow Monks, can you please explain in detail the need for the explicit close here?

    Normally, opening an existing filehandle closes the original file first; at least, I never noticed a problem cutting this corner in one-shots, one-liners, or inline shell scripts (though I usually avoid read, sysread, ttys, and STDIN/STDOUT/STDERR there).

    Thanx,
    Peter

    Update: - ok, any takers for this riddle with more time? Will summarize if pointed correctly with keywords and RTFM's to check :)

    From perldoc -f close:

    You don’t have to close FILEHANDLE if you are immediately going to do another "open" on it, because "open" will close it for you. (See "open".) However, an explicit "close" on an input file resets the line counter ($.), while the implicit close done by "open" does not.
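The difference described in that perldoc passage can be observed directly. The following is a minimal sketch, not taken from the thread: the scratch file and the variable names are assumptions made for illustration. It reopens the same handle once without and once with an explicit close, and records `$.` after reading one line each time.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch input file with three lines (generated path, not from the thread).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "one\ntwo\nthree\n";
close($tmp) or die "close: $!";

# First pass: read every line so that $. counts up to 3.
open(my $fh, '<', $path) or die "open: $!";
1 while <$fh>;
my $after_first_pass = $.;

# Reopen WITHOUT close: open() closes the handle implicitly,
# and the line counter keeps counting from where it was.
open($fh, '<', $path) or die "reopen: $!";
my $line = <$fh>;
my $after_implicit_close = $.;

# Reopen WITH an explicit close: the line counter starts over.
close($fh) or die "close: $!";
open($fh, '<', $path) or die "reopen: $!";
$line = <$fh>;
my $after_explicit_close = $.;
close($fh) or die "close: $!";

print "first pass:     $after_first_pass\n";
print "implicit close: $after_implicit_close\n";
print "explicit close: $after_explicit_close\n";
```

This only demonstrates the documented `$.` difference; it does not by itself explain the "Bad file descriptor" error on STDOUT.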

    There are a few more notes on pipes, but those don't seem to match the opener's situation either. Skimming perlopentut, I didn't see any pointers of interest. On the contrary, it even _seems_ to imply that reopening STDIN/STDOUT without close is fine (my reading of the lack of close() in the Playing with STDIN/STDOUT section). Or is there indeed some hardcoded magic triggered by the fact that it is STDOUT we insist on reading from?

    What am I missing?

      can you please explain in detail the need for the explicit close here?

      I think it has to do with PerlIO in combination with an implementation peculiarity.

      When you compare the straces of both variants, you'll see something like:

      # with explicit close
      close(1) = 0
      open("/tmp/stdout.log", O_RDWR|O_CREAT|O_TRUNC, 0666) = 1
      ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff2ba69a30) = -1 ENOTTY (Inappropriate ioctl for device)
      lseek(1, 0, SEEK_CUR) = 0
      fstat(1, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
      fcntl(1, F_SETFD, 0) = 0

      # without explicit close
      open("/tmp/stdout.log", O_RDWR|O_CREAT|O_TRUNC, 0666) = 4
      ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff023c1390) = -1 ENOTTY (Inappropriate ioctl for device)
      lseek(4, 0, SEEK_CUR) = 0
      fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
      dup2(4, 1) = 1
      close(4) = 0
      fcntl(1, F_SETFD, 0) = 0

      Now, the issue is (I think) that although the dup2 does create a copy of fd 4 as fd 1 at the system level (and in fact also closes the old fd 1), it does not copy the PerlIO part, which is only handled properly when the filehandle is created directly by Perl's open. For this reason, the file descriptor is considered invalid from the PerlIO point of view (hence the "Bad file descriptor" message). This is checked at the beginning of Perl's read using PerlIOValid(f) [1], even before any read system call is made.

      Don't ask (me), however, why the indirect dup2 technique is being used in the first place instead of simply closing the file descriptor before the open...  (Presumably, it did work before the introduction of PerlIO, and might just not have been adapted appropriately since.)
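To illustrate only the descriptor-level half of the dup2 step seen in the strace, here is a minimal sketch that leaves STDOUT alone. It assumes the core POSIX module's `dup2` and `write` wrappers; the file names `old.log` and `new.log` and all variable names are hypothetical. It shows that after dup2 the old descriptor number refers to the new file, while the Perl-level handle object is not told about the change.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempdir);

# Two scratch files in a temporary directory (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
my ($path_old, $path_new) = ("$dir/old.log", "$dir/new.log");

open(my $old, '+>', $path_old) or die "open old: $!";
open(my $new, '+>', $path_new) or die "open new: $!";
my ($fd_old, $fd_new) = (fileno($old), fileno($new));

# dup2(new, old): the kernel closes whatever $fd_old referred to and
# makes that descriptor number a second reference to new.log.
defined(POSIX::dup2($fd_new, $fd_old)) or die "dup2: $!";

# Write through the raw descriptor, bypassing the PerlIO buffers.
defined(POSIX::write($fd_old, "hello\n", 6)) or die "write: $!";

my $size_old = (stat($path_old))[7];
my $size_new = (stat($path_new))[7];

# The Perl handle $old still reports the same descriptor number;
# nothing at the Perl level was updated by the dup2.
printf "fileno(\$old) is still %d\n", fileno($old);
print "old.log size=$size_old, new.log size=$size_new\n";
```

The bytes written via the old descriptor number end up in `new.log`. Whether Perl's own open keeps its PerlIO bookkeeping in sync after its internal dup2 is exactly the open question in this thread, and this sketch does not settle it.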

      ___

      1  see perlio.c:

      #define Perl_PerlIO_or_Base(f, callback, base, failure, args) \
          if (PerlIOValid(f)) { \
              const PerlIO_funcs * const tab = PerlIOBase(f)->tab;\
              if (tab && tab->callback) \
                  return (*tab->callback) args; \
              else \
                  return PerlIOBase_ ## base args; \
          } \
          else \
              SETERRNO(EBADF, SS_IVCHAN); \
          return failure

      ...

      SSize_t
      Perl_PerlIO_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
      {
          Perl_PerlIO_or_Base(f, Read, read, -1, (aTHX_ f, vbuf, count));
      }

        Thanx for the pointer, almut. That dup & perlio scrap is interesting.

        But there must be more to it than that, as I don't see any special treatment for STDOUT in the perlio.c scrap (neither for the numeric FD's 0 to 2):

        I was playing with the scrap below in the meantime.

        I dupped SAVOUT on STDERR instead / simplifying system to printing / using autoflush / opening STDOUT myself to /dev/tty first: no change.

        This however is interesting:

        Renaming the handle (STDOUT <=> ANYTHINGeLSE) acts as a toggle for the problem. Furthermore, without the close, tell on the STDOUT handle at the start prints 19 in the example below (perhaps because the handle was a tty earlier, and something didn't quite catch the change to a plain file without the explicit close?). Any other handle name prints 5, with or without the close.

        So it looks like we have some hard-coded STDOUT-related magic somewhere in the guts of PerlIO or even lower, with STDIN/STDERR probably offering similar peculiarities.

        Given that too much in Perl, especially around <> and stdio, is magic, it's probably a good idea to stay strictly outside any possibly dusty corner that smells even faintly of magic. In this case, that might just be the idea of reusing a special handle and, worse, reading from it.
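One way to stay out of that corner is to never read from STDOUT at all: redirect STDOUT to the file for writing only, restore it, and then read the file back through a separate handle. The sketch below is an alternative approach, not what the thread's script does; the scratch file and the variable names are assumptions, and `/bin/echo` is borrowed from the thread's example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch log file (generated path; the thread used /tmp/stdout.log).
my ($tmp_fh, $log) = tempfile(UNLINK => 1);
close($tmp_fh) or die "close: $!";

# Save the real STDOUT, then point STDOUT at the log, write-only.
open(my $saved, '>&', \*STDOUT) or die "dup STDOUT: $!";
open(STDOUT, '>', $log)         or die "redirect STDOUT: $!";

# The child inherits fd 1 and writes into the log.
system('/bin/echo', 'hello', 'world') == 0 or die "echo failed: $?";

# Put the original STDOUT back.
open(STDOUT, '>&', $saved) or die "restore STDOUT: $!";
close($saved) or die "close saved: $!";

# Read the captured output through an ordinary, separate read handle.
open(my $in, '<', $log) or die "open for reading: $!";
my $captured = do { local $/; <$in> };
close($in) or die "close: $!";

print "stdout=$captured";
```

Because STDOUT is only ever written to here, the question of whether a reopened STDOUT can be read from never comes up.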

        How to classify this behaviour: What doc/code do we still miss? Or is this indeed, say, an easy-to-fix oversight in the documentation? Or is it a somewhat larger actual bug?

        Still wondering (& vowing to step even more cautiously anywhere near STDIO magic),
        My thanx to almut & ikegami for the work below!
        less confused now (& busy scribbling away two new-to-me debugging tips along with a link to their demonstration here)
        Peter

        For this reason, the filedescriptor is considered invalid from the PerlIO point of view

        But it's not, or at least not completely invalid. You can still seek using the handle and print to the handle without problem. For example, adding

        seek(STDOUT, -0, 2) or die $!;
        print STDOUT "abc\n";

        does indeed append "abc\n" to the file.

        It's more like Perl remembers the handle's original mode and doesn't realize it can read from it now.

        Update: I did a bit of Dumping and stracing of my own.

        There is no difference in the IO objects. Like you, I'm now leaning towards a PerlIO problem.

        Seems that the "Bad file descriptor" message originates from Perl, not the system. Perl doesn't even attempt to read from STDOUT.

        $ cat a.pl
        use Devel::Peek;

        open(SAVOUT, '>&STDOUT') or die $!;
        close(STDOUT) if $ARGV[0];
        open(STDOUT, '+>', "/tmp/stdout.log") or die $!;
        Dump(*STDOUT{IO});

        @argv = qw(/bin/echo hello world);
        system(@argv);

        print SAVOUT "before=", tell(STDOUT), "\n";
        seek(STDOUT, 0, 0) or die $!;
        print SAVOUT "after=", tell(STDOUT), "\n";

        while (1) {
            my $rv = read STDOUT, $_, 8192;
            die $! if !defined($rv);
            last unless $_;
            print SAVOUT "stdout=", $_;
        }
        print SAVOUT "at end=", tell(STDOUT), "\n";
        close STDOUT;

        $ diff -u <(strace perl a.pl 0 2>&1) <(strace perl a.pl 1 2>&1) | less
        ...
         lseek(1, 0, SEEK_SET) = 0
         lseek(1, 0, SEEK_CUR) = 0
        -[ code to read locale-dependent version of error message]
        -write(2, "Bad file descriptor at a.pl line"..., 37Bad file descriptor at a.pl line 17.
        -) = 37
        +read(1, "hello world\n", 4096) = 12
        +read(1, "", 4096) = 0
        +close(1) = 0
        -write(3, "before=0\nafter=0\n", 17before=0
        +write(3, "before=0\nafter=0\nstdout=hello wo"..., 46before=0
         after=0
        -) = 17
        +stdout=hello world
        +at end=12
        +) = 46
         close(3) = 0
        -exit_group(9) = ?
        -Process 4028 detached
        +exit_group(0) = ?
        +Process 4032 detached
      No, sorry

      (You asked if I could explain. I can't. I don't know why it behaves as it does.)

Re^2: reading from a file after a seek isn't working for me
by samwyse (Scribe) on Oct 21, 2009 at 21:07 UTC
    Great! Thanks!