in reply to read ARGV ==> read on unopened filehandle

EEEEeeeek! Please don't do that...

I'm sorry ambrus, but you have been badly misled. Even pg has missed the point here. If he had gone on to say that you should only use <> to read from ARGV, he would have been on the mark. The correction you offer is no correction at all. ARGV doesn't keep its magical properties outside the diamond operator (<>) at all. And, if you aren't using those properties, then you are probably better off not using ARGV.

To see what I mean, put your read() in a while loop, save it in script.pl, dump some data in two test files, and try calling your script as script.pl test1.txt test2.txt. You will find that your script never gets to the data in test2.txt.
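Here is a minimal, self-contained sketch of that experiment. It assumes two throwaway temp files standing in for test1.txt and test2.txt (their contents and all variable names are made up for illustration), and it uses eof() to get ARGV opened on the first file, as the original node proposed:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small stand-in files (hypothetical contents).
my @paths;
for my $text ("test1 data\n", "test2 data\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close($fh) or die "cannot close $path: $!";
    push @paths, $path;
}

# Pretend the script was called as: script.pl test1.txt test2.txt
@ARGV = @paths;

# eof() with empty parentheses opens the first file named in @ARGV.
my $at_end = eof();

my $seen = '';
while (1) {
    my $n = read(ARGV, my $buf, 4096);
    die "cannot read ARGV: $!" unless defined $n;
    last if $n == 0;    # EOF on the current file; read() does not advance
    $seen .= $buf;
}

print $seen;    # only the contents of the first file
```

The loop ends as soon as read() hits the end of the first file, because nothing in it asks Perl to open the next name in @ARGV.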

Generally speaking, if you find yourself trying to use a useless call like () = eof(); in order to fix something... there is almost certainly a better way. And, in this case, even if your proposed fix did work without introducing the potential bugs that it does, you really wouldn't be buying much for the price you paid with obfuscated code.

Speaking of obfuscation... using 1<<12 instead of 4096 is, uhm, perhaps a bit misguided. I'm not recommending this, but even 2**12 would be better! Using notation like that makes some sense when you are, for example, enumerating bit flags¹ but otherwise it's needlessly confusing.

Going back to the issue at hand, though, you almost never need to reference ARGV explicitly. You can spell it out as <ARGV> instead of <> for clarity, of course. And doing an explicit close(ARGV); is another reason. You could pass it to a function that used <>, but that should be avoided because it's just a bug waiting to happen when someone goes and changes the implementation of that function to use read() or something. So, the bottom line is, if you want to use ARGV, just use <> and be happy you don't have to write more code. If you really need read(), then you'll have to do a bit more work.
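For what it's worth, the "bit more work" might look something like the sketch below: loop over the file names yourself and open each one explicitly. The helper name read_files_in_chunks and its callback interface are my own invention for illustration, not anything from the original node.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback with each chunk of up to
# $chunk_size bytes from each named file, in order.
sub read_files_in_chunks {
    my ($chunk_size, $callback, @files) = @_;
    for my $file (@files) {
        open(my $fh, '<', $file) or die "cannot open $file: $!";
        binmode($fh);
        while (1) {
            my $n = read($fh, my $buf, $chunk_size);
            die "cannot read $file: $!" unless defined $n;
            last if $n == 0;    # end of this file; move on to the next
            $callback->($buf, $file);
        }
        close($fh) or die "cannot close $file: $!";
    }
    return;
}

# Usage: process every file named on the command line, 4096 bytes at a time.
my $total = 0;
read_files_in_chunks(4096, sub { $total += length $_[0] }, @ARGV);
print "read $total bytes\n";
```

It is more typing than <>, but every file gets read and every failure gets reported against the right file name.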

1. I.e. something like:

use constant F_FOO => 1 << 0;
use constant F_BAR => 1 << 1;
use constant F_BAZ => 1 << 2;
use constant F_QUX => 1 << 3;
And so on... It's okay here because it's obvious what you are doing and why. And the shift contains useful information: the position of the bit associated with the flag.
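A short sketch of how such flags are typically combined and tested, just to show why the bit position is the interesting part. The flag names are the same placeholder names as above and $flags is a made-up variable:

```perl
use strict;
use warnings;

use constant F_FOO => 1 << 0;
use constant F_BAR => 1 << 1;
use constant F_BAZ => 1 << 2;

# Combine flags with bitwise OR, test them with bitwise AND.
my $flags = F_FOO | F_BAZ;

printf "flags = %03b\n", $flags;
print "F_BAR is ", ($flags & F_BAR ? "set" : "not set"), "\n";
print "F_BAZ is ", ($flags & F_BAZ ? "set" : "not set"), "\n";
```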

Update: changed "><" to the intended "<>" in last para.

-sauoq
"My two cents aren't worth a dime.";

Re^2: read ARGV ==> read on unopened filehandle
by pg (Canon) on Sep 18, 2005 at 01:49 UTC
    "To see what I mean, put your read() in a while loop, save it in script.pl, dump some data in two test files, and try calling your script as script.pl test1.txt test2.txt. You will find that your script never gets to the data in test2.txt."

    I know that you clearly said read(), but I still want to point out specifically that this is not an issue with <>, so that nobody gets confused.

    Create two data files, test1.txt:

    test1 line1
    test1 line2
    test1 line3

    And test2.txt:

    test2 line1
    test2 line2
    test2 line3

    Use the same code that I mentioned in my first post in this thread, and run perl -w blah.pl test1.txt test2.txt, and you get:

    test1 line1
    test1 line2
    test1 line3
    test2 line1
    test2 line2
    test2 line3

      Uh, yes pg... that's sort of the point. You are, I think, restating the obvious. Essentially, you are saying that ARGV works correctly when used correctly. That's true, of course. It just isn't exactly what this discussion was about.

      The original node was proposing a solution to using ARGV with read() and the point is that there is no good solution to that. That's not how ARGV should be used. Instead, it should be used with <> only.

      -sauoq
      "My two cents aren't worth a dime.";
      
Re^2: read ARGV ==> read on unopened filehandle
by ambrus (Abbot) on Sep 18, 2005 at 10:16 UTC

    You are saying that I shouldn't use read on ARGV. I think you are right here.

    I was using it as a short notation to avoid an explicit open. This was a one-liner. It didn't matter that it could handle only one file, as the one-liner didn't even have a loop: it called read only once. However, it's quite stupid to do this, as it's much easier to read from STDIN instead and use shell redirection. In a script (not a one-liner), it's of course better to use an explicit open.

    As for using 4096, I disagree with you. It doesn't really matter whether I use 2**12 or 1 << 12, they mean the same to me. (Except that 1 << 12 is a bit more verbose as it often needs to be parenthesized.) However, there's no way I'll use 4096 instead of these, even in a constant definition like sub HEADER_SIZE { 4096 }. The reason is simple: I once wrote a script where I had to read a string of 256 records of 32 bytes each. I wrote 8092 instead of 8192, and I had a very bad time searching for the bug. So, I've learnt that if I want to read four kilobytes, I write 4*1024 or 4<<10 or 1<<12, but never calculate 4096 in my head.

      As a shorter notation, I don't think you gained much, unless I'm missing something, which is entirely possible. From here it looks like you only saved one character, give or take some spacing.

      () = eof(); read ARGV, $b, 1<<12 or die "cannot read ARGV: $!";
      open(FH,shift) or die "cannot read file: $!";read FH, $b, 1<<12;

      ___________
      Eric Hodges
        open(FH,shift) or die "cannot read file: $!";read FH, $b, 1<<12;

        Why did you choose such an enormously long name for your filehandle? ;-)

        If you are really going for shortness though, you can still skip read() and set $/ to a reference. Granted that error handling isn't equivalent, but the following should do for use in a one-liner:

        $/=\4096;$b=<>;
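
To spell out what that one-liner relies on: setting $/ to a reference to an integer makes <> return fixed-size records instead of lines. Below is a small sketch, assuming two throwaway temp files and a record size of 4 bytes so the effect is easy to see (file contents and variable names are made up):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two stand-in input files (hypothetical contents).
my @paths;
for my $text ("0123456789", "abcdef") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close($fh) or die "cannot close $path: $!";
    push @paths, $path;
}

@ARGV = @paths;

my @records;
{
    local $/ = \4;    # read records of at most 4 bytes instead of lines
    while (my $record = <>) {
        push @records, $record;
    }
}

print join('|', @records), "\n";
```

Because this goes through <>, ARGV keeps its magic and the second file is read too. A record never spans two files, so the last record of each file may come back short.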

        -sauoq
        "My two cents aren't worth a dime.";