irDanR has asked for the wisdom of the Perl Monks concerning the following question:

'ello Monks!
I'm still quite the noob, as will soon be evident. In being tasked with modifying an ancient parser I soon noticed that it was using local $/=undef. Which I gathered is generally frowned upon. This is a stripped down version of the original file:

local $/=undef;
open (NOTFOUND, "<transact.QDX");
open (NOTOUT, ">NotOut.txt");
my $lenOfFile = (stat(NOTFOUND))[7];
while ($position <= $lenOfFile - 64) {
    sysread NOTFOUND, $record, 64;
    chomp($record = $record);
    $opcode  = unpack ("H*", substr($record, 0, 1));
    $subcode = unpack ("H*", substr($record, 1, 1));
    $position += 64;
    sysseek NOTFOUND, $position, 0;
    if ($opcode eq "60" and $subcode eq "01") {
        $plu = unpack("H*", substr($record, 2, 7));
        syswrite NOTOUT, $plu . "\t" . $ymd . "\r" . "\n";
    }
}

In my situation there isn't a performance issue to worry about, I just wanted to do things 'right' from the get-go.

while ($position <= $lenOfFile - 64) {
    sysread NOTFOUND, my $record, 64;
    push(@records, chomp($record));
    $position += 64;
    sysseek NOTFOUND, $position, 0;
}
foreach my $currentrecord (@records) {
    $opcode  = unpack ("H*", substr($currentrecord, 0, 1));
    $subcode = unpack ("H*", substr($currentrecord, 1, 1));
    syswrite DEBUG, $opcode . "\t" . $subcode . "\n";
    if ($opcode eq "60" and $subcode eq "01") {
        $plu = unpack("H*", substr($currentrecord, 2, 7));
        syswrite NOTOUT, $plu . "\t" . $ymd . "\n";
    }
}

Using my method I get blank output. The debug file I created shows that the opcode is always 30, and the subcode is just a box in Notepad (I'm only using Windows because I'm forced to at work, promise!). The funny thing is, there is no opcode 30 in the QDX file being read.

My fragile perl reality is crumbling around me. The two blocks of code seem like they should be accomplishing nearly the same thing. Perhaps my understanding of "local $/=undef" isn't well grounded? Is it a peculiarity with QDX files? What is going on?

Re: Avoid using local $/=undef?
by moritz (Cardinal) on Nov 12, 2009 at 23:25 UTC
    I don't see how this all relates to $/. That variable controls what a <FILEHANDLE> or readline(FILEHANDLE) considers a line ending, which you don't use.

    Then you use chomp which also respects $/, but since $/ is undef, it doesn't do anything.
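    A quick, untested illustration of that:

    use strict;
    use warnings;

    my $text = "hello\n";
    {
        local $/;                  # $/ is undef inside this block
        my $removed = chomp $text;
        print "$removed\n";        # prints 0 - chomp removed nothing
    }
    my $removed = chomp $text;     # $/ is back to "\n" out here
    print "$removed\n";            # prints 1 - the trailing newline is gone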


      Thanks for the information, moritz. I suppose this doesn't all directly relate to $/. Forgive my ignorance. I recall reading somewhere that using local $/=undef meant telling perl to read the entire file at once. Perhaps that's true and also has nothing to do with my issue. My hunch must have been way off.

      The second block of code is what I wrote, thinking it better practice to not read/process the entire file at once. Instead I wanted to store the records in @records and then process them one by one.

      Is there any obvious reason that I wouldn't get the same results from either set of code? I know they obviously differ in the way they're processing the file. What I don't understand is why the output of mine is so far off the mark.

      Any help would be greatly appreciated!

        I recall reading somewhere that using local $/=undef meant telling perl to read the entire file at once.

        Almost. It controls what readline (or <$fh>) considers a line. Your code uses sysread, so that's irrelevant.

        "The second block of code is what I wrote, thinking it better practice to not read/process the entire file at once."

        That is usually correct, although there are (in my experience very rare) occasions where reading an entire file into memory allows better processing / data-munging / manipulation, etc. - perhaps to avoid having to seek forwards and backwards through a file, you simply shove the whole thing into memory.

        (Untested:) I don't think there is any significant performance gain or tradeoff with either solution, unless you need to read non-contiguous chunks of data repeatedly or excessively - in that case the all-in-memory option will be better.

        But if you're only ever interested in the current "line" or "$/" chunk of data, then the line-by-line processing method will likely only ever need a couple of MB of memory, whereas holding the entire file in memory will require space proportional to the size of the file and limit you to the amount of RAM you have available.
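        Roughly, the two styles look like this (untested sketch; I'm borrowing the OP's file name just for illustration):

        use strict;
        use warnings;

        # All-in-memory: slurp the whole file inside a small scope.
        my $whole = do {
            local $/;                                      # undef => slurp mode
            open my $fh, '<', 'transact.QDX' or die "open: $!";
            binmode $fh;
            <$fh>;                                         # returns the entire file
        };

        # Line-by-line: only the current chunk is held in memory at a time.
        open my $fh, '<', 'transact.QDX' or die "open: $!";
        while (my $line = <$fh>) {
            # process $line here
        }
        close $fh;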

Re: Avoid using local $/=undef?
by bart (Canon) on Nov 13, 2009 at 00:19 UTC
    I soon noticed that it was using local $/=undef. Which I gathered is generally frowned upon.
    Eh? That's news to me.

    Anyway, I see you're trying to read 64 bytes at a time, so you might think of setting $/ to a ref to a scalar containing the value 64. This will work:

    local $/ = \64;
    as will
    my $length = 64;
    local $/ = \$length;

    After that, you can just read from the file with <HANDLE>, but I recommend calling binmode on the handle first, unless, maybe, you're absolutely sure you will only ever run this on Unix-like systems.

    But I'd still do it anyway.
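    Put together, it could look something like this (untested; field offsets taken from the code in the original post, and the $ymd column left out for brevity):

    use strict;
    use warnings;

    open my $in,  '<', 'transact.QDX' or die "open transact.QDX: $!";
    open my $out, '>', 'NotOut.txt'   or die "open NotOut.txt: $!";
    binmode $in;                            # raw bytes, no CRLF translation

    {
        local $/ = \64;                     # <$in> now returns 64-byte records
        while (my $record = <$in>) {
            last if length($record) < 64;   # ignore a short trailing record
            my $opcode  = unpack "H*", substr($record, 0, 1);
            my $subcode = unpack "H*", substr($record, 1, 1);
            if ($opcode eq "60" and $subcode eq "01") {
                my $plu = unpack "H*", substr($record, 2, 7);
                print {$out} "$plu\r\n";    # the original also appended $ymd here
            }
        }
    }
    close $in;
    close $out;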

      The problem with setting local $/ = \64 is that it is easy to forget the \ (which at least one monk has done in the past).

      read might be a better bet.
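      Something like this, for example (untested):

      open my $in, '<', 'transact.QDX' or die "open: $!";
      binmode $in;

      my $record;
      while (read($in, $record, 64)) {
          last if length($record) < 64;   # short trailing record, if any
          my $opcode  = unpack "H*", substr($record, 0, 1);
          my $subcode = unpack "H*", substr($record, 1, 1);
          # ... same processing as before
      }
      close $in;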
Re: Avoid using local $/=undef?
by desemondo (Hermit) on Nov 13, 2009 at 00:19 UTC
    Based on my limited understanding, local $/ = undef, when used inside an innermost block of the smallest scope possible, is actually perfectly fine. If you just say $/ = undef at global scope, then you can run into problems with other IO - usually when some other modules you're using are doing IO. This is because $/ is a global variable.

    Also be aware that localising a variable doesn't preserve its value into the localised copy; the localised copy is undefined. As this example crudely demonstrates:
    c:\WorkingFolder>perl -e "$/ = 5; print qq{$/\n}; {local $/; print qq{$/\n}; $/ = 2; print qq{$/\n};} print $/"

    Output is:
    5    # initially set $/ to something we can see when we print it
         # (blank line) 2nd print proves the localised $/ is set to undef
    2    # localised copy of $/ set to 2
    5    # inner scope where $/ was localised has ended; the original $/ is reinstated by Perl
Re: Avoid using local $/=undef?
by graff (Chancellor) on Nov 13, 2009 at 03:18 UTC
    I think the only time folks would "frown upon" the use of local $/; (which sets a local copy of the variable to undef without having to actually mention undef) would be if this were done at a point in your script that has global scope, which sort of defeats the purpose of using the "local" keyword.

    (Well, some nit-pickers would also object to doing this when you don't actually use "readline()" or the diamond operator.)

Re: Avoid using local $/=undef?
by markkawika (Monk) on Nov 13, 2009 at 00:39 UTC

    I know I'm a bit late to the party, but I'd like to recommend not using $/ as the variable name. If you say use English; at the beginning of your program, you can change all occurrences of $/ to $INPUT_RECORD_SEPARATOR, which is far more readable.
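    For example (untested; $fh here stands for some filehandle you've already opened):

    use English qw( -no_match_vars );      # avoids the match-variable penalty on older perls

    {
        local $INPUT_RECORD_SEPARATOR;     # exactly the same global as local $/;
        my $whole_file = <$fh>;            # slurp ($fh assumed to be an open handle)
    }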

      Yes! Perl doesn't have enough magic variables to remember! use English; will more than double the number of magical, action-at-a-distance variables a program can have. And think of all the fun you can have with:
      use strict;
      use English;
      my $INPUT_RECORD_SEPARATOR = "%"; # Hi, mom! They tell me to use strict and 'my' my variables!

      Too bad there aren't use French;, use Russian; or use Arabic; options.

Re: Avoid using local $/=undef?
by shmem (Chancellor) on Nov 13, 2009 at 17:50 UTC
    In being tasked with modifying an ancient parser I soon noticed that it was using local $/=undef. Which I gathered is generally frowned upon.

    It is not. It is frowned upon if it is used in a nonsensical way, like in the code snippet you posted.

    Used at top level, local has no effect, since there is no outer scope: you could just as well modify the global $/ directly. And if readline (or the diamond operator <>) isn't used, but sysread is used instead, setting $/ doesn't do anything useful.

    sysread reads bytes. A subsequent chomp on the result only removes $/ (in your case: nothing, since $/ is undef) if $/ happens to be at the end of the resulting string - which will only be the case if your input file has fixed record length.

    Instead of avoiding local $/, use it right - inside a BLOCK, so that after leaving that block (at runtime) $/ gets its previous value back:

    {
        local $/;    # undef by default if localized
        # file slurp here
    }
    # after the block, original $/ is restored

    Try e.g. perl -le "{ local $/ = '@'; print ord $/,': ',$/ } print ord $/,': ',$/"

    Result:

    64: @
    10:

    Read up on chomp. That line

    push(@records, chomp($record));

    doesn't do what you expect: chomp returns the number of characters it removed, not the chomped string, so with $/ set to undef your array @records will contain only zeroes - not the records themselves.
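    What you probably meant is something like (untested):

    chomp $record;             # a no-op while $/ is undef, but harmless
    push @records, $record;    # push the record itself, not chomp's return value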

    See also my/local, space/time (was: Re: The difference between my and local).