in reply to Re: skip junk lines in csv before header
in thread skip junk lines in csv before header

Can you explain how line 17 works? I see <$fh> only reads up to the end of line each time. What causes the do statement to complete?

But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)

Re^3: skip junk lines in csv before header (updated)
by AnomalousMonk (Archbishop) on Jul 28, 2022 at 12:53 UTC
    do { local $/ = "fieldname"; <$fh> }; # read through "fieldname"

    The local $/ = "fieldname"; statement sets the $/ input record separator special variable (see perlvar) to the literal string 'fieldname'. [Struck: This sets "paragraph" read mode. No: see Update below.] This causes <$fh> to read the input stream from the beginning of the file (in this particular case) through the end of the first occurrence of the 'fieldname' string. That single read is the whole body of the do-block, so the block completes as soon as the read returns. Since the field names are apparently unambiguously known, this reads (almost) all the way through the first field name. The $/ variable is assigned local-ly in a do-block, so it returns to its previous value (the "\n" default in this case) at the end of the block.

    my $header_line = "fieldname" . <$fh>; # complete the line

    Since we (apparently) know the header line begins with 'fieldname', assign $header_line this initial value and complete reading the line with another <$fh>. This reads through the end of the line because $/ has been restored to its original newline value.

    Update:

    This sets "paragraph" read mode: ...
    No, this is not "paragraph" (sometimes called "paragrep") read mode; it is normal read mode. Paragraph mode is selected by setting $/ to the empty string. See $/ in perlvar for a discussion of paragraph mode.

    In normal read mode, a file is read until just after the sequence of one or more characters in the $/ special variable is encountered (and including that sequence), or until the end of file if the $/ sequence is never encountered. Usually, $/ is a single "\n" (newline) character, but it can be any non-empty string. Whatever non-zero-length sequence of characters it may be, this is normal read mode.
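
    As a rough illustration of the distinction above, here is a minimal sketch that reads the first record from the same data under four settings of $/. The sample data and the first_record and visible helpers are made up for this example; they are not from the code under discussion.

```perl
use strict;
use warnings;

# Hypothetical sample input: two junk lines, an empty line, then a header row.
my $data = "junk 1\njunk 2\n\nfieldname,other\n1,2\n";

# Return the first record read from a fresh in-memory handle on $data,
# with $/ set to $sep for the duration of this sub only.
sub first_record {
    my ($sep) = @_;
    open my $fh, '<', \$data or die "cannot open in-memory file: $!";
    local $/ = $sep;
    return scalar <$fh>;
}

# Make newlines visible when printing a record.
sub visible {
    my ($text) = @_;
    $text =~ s/\n/\\n/g;
    return $text;
}

my $to_newline   = first_record("\n");         # normal mode, default separator
my $to_fieldname = first_record("fieldname");  # normal mode, multi-character separator
my $paragraph    = first_record("");           # paragraph mode: $/ is the empty string
my $slurp        = first_record(undef);        # slurp mode: undef reads the whole file

print "newline:   ", visible($to_newline),   "\n";
print "fieldname: ", visible($to_fieldname), "\n";
print "paragraph: ", visible($paragraph),    "\n";
print "slurp:     ", visible($slurp),        "\n";
```

    The first two calls are both normal read mode; only the separator differs, and in each case the separator itself is the last thing in the returned record. Only the empty string and undef switch to a different mode.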


    Give a man a fish:  <%-{-{-{-<

Re^3: skip junk lines in csv before header
by LanX (Saint) on Jul 28, 2022 at 13:16 UTC
    The do is actually redundant; a bare block is enough to localize the $INPUT_RECORD_SEPARATOR, aka $/.

    use v5.12;
    use warnings;
    use Data::Dump qw/pp dd/;

    say pp $/; # show default

    say my $x = "HEADER\n" x 3 . "fieldname: BLA BLA\n" . join $/, 1..5;

    open my $fh, '<', \$x;

    # ignore anything prior to "fieldname"
    { local $/ = "fieldname:"; <$fh> };

    say pp $/; # back to default

    say "-" x 10;
    say "fieldname:" . <$fh>; # till end of line

    Output:

    "\n"
    HEADER
    HEADER
    HEADER
    fieldname: BLA BLA
    1
    2
    3
    4
    5
    "\n"
    ----------
    fieldname: BLA BLA

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      It occurs to me that there might be a use for a do-block. If no 'fieldname' string is present in the file, <$fh> will read to the end of the file. This will be a valid read. If the data that was read is returned from a do-block, it can be tested to determine whether a header was actually present. (The data could be returned from a "naked" block, but using a do-block is neater IMHO.)

      Win8 Strawberry 5.8.9.5 (32) Thu 07/28/2022 11:22:44
      C:\@Work\Perl\monks
      >perl
      use strict;
      use warnings;

      use Data::Dump qw(dd);

      use constant FIELDNAME => 'fieldname';

      open my $fh, '<', \<<END or die;
      no valid header present
      value1,value2
      END

      my $got_header = do { local $/ = FIELDNAME; <$fh>; };
      dd $got_header; # for debug

      $got_header =~ m{ \Q${ \FIELDNAME }\E \z }xms or die "no header";

      my $header_line = FIELDNAME . <$fh>; # complete the line

      # and so on...
      ^Z
      "no valid header present\nvalue1,value2\n"
      no header at - line 16, <$fh> line 1.


      Give a man a fish:  <%-{-{-{-<

        Well, yes: if you want to capture the result of the first readline, a do-block is better because of the scoping.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery