in reply to Read offset into other files

As a one-liner:

perl -s -nle"BEGIN{open BIG};($undef,$l)=split;read BIG,$data,$l;print + $data" -- -BIG=bigfile.dat index.dat > outfile.dat

All on one line. Switch the "s to 's on *nix.

Update: A slightly shorter version

perl -sple"BEGIN{open BIG};($undef,$l)=split;read(BIG,$_,$l)" -- -BIG=bigfile.dat index.dat >outfile.dat

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: Read offset into other files
by gio001 (Acolyte) on Oct 18, 2008 at 15:35 UTC
    It is amazing how simple things become if you know how to use the right tools!
    Will this one liner process all the entries in the index file gathering and putting out strings from the bigfile without a need for a loop or a while?
    Please help me understand. Also there is no use of the offset value, am I reading this right, will the read head keep moving automatically inside the bigfile, forward from the last read?
    Thanks again.
      Will this one liner process all the entries in the index file gathering and putting out strings from the bigfile without a need for a loop or a while?

      Yes. The loop is supplied by the -p option on the command line. This tells perl to read the file given as a command line argument (index.dat above) one line at a time into $_, run the -e code for each line, and then print $_ to stdout.

      The code in the -e takes the contents of $_, splits it to extract the length, and reads that number of bytes from the filehandle BIG into $_, overwriting the index line. The result is then (implicitly) printed with a trailing newline due to the -l switch, and redirected to the output file by the command line processor (the shell).

      The -s switch tells perl to parse the command line for options in the form -XXX=yyy. Each one creates a variable named $XXX with the value yyy, so -BIG=bigfile.dat sets $BIG to 'bigfile.dat'.

      The BEGIN{} block uses a one-arg open to open the file for input (using the value of $BIG as the filename and storing the filehandle in the glob *BIG).

      The -- is required to allow Perl to differentiate between the options intended for use by perl itself, and those (-BIG=bigfile.dat) intended for use by the "script" (-e"...") in this case.

      See perlrun for a better explanation of all the switches than I can give.
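
Spelled out as a script, the one-liner does roughly the following. This is only a sketch: the sub name extract_records and the sample data are made up for illustration, and it assumes each index line is "offset length" separated by whitespace, which is what the split in the one-liner implies.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the records of $big_path, taking each record's
# length from the second whitespace-separated field of each index line.
sub extract_records {
    my ($big_path, $index_path) = @_;

    # The BEGIN{open BIG} part: open the big file once, before the loop.
    open my $big, '<', $big_path or die "Cannot open $big_path: $!";
    binmode $big;
    open my $idx, '<', $index_path or die "Cannot open $index_path: $!";

    my @records;
    while (my $line = <$idx>) {               # the loop that -p supplies
        chomp $line;                          # -l strips the line ending
        my (undef, $len) = split ' ', $line;  # offset ignored, length kept
        next unless defined $len;             # skip blank or short lines
        read($big, my $data, $len);           # continues after the last read
        push @records, $data;
    }
    close $idx;
    close $big;
    return @records;
}

# Demo with made-up sample data in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
my ($big_path, $index_path) = ("$dir/bigfile.dat", "$dir/index.dat");

open my $out, '>', $big_path or die "Cannot write $big_path: $!";
print {$out} "AAAABBCCCCCC";
close $out;

open $out, '>', $index_path or die "Cannot write $index_path: $!";
print {$out} "0 4\n4 2\n6 6\n";
close $out;

# -p plus -l prints each record followed by a newline.
print "$_\n" for extract_records($big_path, $index_path);
```

With the sample data this prints AAAA, BB and CCCCCC on separate lines. Unlike the one-liner, the sketch checks that both opens succeeded.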

      Also there is no use of the offset value, am I reading this right, will the read head keep moving automatically inside the bigfile, forward from the last read?

      Exactly. You are essentially just reading the file sequentially. The only extra information you need is how many bytes constitute each record.
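
To see the file position advancing by itself, here is a small sketch that reports tell() before each read. The sub name read_sequentially and the data are invented for the example, and it uses an in-memory filehandle in place of bigfile.dat so that it runs without any files.

```perl
use strict;
use warnings;

# Hypothetical helper: read consecutive records of the given lengths and
# return [start_offset, data] pairs, with the offset taken from tell().
sub read_sequentially {
    my ($fh, @lengths) = @_;
    my @pairs;
    for my $len (@lengths) {
        my $start = tell $fh;        # where this read begins
        read($fh, my $data, $len);   # moves the position forward by $len
        push @pairs, [$start, $data];
    }
    return @pairs;
}

# In-memory "file" standing in for bigfile.dat (sample data only).
my $content = "AAAABBCCCCCC";
open my $big, '<', \$content or die "Cannot open in-memory file: $!";

for my $pair (read_sequentially($big, 4, 2, 6)) {
    my ($start, $data) = @$pair;
    print "offset $start: $data\n";
}

# A seek is only needed when records must be fetched out of order.
seek($big, 4, 0);                    # whence 0 = from the start of the file
read($big, my $again, 2);
print "after seek to 4: $again\n";
```

The offsets reported are 0, 4 and 6, which are exactly the values an index of back-to-back records would contain. That is why the one-liner can ignore them.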

      Perl may have some weird nooks and crannies, but they're all there for very good reasons :)

