PerlMonks  

Inserting a file inside of another

by Anonymous Monk
on Feb 20, 2010 at 06:25 UTC ( [id://824321]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

One thing I like about perl is that even clods like me can manage to get things done. I recently needed a program to put bytes from one file inside of another, and I came up with the following:

#!/usr/bin/perl -w
use strict;

# This program will insert all or part of one
# file inside of another

use List::Util qw(min);

# check the arguments
if (@ARGV < 2 ) {
    die "Usage: insert.pl infile outfile [ startin [ startout [ endin ]]]\n";
}

# open the files
open ( FILOLD, '<', "$ARGV[0]" )
    or die "Can't open $ARGV[0] for reading.\n";
open ( FILNEW, '+<', "$ARGV[1]" )
    or die "Can't open $ARGV[1] for writing.\n";

# see what we want to copy
my $start = 0;
if (@ARGV >= 3 ) {
    seek(FILOLD, $ARGV[2], 0)
        or die "Can't seek to position $ARGV[2] in $ARGV[0].\n";
    $start = $ARGV[2];
}
if (@ARGV >= 4 ) {
    seek(FILNEW, $ARGV[3], 0)
        or die "Can't seek to position $ARGV[3] in $ARGV[1].\n";
}
my $end = (stat(FILOLD))[7];
if (@ARGV == 5 ) {
    $end = min($ARGV[4], $end);
}
my $bytes = $end - $start;

# do it
while ($bytes > 0) {
    read FILOLD, my ($buffer), min(1024, $bytes);
    print FILNEW $buffer;
    $bytes -= 1024;
}

# close the files
close FILOLD;
close FILNEW;

This works fine, but I imagine it could be done much more elegantly, so I thought I would solicit ideas for improvement (especially if I missed any gotchas), or even general Perl programming considerations. I know I could do a lot more as far as testing for valid parameters goes, but this is only for my own use, so I don't care too much about that.

Replies are listed 'Best First'.
Re: Inserting a file inside of another
by ikegami (Patriarch) on Feb 20, 2010 at 07:05 UTC
    • It seems to me it's not inserting at all. Rather, it replaces.
    • You should be using binmode on the handles.
    • 1024 is a tiny buffer. I'd use sysread and 64*1024.
    • @ARGV is used throughout instead of naming the arguments.
    • Globals are needlessly used for the file handles.
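
    The last four points can be sketched roughly as follows. This is only an illustration, not code posted in the thread: the sub name overwrite_range, its argument order, and the use of a length instead of an end offset are assumptions made for the example. Like the original script, it overwrites bytes in the output file rather than inserting them.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

my $BLOCK_SIZE = 64 * 1024;    # larger buffer, as suggested above

# Hypothetical helper: copy $length bytes from $in_file (starting at
# $in_start) over the bytes of $out_file starting at $out_start.
# Returns the number of bytes actually copied.
sub overwrite_range {
    my ($in_file, $out_file, $in_start, $out_start, $length) = @_;
    $in_start  = 0 unless defined $in_start;
    $out_start = 0 unless defined $out_start;

    # Lexical handles instead of global barewords.
    open(my $in,  '<',  $in_file)  or die "Can't open $in_file for reading: $!\n";
    open(my $out, '+<', $out_file) or die "Can't open $out_file for updating: $!\n";
    binmode($in);
    binmode($out);

    # Default: copy everything from $in_start to the end of the input.
    $length = (-s $in) - $in_start unless defined $length;

    # sysseek pairs with sysread/syswrite; it returns undef on failure.
    defined sysseek($in,  $in_start,  SEEK_SET) or die "Can't seek in $in_file: $!\n";
    defined sysseek($out, $out_start, SEEK_SET) or die "Can't seek in $out_file: $!\n";

    my $remaining = $length;
    while ($remaining > 0) {
        my $want = $remaining < $BLOCK_SIZE ? $remaining : $BLOCK_SIZE;
        my $got  = sysread($in, my $buffer, $want);
        die "Read error on $in_file: $!\n" unless defined $got;
        last if $got == 0;    # end of input reached early

        # syswrite may write fewer bytes than requested, so loop.
        my $written = 0;
        while ($written < $got) {
            my $n = syswrite($out, $buffer, $got - $written, $written);
            die "Write error on $out_file: $!\n" unless defined $n;
            $written += $n;
        }
        $remaining -= $got;
    }

    close($in);
    close($out) or die "Can't close $out_file: $!\n";
    return $length - $remaining;
}

# Run as a script only when file arguments are supplied.
if (@ARGV >= 2) {
    my ($in_file, $out_file, $in_start, $out_start, $length) = @ARGV;
    my $copied = overwrite_range($in_file, $out_file, $in_start, $out_start, $length);
    print "Copied $copied bytes\n";
}
```

    Decrementing the counter by the number of bytes actually read, rather than by the requested block size, keeps the returned count accurate if a read comes back short.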

    Regarding the first point, if you were actually interested in a tool that can both insert and replace, I would suggest

    cp_range in_file out_file [ in_spec [out_spec] ]

    in_spec: [in_start],[in_length]
        in_start defaults to 0
        in_length defaults to -1 (rest of file)

    out_spec: [out_start],[out_length]
        out_start defaults to end of file
        out_length defaults to -1 (rest of file)

    Negative values for *_start and *_length behave as per substr.
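
    The difference between inserting and replacing can be illustrated with Perl's four-argument substr, which the spec above refers to. This is a small in-memory sketch, not an implementation of cp_range: the helper name splice_bytes is made up for the example, it requires all four arguments rather than applying the defaults listed above, and a real tool would have to deal with files too large to hold in memory.

```perl
use strict;
use warnings;

# Hypothetical helper (not cp_range itself): put $chunk into $target at
# $out_start, replacing $out_length bytes. A length of 0 is a pure insert.
sub splice_bytes {
    my ($target, $chunk, $out_start, $out_length) = @_;
    # Four-argument substr modifies its first argument in place;
    # $target is a local copy, so the caller's string is untouched.
    substr($target, $out_start, $out_length, $chunk);
    return $target;
}

print splice_bytes("0123456789", "ABC",  4, 0), "\n";   # insert:  0123ABC456789
print splice_bytes("0123456789", "ABC",  4, 3), "\n";   # replace: 0123ABC789
print splice_bytes("0123456789", "ABC", 10, 0), "\n";   # append:  0123456789ABC
print splice_bytes("0123456789", "ABC", -2, 1), "\n";   # negative start counts from the end: 01234567ABC9
```

    With a replaced length equal to the chunk length, the result matches what the original script does; with a length of 0, the existing bytes are shifted right instead of being overwritten.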

      Could you spell out your objection(s) to the use of bareword file handles at the top-level scope of a 20-line standalone script, beyond "needless"?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        If this is just a short throw-away script, there really is no problem with your file handles. However, I do not use them even on small scripts for several reasons.

        First, it is good to get in the habit of using variables for file handles; they will stand you in good stead when you start writing larger and more complex programs.

        Over time a small script may not stay small. The script may just need a slight modification to do something more, and then just another tiny improvement ...

        Finally, I like to reuse code I've already written wherever I can. So small snippets of code tend to get plugged into larger scripts (where it makes sense to do so).

        It's hard to prove a negative (that it's not needed). Perhaps you could offer a counterexample (an example where it is needed) that would disprove what I said? I can't come up with any.

        Update: I misread, or rather forgot what I read by the time I composed my answer. (Still sleepy.)

        I'm sure I don't need to tell you the issues regarding the use of global variables, so perhaps you are suggesting small scripts should use global variables. I wouldn't mind if you explained that, because I don't follow your line of reasoning either.

        Given two possibilities, one that could give you problems, and one equally simple and clear but without the baggage, why would you pick the former?
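
        For readers following the file handle discussion, here is a minimal sketch of what a handle held in a lexical variable allows. The sub names count_lines and first_line are invented for this example and the temporary file exists only for the demonstration; none of it comes from the posts above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# The handle lives in a lexical variable, so it is private to this sub
# and is closed automatically when $fh goes out of scope.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!\n";
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
    }
    return $count;
}

# A lexical handle is an ordinary scalar, so it can be passed to a sub
# like any other argument.
sub first_line {
    my ($fh) = @_;
    my $line = <$fh>;
    chomp $line if defined $line;
    return $line;
}

# Demonstration on a temporary file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close($tmp_fh) or die "Can't close $tmp_path: $!\n";

print count_lines($tmp_path), "\n";    # 3

open(my $in, '<', $tmp_path) or die "Can't open $tmp_path: $!\n";
print first_line($in), "\n";           # alpha
close($in);
```

        Because nothing here depends on a package-wide name, either sub can be pasted into a larger script without risk of clashing with a handle that script already uses.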
