Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

File read and strip

by Andrew_Levenson (Hermit)
on Nov 14, 2006 at 19:40 UTC ( [id://584038]=perlquestion: print w/replies, xml ) Need Help??

Andrew_Levenson has asked for the wisdom of the Perl Monks concerning the following question:

Okay, i'm easing myself back into programming in general; it has been a while.
This little program is supposed to help me recreate a larger program I had written: I want it to read the program file (as a .txt doc) and re-print it in another file, sans the comments that are on each line.
However, nothing is happening and I can't figure out why. Any ideas?
use strict; use warnings; my @array; my $i; my @new; my $nfile; my $file=<>; open FILE, "<$file"; while(<FILE>){ push @array, $_;} close FILE; for(@array){ ($i)= grep { m/^#/ } $_; push @new, $i; } $nfile=<>; open FILE, ">$nfile"; while(<FILE>){ for(@new){ print FILE " \n\n $_ \n\n "; } }

Thanks in advance!

Andrew Levenson

Replies are listed 'Best First'.
Re: File read and strip
by ikegami (Patriarch) on Nov 14, 2006 at 19:58 UTC
    • You are attempting to read (while(<FILE>){) from your output file. You'll never reach the print.
    • push @new, $i; adds undef for the lines grep matched.
    • ($i)= grep { m/^#/ } $_; keeps the comment lines instead of the lines without comments.
    • for(@array){ ($i)= grep { m/^#/ } $_;  push @new, $i; }
      should be
      @new = grep { !m/^#/ } @array;
    • You don't check if your open succeeded
      open(FILE, "<$file") or die("Unable to open input file \"$file\": $!\n");
      open(FILE, ">$nfile") or die("Unable to create output file \"$nfile\": $!\n");
    • You should use the 3-arg open whenever possible.
      open(FILE, '<', $file)
      open(FILE, '>', $nfile)
    • You should use lexicals whenever possible.
      open(my $fh_in, '<', $file) (while (<$fh_in>))
      open(my $fh_out, '>', $nfile) (print $fh_out ...)
    • Shouldn't $file=<>; be $file=@ARGV[0];?
      Shouldn't $nfile=<>; be $nfile=@ARGV[1];?
      Parameters are usually passed as parameters instead of via STDIN. If you do mean to read from STDIN, give the user a prompt, and use <STDIN> instead of <>.
    • No checking to see if $file and $nfile were specified. (They could be undef.)
    • The trailing newline is not removed from $file and $nfile.
    • Why read the whole file into memory?

    Fix: (one-liner) ( Added missing >. Thanks Hofmator. )

    perl -ne "print unless /^#/" infile > outfile

    Fix: (program -- memory efficient)

    use strict; use warnings; die("usage: $0 {infile} {outfile}\n"; if @ARGV != 2; my ($file_in, $file_out) = @ARGV; open(my $fh_in, '<', $file_in) or die("Unable to open input file \"$file_in\": $!\n"); open(my $fh_out, '>', $file_out) or die("Unable to create output file \"$file_out\": $!\n"); while (<$fh_in>) { if (!/^#/) { print $fh_out $_; } }

    Fix: (program -- (more) compact)

    use strict; use warnings; die("usage: $0 {filein} {fileout}\n"; if @ARGV != 2; my ($file_in, $file_out) = @ARGV; open(my $fh_in, '<', $file_in) or die("Unable to open input file \"$file_in\": $!\n"); open(my $fh_out, '>', $file_out) or die("Unable to create output file \"$file_out\": $!\n"); print $fh_out (grep !/^#/, <$fh_in>);
      Fix: (one-liner)

      Should be perl -ne "print unless /^#/" infile > outfile. Apart from that all very sound advice, ikegami++!

      -- Hofmator

      Code written by Hofmator and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: File read and strip
by Joost (Canon) on Nov 14, 2006 at 19:50 UTC
    my $file=<>;
    Will set $file to the first line (including the newline) read from either STDIN or the file specified on the command line. This is almost certainly not what you want.

    You probably want

    my $file = shift @ARGV; # get first argument my $nfile = shift @ARGV; # get next argument
    Which, outside of subroutines can be abbreviated to
    my $file = shift; my $nfile = shift;
    or you can use
    my ($file,$nfile) = @ARGV;
    By the way, it's always a good idea to check open() for errors:
    open FILE,">$file" or die "Can't open $file: $!"
    update: there are other errors in your code, too. Note that grep() returns a LIST, for one.

      Is returning a list bad? What does that mean in relation to what i'm trying to do?

      (I don't mean to be obnoxious, i'm trying to make it easier by not just saying "I don't get it.")
      I'm sorry, but... what?
      The files are user-input, so the $file=<> has worked fine for me in the past.

      Sorry, but I have no clue what else you said besides that. :(
      What does shift @ARGV do? What is it calling from? what does the $! in the die function do?

      Sorry if these are stupid questions, and thanks.
        shift @ARGV removes the first element from @ARGV and returns it. @ARGV contains all the arguments provided to your program.

        I was asssuming you wanted to call your program like this:

        my_program inputfile outputfile
        anyway,

        my $whatever = <>;
        will read a single line from the file(s) as specified in @ARGV. in other words, $whatever will not be the filename but the next (or first) line of the file you passed as an argument. if you don't pass any arguments <> will read from STDIN, which means it will usually wait for you to enter a line. See perlop (quoted below:) Note that lines read usually end in a newline character ("\n") while filenames usually do not.
        while (<>) { ... # code for each line } is equivalent to the following Perl-like pseudo code: unshift(@ARGV, ’-’) unless @ARGV; while ($ARGV = shift) { open(ARGV, $ARGV); while (<ARGV>) { ... # code for each line } }

        $! will contain the last filesystem error, which will make it a lot easier to see what's going wrong if you pass the wrong filename to open(), or don't have permissions etc. See also perlvar.

        The files are user-input, so the $file=<> has worked fine for me in the past.
        I think he was assuming that the file name was coming from the command line. @ARGV is the array of command line arguments, so "shift @ARGV" removes and returns the first argument. If you want the user to enter a file name, then it would be a good idea to prompt for it, e.g.:
        print "Enter filename: "; chomp(my $file = <STDIN>);
        what does the $! in the die function do?
        See perlvar. It is the error message for system/library calls. You want to check to see if open fails, and if it does, see why it failed.
Re: File read and strip
by rhesa (Vicar) on Nov 14, 2006 at 20:05 UTC
    Your biggest problem is in the last while(<FILE>) loop: you're trying to read from a file that you just opened for writing.

    The other problem is with the grep: you're only collecting lines that start with a #, although you say you want to do the opposite.

    You seem to be doing a lot of unnecessary looping too. I suggest the following as a more streamlined alternative:

    use strict; use warnings; print "Input file name? "; my $name_in = <>; chomp $name_in; print "output file name? "; my $name_out = <>; chomp $name_out; open my $in, '<', $name_in or die "error opening $name_in: $!"; open my $out , '>', $name_out or die "error opening $name_out: $!"; print $out grep { !/^#/ } <$in>;
Re: File read and strip
by GrandFather (Saint) on Nov 14, 2006 at 20:04 UTC

    First thing I notice is I didn't notice $i being declared. Stacking up lines like that hides stuff. In this case it's even worse because $i should be declared in a much small scope - the for loop.

    Next thing is that generally three prarameter open is prefered and open should always be checked:

    open FILE, '<', $file or die "Failed to open $file: $!";

    A side issue is:

    ($i)= grep { m/^#/ } $_;

    The best you can hope for from that is $i will be undef if $_ starts with a #. That undef gets pushed into @new however. Probably what you really wanted was something like:

    ! m/^#/ and push @new, $_ for @array;

    However your real issue is the while loop. You've just created FILE. It's empty. How many times it while (<FILE>) going to iterate? That is, how many of the 0 lines available in the newly created file is it going to read?

    The while loop is not needed at all. The for loop inside it could be replaced by:

    print FILE " \n\n $_ \n\n " for @new;

    or remain as it is.


    DWIM is Perl's answer to Gödel
Re: File read and strip
by markh (Scribe) on Nov 14, 2006 at 19:56 UTC
    I think the problem is in your section of:
    for(@array){ ($i)= grep { m/^#/ } $_; push @new, $i; }
    What about replacing that with:
    foreach my $line (@array) { next if ($line =~ /^#/); push @new, $line; }


    Mine might be a bit less compact, but it is much more readable and easier to see what it happening there.

    Also it appears to me that when you are writing back to your file, you are putting in 4 extra linebreaks. Is that intentional?
      The line breaks are intentional to make the final file more readable when I format it.
      What does using the {parameter?} "my $line" before the (@array) do?
      And, let's see if I have any idea what's going on, the second line skips a line if it starts with a #?

      Thanks!
        The line:
        foreach my $line (@array) {
        assigns the value from the next item in @array to the variable $line. I find this makes it eaiser to see what you are doing, rather than using $_, which I find non-intuitive.

        Yes, you are correct, that next line of code skips anything that starts with a #, so that only lines from your file that are not comments are inserted into the array that you are intending to send to the output file.
Re: File read and strip
by johngg (Canon) on Nov 14, 2006 at 20:48 UTC
    A minor point to bear in mind is that the comments you are removing might be indented to align with indented code. The regular expression m{^#} will only match comments that are right at the beginning of the line because the caret ^ is the metacharacter used in regexen to anchor the match to the beginning of a string. (To anchor a match to the end of a string use the dollar $). This modified pattern copes with optional spaces before the hash

    m{^\s*#}

    The \s means a white-space character and the * iterator means zero or more of whatever preceded it. So the above pattern means at the beginning of the string (your line in this context) match zero or more white-space characters followed by a hash.

    I hope this is of use.

    Cheers,

    JohnGG

Re: File read and strip
by jonadab (Parson) on Nov 15, 2006 at 01:00 UTC

    Don't mind me, I'm just in a bit of a weird mood, so pardon me while I have a little twisted fun at the expense of your code...

    Ah. Hopefully that's out of my system now.

    So, then, on to fixing it... First off, the print statement is never happening because <FILE> does not return anything very useful when the file is opened for output. Second, the test in grep is probably backwards from what you want.

    use strict; use warnings; my $file=<>; open FILE, '<', $file; my @new = map { grep { not m/^#/ } } <FILE>; close FILE; my $nfile=<>; open FILE, '>', $nfile; print FILE map{" \n\n $_ \n\n "}@new for 1, <FILE>;

    This is still not optimal. For one thing, the first map (corresponding to your first while loop in the original code) is obviously superfluous. (Your while loop in the original wasn't necessary either, but it's more obvious to me now.) Second, there's still the weird double-indirection surrounding the filenames, which others have commented about. Third, I prepended a true value to force the print statement to occur once, but the file read there is still nonsense. Finally, there are still more variables than are strictly necessary; some of them are only used once -- well, twice: once in an assignment, and once again on the very next line to use the value; this amounts basically to using them once to turn one line into two, and that isn't necessary...

    use strict; use warnings; open FILE, '<', scalar <>; my @new = grep { not m/^#/ } <FILE>; close FILE; open FILE, '>', scalar <>; print FILE map{" \n\n $_ \n\n "}@new;

    If you're willing to add a second filehandle you can get rid of the array, and the whole thing could be reduced to more-or-less a one-liner...

    use strict; use warnings; open INFILE, '<', scalar <>; open OUTFILE, '>', scalar <>; print OUTFILE map{" \n\n $_ \n\n "} grep { not m/^#/ } <INFILE>;

    Look, ma, no variables to declare. If you're willing to use standard input and output and let the user specify the filenames with redirection operators, it becomes a veritable one-liner:

    print map{" \n\n $_ \n\n "}grep{not m/^#/}<STDIN>;

    strict and warnings seem unnecessary on anything that short. Someone will probably come along in a moment and explain how to use command-line flags to shorten it even further.


    Sanity? Oh, yeah, I've got all kinds of sanity. In fact, I've developed whole new kinds of sanity. You can just call me "Mister Sanity". Why, I've got so much sanity it's driving me crazy.
Re: File read and strip
by Jasper (Chaplain) on Nov 15, 2006 at 10:44 UTC
    I think the main problem with the OPs code was that he was doing the exact opposite of what he wanted, but no one really seems to have pointed that out explicitly, it seems to me.
    for(@array){ ($i)= grep { m/^#/ } $_; push @new, $i; }
    dumps all the comments in @new, not the lines without comments. The corrected solution has been shown, of course.

    Also, the OP is reading the file line by line in one loop, then extracting from that file into an array in another loop, then looping over that array to print to the outfile. This can all be done in one loop (like the one liner)
    while (<INFILE>) { next if /^#/; print OUTFILE $_; }
    I'd also add in a whitespace element to the regex, because I rarely start comments at the beginning of the line - I usually indent them with the rest of the code, and I doubt I'm the only one. so /^\s+#/

    Jasper

    also:
    perl -ne '/^#/ or print' perl -pe '$_ = "" if /^#/' perl -pe '$_ x= !/^#/'

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://584038]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others sharing their wisdom with the Monastery: (4)
As of 2024-04-19 13:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found