Andrew_Levenson has asked for the wisdom of the Perl Monks concerning the following question:
use strict; use warnings; my @array; my $i; my @new; my $nfile; my $file=<>; open FILE, "<$file"; while(<FILE>){ push @array, $_;} close FILE; for(@array){ ($i)= grep { m/^#/ } $_; push @new, $i; } $nfile=<>; open FILE, ">$nfile"; while(<FILE>){ for(@new){ print FILE " \n\n $_ \n\n "; } }
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: File read and strip
by ikegami (Patriarch) on Nov 14, 2006 at 19:58 UTC | |
Fix: (one-liner) ( Added missing >. Thanks Hofmator. )
Fix: (program -- memory efficient)
Fix: (program -- (more) compact)
| [reply] [d/l] [select] |
by Hofmator (Curate) on Nov 15, 2006 at 07:18 UTC | |
Fix: (one-liner) Should be perl -ne "print unless /^#/" infile > outfile. Apart from that all very sound advice, ikegami++! -- Hofmator Code written by Hofmator and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk. | [reply] [d/l] |
|
Re: File read and strip
by Joost (Canon) on Nov 14, 2006 at 19:50 UTC | |
Will set $file to the first line (including the newline) read from either STDIN or the file specified on the command line. This is almost certainly not what you want. You probably want Which, outside of subroutines can be abbreviated to or you can use By the way, it's always a good idea to check open() for errors: update: there are other errors in your code, too. Note that grep() returns a LIST, for one.
| [reply] [d/l] [select] |
by Andrew_Levenson (Hermit) on Nov 14, 2006 at 19:56 UTC | |
(I don't mean to be obnoxious, i'm trying to make it easier by not just saying "I don't get it.") | [reply] |
by Joost (Canon) on Nov 14, 2006 at 20:11 UTC | |
now, @new_list contains every value in @list that matches /regular expression/. Your code: Will set $i to undefined if $_ doesn't start with a # character, and then push it onto @new. In other words you'll have undefined entries in @new where you probably wanted to skip those lines. You might want to do some reading in the excellent perl documentation updated: fixed link | [reply] [d/l] [select] |
by Andrew_Levenson (Hermit) on Nov 14, 2006 at 19:54 UTC | |
The files are user-input, so the $file=<> has worked fine for me in the past. Sorry, but I have no clue what else you said besides that. :( What does shift @ARGV do? What is it calling from? what does the $! in the die function do? Sorry if these are stupid questions, and thanks. | [reply] [d/l] [select] |
by Joost (Canon) on Nov 14, 2006 at 20:03 UTC | |
I was asssuming you wanted to call your program like this: anyway,
will read a single line from the file(s) as specified in @ARGV. in other words, $whatever will not be the filename but the next (or first) line of the file you passed as an argument. if you don't pass any arguments <> will read from STDIN, which means it will usually wait for you to enter a line. See perlop (quoted below:) Note that lines read usually end in a newline character ("\n") while filenames usually do not.
$! will contain the last filesystem error, which will make it a lot easier to see what's going wrong if you pass the wrong filename to open(), or don't have permissions etc. See also perlvar.
| [reply] [d/l] [select] |
by runrig (Abbot) on Nov 14, 2006 at 20:12 UTC | |
The files are user-input, so the $file=<> has worked fine for me in the past.I think he was assuming that the file name was coming from the command line. @ARGV is the array of command line arguments, so "shift @ARGV" removes and returns the first argument. If you want the user to enter a file name, then it would be a good idea to prompt for it, e.g.:
what does the $! in the die function do?See perlvar. It is the error message for system/library calls. You want to check to see if open fails, and if it does, see why it failed. | [reply] [d/l] |
|
Re: File read and strip
by rhesa (Vicar) on Nov 14, 2006 at 20:05 UTC | |
The other problem is with the grep: you're only collecting lines that start with a #, although you say you want to do the opposite. You seem to be doing a lot of unnecessary looping too. I suggest the following as a more streamlined alternative:
| [reply] [d/l] [select] |
|
Re: File read and strip
by GrandFather (Saint) on Nov 14, 2006 at 20:04 UTC | |
First thing I notice is I didn't notice $i being declared. Stacking up lines like that hides stuff. In this case it's even worse because $i should be declared in a much small scope - the for loop. Next thing is that generally three prarameter open is prefered and open should always be checked:
A side issue is:
The best you can hope for from that is $i will be undef if $_ starts with a #. That undef gets pushed into @new however. Probably what you really wanted was something like:
However your real issue is the while loop. You've just created FILE. It's empty. How many times it while (<FILE>) going to iterate? That is, how many of the 0 lines available in the newly created file is it going to read? The while loop is not needed at all. The for loop inside it could be replaced by:
or remain as it is. DWIM is Perl's answer to Gödel | [reply] [d/l] [select] |
|
Re: File read and strip
by markh (Scribe) on Nov 14, 2006 at 19:56 UTC | |
What about replacing that with:
Mine might be a bit less compact, but it is much more readable and easier to see what it happening there. Also it appears to me that when you are writing back to your file, you are putting in 4 extra linebreaks. Is that intentional? | [reply] [d/l] [select] |
by Andrew_Levenson (Hermit) on Nov 14, 2006 at 20:00 UTC | |
What does using the {parameter?} "my $line" before the (@array) do? And, let's see if I have any idea what's going on, the second line skips a line if it starts with a #? Thanks! | [reply] |
by markh (Scribe) on Nov 14, 2006 at 20:30 UTC | |
assigns the value from the next item in @array to the variable $line. I find this makes it eaiser to see what you are doing, rather than using $_, which I find non-intuitive. Yes, you are correct, that next line of code skips anything that starts with a #, so that only lines from your file that are not comments are inserted into the array that you are intending to send to the output file. | [reply] [d/l] |
|
Re: File read and strip
by johngg (Canon) on Nov 14, 2006 at 20:48 UTC | |
The \s means a white-space character and the * iterator means zero or more of whatever preceded it. So the above pattern means at the beginning of the string (your line in this context) match zero or more white-space characters followed by a hash. I hope this is of use. Cheers, JohnGG | [reply] [d/l] [select] |
|
Re: File read and strip
by jonadab (Parson) on Nov 15, 2006 at 01:00 UTC | |
Don't mind me, I'm just in a bit of a weird mood, so pardon me while I have a little twisted fun at the expense of your code... Read more... (465 Bytes)
Ah. Hopefully that's out of my system now. So, then, on to fixing it... First off, the print statement is never happening because <FILE> does not return anything very useful when the file is opened for output. Second, the test in grep is probably backwards from what you want.
This is still not optimal. For one thing, the first map (corresponding to your first while loop in the original code) is obviously superfluous. (Your while loop in the original wasn't necessary either, but it's more obvious to me now.) Second, there's still the weird double-indirection surrounding the filenames, which others have commented about. Third, I prepended a true value to force the print statement to occur once, but the file read there is still nonsense. Finally, there are still more variables than are strictly necessary; some of them are only used once -- well, twice: once in an assignment, and once again on the very next line to use the value; this amounts basically to using them once to turn one line into two, and that isn't necessary...
If you're willing to add a second filehandle you can get rid of the array, and the whole thing could be reduced to more-or-less a one-liner...
Look, ma, no variables to declare. If you're willing to use standard input and output and let the user specify the filenames with redirection operators, it becomes a veritable one-liner: print map{" \n\n $_ \n\n "}grep{not m/^#/}<STDIN>;strict and warnings seem unnecessary on anything that short. Someone will probably come along in a moment and explain how to use command-line flags to shorten it even further. Sanity? Oh, yeah, I've got all kinds of sanity. In fact, I've developed whole new kinds of sanity. You can just call me "Mister Sanity". Why, I've got so much sanity it's driving me crazy. | [reply] [d/l] [select] |
|
Re: File read and strip
by Jasper (Chaplain) on Nov 15, 2006 at 10:44 UTC | |
dumps all the comments in @new, not the lines without comments. The corrected solution has been shown, of course. Also, the OP is reading the file line by line in one loop, then extracting from that file into an array in another loop, then looping over that array to print to the outfile. This can all be done in one loop (like the one liner) I'd also add in a whitespace element to the regex, because I rarely start comments at the beginning of the line - I usually indent them with the rest of the code, and I doubt I'm the only one. so /^\s+#/ Jasper also:
| [reply] [d/l] [select] |