Re: Support for hash comments on a line
by Fastolfe (Vicar) on Nov 01, 2001 at 19:56 UTC
This will break if you use a # in your "real" data, but I generally would use something like this:
while (<>) {
chomp;
s/\s*#.*//; # Strip off whitespace and trailing comments
next if /^\s*$/; # Skip blank lines
&process_line_of_input($_);
}
|
When I run your code on my file:
hostname3 # this is a comment for hostname3
I get:
Use of uninitialized value in substitution (s///) at ./comment line 11, <FH> line 1.
My full code is:
#!/usr/bin/perl -w
use strict;
open(FH, "comments.txt")
or die "cant open file";
while (my $line = <FH>) {
chomp $line;
next if $line =~ /^$/;
$line = s/\s*\#.*//;
print "|$line|\n";
}
close(FH);
humbly -c
|
You want to use =~ instead of = when doing regexp substitutions on a variable. You're basically doing this:
$line = ($_ =~ s/\s*\#.*//);
Since $_ is undefined here, you get that warning.
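For reference, here is a minimal sketch of the same loop with the binding operator in place. The sub name strip_comment and the in-memory sample lines are made up for illustration (the original reads from comments.txt), so treat it as one possible shape rather than the definitive fix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove a trailing "# ..." comment and the
# whitespace in front of it. The =~ binds s/// to $line; with a
# plain = the substitution would run against $_ instead.
sub strip_comment {
    my ($line) = @_;
    $line =~ s/\s*\#.*//;
    return $line;
}

# Sample input standing in for comments.txt (an assumption).
my @lines = (
    "hostname3 # this is a comment for hostname3\n",
    "\n",
    "hostname4\n",
);

for my $line (@lines) {
    chomp $line;
    next if $line =~ /^$/;    # skip empty lines, as in the original
    print "|", strip_comment($line), "|\n";
}
```

With the sample lines above this prints |hostname3| and |hostname4|.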
Re: Support for hash comments on a line
by jlongino (Parson) on Nov 01, 2001 at 20:14 UTC
Another way to do it is to take advantage of prematch ($`):
use strict;
while (<DATA>) {
chomp;
$_ = $` if /#/;
print "$_\n" if $_;
}
__DATA__
### Hello
99:88:77
100:11# This is a comment
abc:def ### Comments also
999
--Jim
Update: shortened conditional after chomp;
|
Just a thought: using the prematch and postmatch variables at all will slow down every regular expression, as the engine has to save them for every regex in your program.
-Lee
"To be civilized is to deny one's nature."
|
substr($string, 0, $-[0])
For example:
#!/usr/bin/perl -wT
use strict;
while (<DATA>) {
chomp;
$_ = substr($_, 0, $-[0]) if /#/;
print "$_\n" if $_;
}
__DATA__
### Hello
99:88:77
100:11# This is a comment
abc:def ### Comments also
999
-Blake
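On Perl 5.10 or later there is one more option in the same spirit: the /p modifier, which makes ${^PREMATCH} available for that one match only. The sketch below assumes such a Perl; the helper name before_hash is hypothetical and not part of the posts above.

```perl
use strict;
use warnings;
use 5.010;    # /p and ${^PREMATCH} need Perl 5.10 or later

# Hypothetical helper: return the text before the first '#',
# or the whole string when there is no '#'.
sub before_hash {
    my ($text) = @_;
    return $text =~ /#/p ? ${^PREMATCH} : $text;
}

for my $line ('### Hello', '99:88:77', '100:11# This is a comment') {
    my $data = before_hash($line);
    # length() rather than a plain truth test, so a line holding
    # just "0" would still be printed.
    print "$data\n" if length $data;
}
```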
Re: Support for hash comments on a line
by buckaduck (Chaplain) on Nov 01, 2001 at 23:14 UTC
How about a non-regex solution?
($line) = split /#/, $line;
buckaduck
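One detail worth knowing about the split approach: splitting an empty string returns no fields at all, so an empty input line would leave $line undefined. A hedged sketch that guards against this, using a hypothetical helper name and a LIMIT of 2 so that split stops at the first #:

```perl
use strict;
use warnings;

# Hypothetical helper built on the split idea. LIMIT 2 means the
# string is cut at the first '#' only; the defined check covers
# the empty-string case, where split returns an empty list.
sub strip_with_split {
    my ($line) = @_;
    my ($data) = split /#/, $line, 2;
    return defined $data ? $data : '';
}

for my $line ('100:11# This is a comment', '', '999') {
    print '[', strip_with_split($line), "]\n";
}
```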
Re: Support for hash comments on a line
by sevensven (Pilgrim) on Nov 01, 2001 at 23:18 UTC
Your second code was close, but you've forgotten that a regular expression like .* is greedy: it will match everything it can, and indeed a .* can match everything :-)
In your second example ($line =~ s/(.*)\#.*/$1/g;) you should change (.*) to (.*?) and it will work as you wanted.
Adding the ? makes the preceding .* match as little as possible and leaves the rest of the input to the rest of the regexp.
This is explained in greater detail in perlre (Perl Regular Expressions).
HTH, going back to building perl with thread support.
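The difference only shows up when a line contains more than one #. A small sketch comparing the two forms; the sub names are invented for the comparison and the sample line is borrowed from elsewhere in this thread:

```perl
use strict;
use warnings;

# Greedy: (.*) grabs as much as it can, so the match backs off only
# as far as the LAST '#' on the line.
sub strip_greedy {
    my ($line) = @_;
    $line =~ s/(.*)\#.*/$1/g;
    return $line;
}

# Non-greedy: (.*?) grabs as little as it can, so the match stops
# at the FIRST '#' on the line.
sub strip_lazy {
    my ($line) = @_;
    $line =~ s/(.*?)\#.*/$1/g;
    return $line;
}

my $line = 'abc:def ### Comments also';
print '[', strip_greedy($line), "]\n";
print '[', strip_lazy($line),   "]\n";
```

The greedy version leaves two of the three hash marks behind, while the non-greedy version cuts at the first one.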
Re: Support for hash comments on a line
by Fletch (Bishop) on Nov 01, 2001 at 20:48 UTC
while( <FOO> ) {
s/\s*#.*$//;
next if /^\s*$/;
...
}
For anything fancier than that, you might want to look into
Parse::RecDescent and build a smarter parser, or
maybe use AppConfig. Or maybe go to an XML based
format and let XML::Parser worry about all of the
parsing and comments and what not.
Re: Support for hash comments on a line
by mr_mischief (Monsignor) on Nov 02, 2001 at 03:25 UTC
This will get rid of pretty much all trailing comments, except those which contain quote characters. This
strikes a balance with not trying to strip hash marks
that are in quotes as data.
while( <> ) {
s/\s+#[^'"]+\z//;
}
So, if you can guarantee that no hash marks are data
except in quotes and that there are no quotes in your
comments, this should be a simple way to do it that
makes reasonable accommodations for using the hash mark
in data. If you have any other ways to wrap data in
quote-like characters, just add them to the negated
character class and keep them out of the comments.
Update: Fixed a typo. 2002/05/02
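To make the trade-off concrete, here is a small sketch that runs the same substitution over a few chomped sample lines. The helper name and the sample data are made up. Note that the pattern also requires whitespace in front of the hash mark, so a comment glued directly to the data is left alone.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
sub strip_trailing_comment {
    my ($line) = @_;
    $line =~ s/\s+#[^'"]+\z//;
    return $line;
}

my @samples = (
    'color = "#fff" # background',        # comment removed, data kept
    q{host1 # don't strip me},            # quote in comment: untouched
    '100:11# no space before the hash',   # no leading space: untouched
);

print '[', strip_trailing_comment($_), "]\n" for @samples;
```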
Re: Support for hash comments on a line
by FoxtrotUniform (Prior) on Nov 01, 2001 at 22:40 UTC
Note: untested code follows
If you know that #s won't appear in your data, you can
write:
$line =~ s/^([^#]*)#.*$/$1/;
If #s can appear in quoted strings, life gets a little
more complex:
$line =~ s/^
( # grab this stuff in $1
(
[^#"]* # prefix of non-#s, non-"s
(\" # start of string
[^\"]* # content of string
\")? # end of string
[^#"]* # suffix
)* # grab many prefix-string-suffixes
)
\# # start of comment
.*
$
/$1/x;
(At this point, you may be better off using one of the
Text modules, and if the input's really hairy,
Parse::RecDescent.)
Update: Er, that second regex is
s/.../$1/x;, not s/.../x;. Doh!
--
:wq
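Since the code above is flagged as untested, a quick harness may help. This is a sketch under a few assumptions: the sub name is hypothetical, the inner groups are written as non-capturing (?: ) groups (a small change from the post), and the inputs are short, chomped lines. Whitespace in front of the comment is kept, because the pattern only removes the # and what follows it.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the quoted-string-aware substitution.
sub strip_outside_quotes {
    my ($line) = @_;
    $line =~ s/^
        (                      # capture everything before the comment
          (?:
            [^#"]*             # text with no hash and no double quote
            (?: " [^"]* " )?   # optionally one complete quoted string
            [^#"]*             # more plain text
          )*
        )
        \#                     # first hash outside a quoted string
        .*
        $
    /$1/x;
    return $line;
}

for my $line ('name = "a#b" # comment', '100:11# This is a comment', '999') {
    print '[', strip_outside_quotes($line), "]\n";
}
```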
Re: Support for hash comments on a line (Why use a regex at all?)
by demerphq (Chancellor) on Nov 02, 2001 at 18:20 UTC
Not really sure why everyone posted regex solutions here; use substr and index. Much faster.
while (<DATA>) {
if ( ( my $p = index( $_, "#" ) ) > -1 ) { substr( $_, $p, -1, "" ) }
next if /^\s*$/;
print;
}
__DATA__
#comment
this is test#comment
#comment
this is test #comment
this is test #comment
this is test #comment
A regex in any form (split, s/// or m//) is overkill for this task. (Assuming of course that # can't appear in the real data)
Yves / DeMerphq
--
Have you registered your Name Space?
|
Unless you're dealing with a large number of lines here, the performance penalty of going with a regex is, in my opinion, inconsequential compared with the added readability of code that uses it. No one skimming the code above is going to have the slightest idea what it does without studying it.
Though don't get me wrong, if your requirements are such that you're going to be doing this sort of processing on a lot of data, and performance is a factor, this is one of many optimizations that can be made to squeeze speed out of the algorithm.
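One way to get both the speed of index/substr and readable calling code is to hide the mechanics behind a named sub. The sketch below is only an illustration: the sub names are invented, it returns the stripped string instead of editing $_ in place, and, like the posts above, it assumes # never appears in real data. The regex version is included just to show that the two agree.

```perl
use strict;
use warnings;

# Hypothetical helper: cut the string at the first '#' using
# index and substr, with no regex involved.
sub strip_comment_index {
    my ($line) = @_;
    my $pos = index($line, '#');
    return $pos >= 0 ? substr($line, 0, $pos) : $line;
}

# Regex counterpart, for comparison only. The s modifier lets .
# match a newline, so both subs drop everything after the '#'.
sub strip_comment_regex {
    my ($line) = @_;
    $line =~ s/#.*//s;
    return $line;
}

for my $line ('#comment', 'this is test#comment', 'this is test  #comment', 'no comment') {
    my $result = strip_comment_index($line);
    my $agree  = $result eq strip_comment_regex($line) ? 'same' : 'DIFFERENT';
    print "[$result] $agree\n";
}
```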