andreas1234567 has asked for the wisdom of the Perl Monks concerning the following question:

In the never-ending quest for enlightenment, I came across this problem when converting an old log parser. The file two-lines-with-one-slash-each.txt was the smallest data sample I could reduce the original log to while preserving this behaviour.

Could someone please explain why this is happening? Is it a bug?
two-lines-with-one-slash-each.txt
/
/
html.entities.pl
use strict;
use warnings;
use HTML::Entities;
print encode_entities(`cat two-lines-with-one-slash-each.txt`);
__END__
Runtime:
$ perl html.entities.pl
Unmatched [ in regex; marked by <-- HERE in m/([ <-- HERE / at (eval 1) line 2.
while trying to turn range: "/ " into code: sub {$_[0] =~ s/([/ ])/$char2entity{$1} || num_entity($1)/ge; }
at /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi/HTML/Entities.pm line 428.
Environment:
$ perl -v | head -2
This is perl, v5.8.5 built for i386-linux-thread-multi
$ uname -r
2.6.9-42.0.10.EL
cpan[2]> install HTML::Entities
HTML::Entities is up to date (1.35).


Andreas
--

Re: HTML::Entities - Unmatched [ in regex
by tinita (Parson) on May 30, 2007 at 14:36 UTC
    try
    print encode_entities(scalar `cat two-lines-with-one-slash-each.txt`);

    you're executing qx() in list context. the second argument to encode_entities is a character range, though.
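
To make the context issue concrete, here is a minimal sketch rather than the module's actual code. It assumes a POSIX shell with printf, which stands in for cat on the sample file. count_args is a hypothetical helper, and the eval only imitates the code shown in the error message. In list context the backticks return one element per output line, so the second line ("/" plus newline) arrives as the second argument, and its "/" ends the s/// pattern early, leaving an unmatched [.

```perl
use strict;
use warnings;

# Stand-in for `cat two-lines-with-one-slash-each.txt`.
# Assumption: a POSIX shell with printf is available.
my $cmd = q{printf '/\n/\n'};

# List context: backticks return one element per line of output.
my @lines = `$cmd`;

# Scalar context: backticks return the whole output as a single string.
my $text = scalar `$cmd`;

# Hypothetical helper: reports how many arguments a sub call receives.
sub count_args { return scalar @_ }

my $n_list   = count_args(`$cmd`);           # 2
my $n_scalar = count_args(scalar `$cmd`);    # 1

# Simplified imitation of the code quoted in the error message: the second
# list element lands inside a character class, and its "/" terminates the
# pattern before the closing "]".
my $range = $lines[1];
my $code  = "sub { \$_[0] =~ s/([$range])/\$1/g; }";
my $sub   = eval $code;    # compilation fails, so eval returns undef
my $error = $@;

print "list context:   $n_list argument(s)\n";
print "scalar context: $n_scalar argument(s)\n";
print "eval error:     $error";
```

With scalar in front of the backticks, only one argument is passed, so no range is supplied at all.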

      you're executing qx() in list context. the second argument to encode_entities is a character range, though.

      ++ because this is just as obvious now as it is subtle enough to make me suspect that I wouldn't have spotted it easily in the first place. Indeed, I don't like qx and try to avoid it if possible. Here, even if only working on a given OS, I don't see why one should rely on an external cat utility to do something that perl can handle perfectly well itself. What we want is a cheap idiom for slurping a file, and one is provided by File::Slurp, which is also known to be more efficient than the builtins. OTOH, if I didn't want to use an external module, I would do:

      print encode_entities do {
          open my $fh, '<', 'two-lines-with-one-slash-each.txt'
              or die "D'Oh! $!\n";
          local $/;
          <$fh>;
      };

      If I wanted to stay even cheaper, I'd use @ARGV magic:

      print encode_entities do {
          local (@ARGV, $/) = 'two-lines-with-one-slash-each.txt';
          <>;
      };
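
For anyone who wants to try the two slurping idioms side by side, here is a small sketch. It assumes a throwaway temp file (created with the core File::Temp module) in place of two-lines-with-one-slash-each.txt; the variable names are made up for illustration. Both do blocks rely on an undefined $/ so that a single read returns the whole file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway two-line sample file (hypothetical stand-in for
# two-lines-with-one-slash-each.txt).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "/\n/\n";
close $out or die "close failed: $!";

# Idiom 1: three-argument open with error handling, then slurp.
my $via_open = do {
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    local $/;    # undefined record separator: read to end of file
    <$fh>;
};

# Idiom 2: @ARGV magic. The list assignment gives @ARGV the file name and
# leaves $/ undefined, so <> returns the whole file in one read.
my $via_argv = do {
    local (@ARGV, $/) = $path;
    <>;
};

print "open:  ", length($via_open), " bytes\n";
print "\@ARGV: ", length($via_argv), " bytes\n";
print "same content\n" if $via_open eq $via_argv;
```

Because local restores @ARGV and $/ when the block exits, neither idiom leaks its settings into the rest of the program.
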
      you're executing qx() in list context.
      Fooled by context issues. Again..

      ++tinita for pointing that out. I totally agree with blazar that using cat to open files is a bad idea compared to File::Slurp or a proper three-argument open with error handling. I just could not figure out what was going on. Thanks.

      Andreas
      --