in reply to Re: pattern match hangs on malformed UTF-8 input
in thread pattern match hangs on malformed UTF-8 input

Hmmm. I don't have an example that is pure Perl. I suspect part of the problem stems from the data being read in from a file. Regardless of whether Perl treats the string as UTF-8, does it freeze when you run it? If so, this looks like a real problem: if a file ends with this byte sequence, perl will hang.

Re: Re: Re: pattern match hangs on malformed UTF-8 input
by diotalevi (Canon) on Apr 30, 2003 at 18:37 UTC

    Change your test case to use Devel::Peek's Dump() routine and show the results from that. We'll know what data you've actually read then. As is, your code runs without any problems.

      Here's some new code. Same thing, only Dump and print are used.
      You can change STDIN to a filehandle opened on a file. As long as the last two bytes of the input are \227 (octal, i.e. 0x97) and " ", "hi" is never printed.
      use utf8;
      use Devel::Peek;
      while (<STDIN>) {
          print ">$_<\n";
          Dump($_);
          s/\d //;
          print "hi\n";
      }

        Are you sure this is all your code? The 'g' flag in the output indicates that $_ was the target of a regular expression or had study() applied to it. It's still not Unicode though.

      and here's the output:
      >x <
      SV = PVMG(0x8101240) at 0x80f4858
        REFCNT = 1
        FLAGS = (SMG,POK,pPOK)
        IV = 0
        NV = 0
        PV = 0x80ff368 "x\227 "\0
        CUR = 3
        LEN = 80
        MAGIC = 0x81429f8
          MG_VIRTUAL = &PL_vtbl_mglob
          MG_TYPE = 'g'
          MG_LEN = -1
      (...and the script doesn't exit)
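
For anyone who wants to guard against input like this, here is a minimal sketch that checks whether a line is well-formed UTF-8 before running the substitution on it. It assumes a perl where the core utf8::decode() function is available (5.8 or later); the helper name is_valid_utf8 and the sample strings are made up for illustration, with "x\227 " copied from the PV field of the Dump output above. Whether skipping malformed lines actually avoids the hang reported in this thread is not something this sketch confirms.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true if the byte string is
# well-formed UTF-8. utf8::decode() works in place, so decode a copy and
# leave the caller's bytes untouched.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;
    return utf8::decode($copy) ? 1 : 0;
}

my @samples = (
    "x\227 ",        # bytes from the Dump output: 'x', octal 227 (0x97), space
    "x\xC2\x97 ",    # the same code point, U+0097, encoded as valid UTF-8
    "5 ok",          # plain ASCII that the substitution will match
);

for my $line (@samples) {
    my $hex = unpack("H*", $line);    # show the raw bytes as hex
    if (is_valid_utf8($line)) {
        (my $stripped = $line) =~ s/\d //;    # same pattern as in the test case
        print "$hex: well-formed, after s/\\d //: >$stripped<\n";
    }
    else {
        print "$hex: malformed UTF-8, skipping the match\n";
    }
}
```

The check treats the data as raw bytes, so it should be applied before any decoding layer is put on the filehandle.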