in reply to Re: Simple pattern match failing - Possibly unicode issue
in thread Simple pattern match failing - Possibly unicode issue
/\d{2}/ will always match exactly two characters. But doesn't perl produce polymorphic opcodes for pattern matching, which do different things depending on the encoding of the input string? With a multi-byte encoding, what is /.{2}/ supposed to match: 2 bytes, or 2 characters in the given encoding?
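One way to check this is a small sketch like the one below. It assumes the input arrives as UTF-8 octets and is decoded with the core Encode module; the variable names and the sample string are made up for illustration and are not from my real code. The same pattern is run against the raw octets and against the decoded string.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Sample data (hypothetical): "é" followed by "1", as UTF-8 octets.
# That is 3 bytes (0xC3 0xA9 0x31) but only 2 characters.
my $octets = "\xC3\xA9\x31";

# Decoding turns the octets into a string of characters.
my $chars = decode('UTF-8', $octets);

# Apply the same pattern to both forms of the data.
my ($from_octets) = $octets =~ /^(.{2})/;
my ($from_chars)  = $chars  =~ /^(.{2})/;

printf "octets: length %d, match covers %d element(s)\n",
    length($octets), length($from_octets);
printf "chars:  length %d, match covers %d element(s)\n",
    length($chars), length($from_chars);
```

On the undecoded string, /.{2}/ consumes the two bytes that make up "é" and stops before the "1". On the decoded string it consumes "é" and "1". So the regex counts whatever elements the string holds: bytes if the data was never decoded, characters if it was.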
And yes, the code having the problem isn't the code I posted. But I can assure you the code having the problem is doing the same thing. The actual code reads the value of $datetime from a Unicode-encoded XML file, reads the pattern to match from a config file, and populates the fields accordingly. I am dumping both $datetime and $regex before doing the pattern match, and they are exactly what I have shown here.
I have anecdotal evidence that perl's Unicode implementation has a role to play in this. I removed the use utf8; directive and now it works as it's supposed to.
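For reference, the utf8 pragma tells perl that the source file itself is UTF-8, so it changes how literals in the script are read; it does not decode data read from files. A minimal sketch of that difference, assuming this snippet is saved as UTF-8 (the variable names are illustrative only):

```perl
use strict;
use warnings;

my ($with, $without);

{
    use utf8;                  # literals in this block are read as characters
    $with = length("é");
}
{
    no utf8;                   # literals in this block are read as raw bytes
    $without = length("é");
}

print "length with use utf8:    $with\n";
print "length without use utf8: $without\n";
```

If the file really is UTF-8, the same literal is one character under the pragma and two bytes without it, so any literal text in the script that is compared against input can behave differently depending on whether the pragma is present.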
Replies are listed 'Best First'.

Re^3: Simple pattern match failing - Possibly unicode issue
by JavaFan (Canon) on Jun 04, 2010 at 15:38 UTC

Re^3: Simple pattern match failing - Possibly unicode issue
by ikegami (Patriarch) on Jun 04, 2010 at 17:29 UTC

Re^3: Simple pattern match failing - Possibly unicode issue
by Anonymous Monk on Jun 04, 2010 at 14:29 UTC

Re^3: Simple pattern match failing - Possibly unicode issue
by proceng (Scribe) on Jun 04, 2010 at 23:33 UTC