We may have hit something. There's lots of output. I'll start with an example that does NOT hang (because I added an 'x'):
$ printf "x\0227x " | tmp.pl

Compiling REx `^&'
size 4 first at 2
   1: BOL(2)
   2: EXACT <&>(4)
   4: END(0)
anchored `&' at 0 (checking anchored) anchored(BOL) minlen 1 
Compiling REx `\W'
size 2 first at 1
   1: NALNUM(2)
   2: END(0)
stclass `NALNUM' minlen 1 
Using REx substr: `::'
Guessing start of match, REx `\\/^\\/+$' against `/usr/local/lib/perl5/5.6.1//i686-linux/Devel/Peek.pm'...
Found floating substr `'$ at offset 52...
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx `\\/^\\/+$' against `/usr/local/lib/perl5/5.6.1//i686-linux/Devel/Peek.pm'
  Setting an EVAL scope, savestack=307
   0 <> </usr/local/l>    |  1:  ANYOF/\\
   1 </> <usr/local/l>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 3 times out of 32767...
  Setting an EVAL scope, savestack=307
   4 </usr> </local/l>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=307
   4 </usr> </local/l>    |  1:  ANYOF/\\
   5 </usr/> <local/l>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=307
  10 <local> </lib/pe>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=307
  10 <local> </lib/pe>    |  1:  ANYOF/\\
  11 <ocal/> <lib/per>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 3 times out of 32767...
  Setting an EVAL scope, savestack=307
  14 <l/lib> </perl5/>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=307
  14 <l/lib> </perl5/>    |  1:  ANYOF/\\
  15 </lib/> <perl5/5>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=307
  20 <perl5> </5.6.1/>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=307
  20 <perl5> </5.6.1/>    |  1:  ANYOF/\\
  21 <erl5/> <5.6.1//>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=307
  26 <5.6.1> <//i686->    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=307
  26 <5.6.1> <//i686->    |  1:  ANYOF/\\
  27 <.6.1/> </i686-l>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 0 times out of 32767...
  Setting an EVAL scope, savestack=307
                            failed...
  Setting an EVAL scope, savestack=307
  27 <.6.1/> </i686-l>    |  1:  ANYOF/\\
  28 <6.1//> <i686-li>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 10 times out of 32767...
  Setting an EVAL scope, savestack=307
  38 <linux> </Devel/>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=307
  38 <linux> </Devel/>    |  1:  ANYOF/\\
  39 <inux/> <Devel/P>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=307
  44 <Devel> </Peek.p>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=307
  44 <Devel> </Peek.p>    |  1:  ANYOF/\\
  45 <evel/> <Peek.pm>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 7 times out of 32767...
  Setting an EVAL scope, savestack=307
  52 <evel/Peek.pm> <>    | 20:    EOL
  52 <evel/Peek.pm> <>    | 21:    END
Match successful!
Guessing start of match, REx `\\/^\\/+$' against `/usr/local/lib/perl5/5.6.1//i686-linux/Devel'...
Found floating substr `'$ at offset 44...
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx `\\/^\\/+$' against `/usr/local/lib/perl5/5.6.1//i686-linux/Devel'
  Setting an EVAL scope, savestack=304
   0 <> </usr/local/l>    |  1:  ANYOF/\\
   1 </> <usr/local/l>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 3 times out of 32767...
  Setting an EVAL scope, savestack=304
   4 </usr> </local/l>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=304
   4 </usr> </local/l>    |  1:  ANYOF/\\
   5 </usr/> <local/l>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=304
  10 <local> </lib/pe>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=304
  10 <local> </lib/pe>    |  1:  ANYOF/\\
  11 <ocal/> <lib/per>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 3 times out of 32767...
  Setting an EVAL scope, savestack=304
  14 <l/lib> </perl5/>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=304
  14 <l/lib> </perl5/>    |  1:  ANYOF/\\
  15 </lib/> <perl5/5>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=304
  20 <perl5> </5.6.1/>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=304
  20 <perl5> </5.6.1/>    |  1:  ANYOF/\\
  21 <erl5/> <5.6.1//>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=304
  26 <5.6.1> <//i686->    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=304
  26 <5.6.1> <//i686->    |  1:  ANYOF/\\
  27 <.6.1/> </i686-l>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 0 times out of 32767...
  Setting an EVAL scope, savestack=304
                            failed...
  Setting an EVAL scope, savestack=304
  27 <.6.1/> </i686-l>    |  1:  ANYOF/\\
  28 <6.1//> <i686-li>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 10 times out of 32767...
  Setting an EVAL scope, savestack=304
  38 <-linux> </Devel>    | 20:    EOL
                              failed...
                            failed...
  Setting an EVAL scope, savestack=304
  38 <-linux> </Devel>    |  1:  ANYOF/\\
  39 <-linux/> <Devel>    | 10:  PLUS
                           ANYOF[\0-.0-\-\377] can match 5 times out of 32767...
  Setting an EVAL scope, savestack=304
  44 <-linux/Devel> <>    | 20:    EOL
  44 <-linux/Devel> <>    | 21:    END
Match successful!
Guessing start of match, REx `(\.\w+)?(;\d*)?$' against `/usr/local/lib/perl5/5.6.1//i686-linux/auto/Devel/Peek/Peek....'...
Found floating substr `'$ at offset 62...
Guessed: match at offset 0
Matching REx `(\.\w+)?(;\d*)?$' against `/usr/local/lib/perl5/5.6.1//i686-linux/auto/Devel/Peek/Peek....'
  Setting an EVAL scope, savestack=308
   0 <> </usr/local/l>    |  1:  CURLYX[0] {0,1}
   0 <> </usr/local/l>    | 11:    WHILEM
                              0 out of 0..1  cc=bfffeab0
  Setting an EVAL scope, savestack=313
   0 <> </usr/local/l>    |  3:      OPEN1
   0 <> </usr/local/l>    |  5:      EXACT <.>
                                failed...
     restoring \1..\2 to undef
                              failed, try continuation...
   0 <> </usr/local/l>    | 12:      NOTHING
   0 <> </usr/local/l>    | 13:      CURLYX1 {0,1}
   0 <> </usr/local/l>    | 23:        WHILEM
                                  0 out of 0..1  cc=bfffe690
  Setting an EVAL scope, savestack=313
   0 <> </usr/local/l>    | 15:          OPEN2
   0 <> </usr/local/l>    | 17:          EXACT <;>
                                    failed...
     restoring \1..\2 to undef
                                  failed, try continuation...
   0 <> </usr/local/l>    | 24:          NOTHING
   0 <> </usr/local/l>    | 25:          EOL
                                    failed...
                                  failed...
                                failed...
                              failed...
                            failed...
(Snipped lots of similar looking stuff)
  Setting an EVAL scope, savestack=313
  58 <Peek/Pee> <k.so>    | 15:          OPEN2
  58 <Peek/Pee> <k.so>    | 17:          EXACT <;>
                                    failed...
     restoring \1..\2 to undef
                                  failed, try continuation...
  58 <Peek/Pee> <k.so>    | 24:          NOTHING
  58 <Peek/Pee> <k.so>    | 25:          EOL
                                    failed...
                                  failed...
                                failed...
                              failed...
                            failed...
  Setting an EVAL scope, savestack=308
  59 <Peek/Peek> <.so>    |  1:  CURLYX[0] {0,1}
  59 <Peek/Peek> <.so>    | 11:    WHILEM
                              0 out of 0..1  cc=bfffeab0
  Setting an EVAL scope, savestack=313
  59 <Peek/Peek> <.so>    |  3:      OPEN1
  59 <Peek/Peek> <.so>    |  5:      EXACT <.>
  60 <Peek/Peek.> <so>    |  7:      PLUS
                           ALNUM can match 2 times out of 32767...
  Setting an EVAL scope, savestack=313
  62 <Peek/Peek.so> <>    |  9:        CLOSE1
  62 <Peek/Peek.so> <>    | 11:        WHILEM
                                  1 out of 0..1  cc=bfffeab0
  62 <Peek/Peek.so> <>    | 12:          NOTHING
  62 <Peek/Peek.so> <>    | 13:          CURLYX1 {0,1}
  62 <Peek/Peek.so> <>    | 23:            WHILEM
                                      0 out of 0..1  cc=bfffe270
  Setting an EVAL scope, savestack=318
  62 <Peek/Peek.so> <>    | 15:              OPEN2
  62 <Peek/Peek.so> <>    | 17:              EXACT <;>
                                        failed...
     restoring \2..\2 to undef
                                      failed, try continuation...
  62 <Peek/Peek.so> <>    | 24:              NOTHING
  62 <Peek/Peek.so> <>    | 25:              EOL
  62 <Peek/Peek.so> <>    | 26:              END
Match successful!
Matching REx `\W' against `boot_Devel::Peek'
  Setting an EVAL scope, savestack=310
  10 <_Devel> <::Peek>    |  1:  NALNUM
  11 <_Devel:> <:Peek>    |  2:  END
Match successful!
Matching REx `\W' against `:Peek'
  Setting an EVAL scope, savestack=310
  11 <_Devel_> <:Peek>    |  1:  NALNUM
  12 <_Devel_:> <Peek>    |  2:  END
Match successful!
Matching REx `\W' against `Peek'
Contradicts stclass...
Match failed
Matching REx `\W' against `Dump'
Contradicts stclass...
Match failed
Matching REx `\W' against `mstat'
Contradicts stclass...
Match failed
Matching REx `\W' against `DeadCode'
Contradicts stclass...
Match failed
Matching REx `\W' against `DumpArray'
Contradicts stclass...
Match failed
Matching REx `\W' against `DumpWithOP'
Contradicts stclass...
Match failed
Matching REx `\W' against `DumpProg'
Contradicts stclass...
Match failed
Matching REx `\W' against `fill_mstats'
Contradicts stclass...
Match failed
Matching REx `\W' against `mstats_fillhash'
Contradicts stclass...
Match failed
Matching REx `\W' against `mstats2hash'
Contradicts stclass...
Match failed
Compiling REx `\d '
Compiling REx `::'
size 3 first at 1
   1: EXACT <::>(3)
   3: END(0)
anchored `::' at 0 (checking anchored isall) minlen 2 
Compiling REx `^(Isn|To)(A-Z.*)'
size 36 first at 2
   1: BOL(2)
   2: OPEN1(4)
   4:   BRANCH(16)
   5:     EXACT <I>(7)
   7:     ANYOFns(19)
  16:   BRANCH(19)
  17:     EXACT <To>(19)
  19: CLOSE1(21)
  21: OPEN2(23)
  23:   ANYOFA-Z(32)
  32:   STAR(34)
  33:     REG_ANY(0)
  34: CLOSE2(36)
  36: END(0)
anchored(BOL) minlen 3 
Compiling REx `^'
size 2 first at 2
   1: MBOL(2)
   2: END(0)
stclass `END' anchored(MBOL) minlen 0 
Matching REx `\W' against `confess'
Contradicts stclass...
Match failed
Matching REx `\W' against `croak'
Contradicts stclass...
Match failed
Matching REx `\W' against `carp'
Contradicts stclass...
Match failed
Compiling REx `^(^=+)='
size 18 first at 2
synthetic stclass `ANYOF\0-<>-\377'.
   1: BOL(2)
   2: OPEN1(4)
   4:   PLUS(14)
   5:     ANYOF\0-<>-\377(0)
  14: CLOSE1(16)
  16: EXACT <=>(18)
  18: END(0)
floating `=' at 1..2147483647 (checking floating) stclass `ANYOF\0-<>-\377' anchored(BOL) minlen 2 
Compiling REx `^^0-9a-fA-F'
size 11 first at 2
   1: BOL(2)
   2: ANYOF\0-/:-@G-`g-\377(11)
  11: END(0)
stclass `ANYOF\0-/:-@G-`g-\377' anchored(BOL) minlen 1 
Compiling REx `^(0-9a-fA-F+)'
size 16 first at 2
synthetic stclass `ANYOF0-9A-Fa-f'.
   1: BOL(2)
   2: OPEN1(4)
   4:   PLUS(14)
   5:     ANYOF0-9A-Fa-f(0)
  14: CLOSE1(16)
  16: END(0)
stclass `ANYOF0-9A-Fa-f' anchored(BOL) minlen 1 
Compiling REx `\tXXXX$'
size 5 first at 1
   1: EXACT <	XXXX>(4)
   4: MEOL(5)
   5: END(0)
anchored `	XXXX'$ at 0 (checking anchored isall) minlen 5 
Compiling REx `^(0-9a-fA-F+)(?:\t(0-9a-fA-F+)?)(?:\t(0-9a-fA-F+))?'
size 56 first at 2
synthetic stclass `ANYOF0-9A-Fa-f'.
   1: MBOL(2)
   2: OPEN1(4)
   4:   PLUS(14)
   5:     ANYOF0-9A-Fa-f(0)
  14: CLOSE1(16)
  16: EXACT <	>(18)
  18: CURLYX1 {0,1}(35)
  20:   OPEN2(22)
  22:     PLUS(32)
  23:       ANYOF0-9A-Fa-f(0)
  32:   CLOSE2(34)
  34:   WHILEM(0)
  35: NOTHING(36)
  36: CURLYX2 {0,1}(55)
  38:   EXACT <	>(40)
  40:   OPEN3(42)
  42:     PLUS(52)
  43:       ANYOF0-9A-Fa-f(0)
  52:   CLOSE3(54)
  54:   WHILEM(0)
  55: NOTHING(56)
  56: END(0)
floating `	' at 1..2147483647 (checking floating) stclass `ANYOF0-9A-Fa-f' anchored(MBOL) minlen 2 
Compiling REx `^(^0-9a-fA-F\n)(.*)'
size 21 first at 2
synthetic stclass `ANYOF\0-\11\13-/:-@G-`g-\377'.
   1: MBOL(2)
   2: OPEN1(4)
   4:   ANYOF\0-\11\13-/:-@G-`g-\377(13)
  13: CLOSE1(15)
  15: OPEN2(17)
  17:   STAR(19)
  18:     REG_ANY(0)
  19: CLOSE2(21)
  21: END(0)
stclass `ANYOF\0-\11\13-/:-@G-`g-\377' anchored(MBOL) minlen 1 
Compiling REx `-+!'
size 10 first at 1
   1: ANYOF!+\-(10)
  10: END(0)
stclass `ANYOF!+\-' minlen 1 
Compiling REx `::'
size 3 first at 1
   1: EXACT <::>(3)
   3: END(0)
anchored `::' at 0 (checking anchored isall) minlen 2 
Compiling REx `^(0-9a-fA-F+)(?:\t(0-9a-fA-F+)?)(?:\t(0-9a-fA-F+))?'
size 56 first at 2
synthetic stclass `ANYOF0-9A-Fa-f'.
   1: MBOL(2)
   2: OPEN1(4)
   4:   PLUS(14)
   5:     ANYOF0-9A-Fa-f(0)
  14: CLOSE1(16)
  16: EXACT <	>(18)
  18: CURLYX1 {0,1}(35)
  20:   OPEN2(22)
  22:     PLUS(32)
  23:       ANYOF0-9A-Fa-f(0)
  32:   CLOSE2(34)
  34:   WHILEM(0)
  35: NOTHING(36)
  36: CURLYX2 {0,1}(55)
  38:   EXACT <	>(40)
  40:   OPEN3(42)
  42:     PLUS(52)
  43:       ANYOF0-9A-Fa-f(0)
  52:   CLOSE3(54)
  54:   WHILEM(0)
  55: NOTHING(56)
  56: END(0)
floating `	' at 1..2147483647 (checking floating) stclass `ANYOF0-9A-Fa-f' anchored(MBOL) minlen 2 
Compiling REx `^(0-9a-fA-F+)(?:\t(0-9a-fA-F+))?'
size 36 first at 2
synthetic stclass `ANYOF0-9A-Fa-f'.
   1: MBOL(2)
   2: OPEN1(4)
   4:   PLUS(14)
   5:     ANYOF0-9A-Fa-f(0)
  14: CLOSE1(16)
  16: CURLYX1 {0,1}(35)
  18:   EXACT <	>(20)
  20:   OPEN2(22)
  22:     PLUS(32)
  23:       ANYOF0-9A-Fa-f(0)
  32:   CLOSE2(34)
  34:   WHILEM(0)
  35: NOTHING(36)
  36: END(0)
stclass `ANYOF0-9A-Fa-f' anchored(MBOL) minlen 1 
Compiling REx `^(-+!)(.*)'
size 21 first at 2
synthetic stclass `ANYOF!+\-'.
   1: MBOL(2)
   2: OPEN1(4)
   4:   ANYOF!+\-(13)
  13: CLOSE1(15)
  15: OPEN2(17)
  17:   STAR(19)
  18:     REG_ANY(0)
  19: CLOSE2(21)
  21: END(0)
stclass `ANYOF!+\-' anchored(MBOL) minlen 1 
size 4 first at 1
   1: DIGITUTF8(2)
   2: EXACT < >(4)
   4: END(0)
anchored ` ' at 1 (checking anchored) stclass `DIGITUTF8' minlen 2 
>x—x <
SV = PV(0x80f4b84) at 0x80f4858
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x811d1e0 "x\227x "\0
  CUR = 4
  LEN = 80
Guessing start of match, REx `\d ' against `x—x '...
Found anchored substr ` ' at offset 3...
Starting position does not contradict /^/m...
This position contradicts STCLASS...
Looking for anchored substr starting at offset 4...
Did not find anchored substr ` '...
Match rejected by optimizer
hi
Freeing REx: `\d '

In reply to Re: pattern match hangs on malformed UTF-8 input by Anonymous Monk
in thread pattern match hangs on malformed UTF-8 input by y9o
