I've got a regex that won't match under "use utf8". It works fine without utf8, so I looked at the "use re 'debug'" output to find out why the two cases behave differently. But I don't see it. Could someone please explain?

This is our Perl:

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
    osname=linux, osvers=2.4.18-3, archname=i386-linux
    uname='linux grcdg028 2.4.18-3 #1 thu apr 18 07:37:53 edt 2002 i686 unknown '
    config_args='-des -Doptimize=-O2 -march=i386 -mcpu=i686 -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dcccdlflags=-fPIC -Dinstallprefix= sr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Uusethreads -Uuseithreads -Uuselargefiles -Dd_do id -Dd_semctl_semun -Di_db -Di_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=undef usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -I/usr/local/include',
    optimize='-O2 -march=i386 -mcpu=i686',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.96 20000731 (Red Hat Linux 7.3 2.96-110)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lc -lcrypt -lutil
    perllibs=-lnsl -ldl -lm -lc -lcrypt -lutil
    libc=/lib/libc-2.2.5.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options:
  Built under linux
  Compiled at Sep 3 2003 17:48:23
  @INC:
    /usr/lib/perl5/5.6.1/i386-linux
    /usr/lib/perl5/5.6.1
    /usr/lib/perl5/site_perl/5.6.1/i386-linux
    /usr/lib/perl5/site_perl/5.6.1
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.6.1/i386-linux
    /usr/lib/perl5/vendor_perl/5.6.1
    /usr/lib/perl5/vendor_perl
    .

This is the regex and the debug output:

Compiling REx `^((?x-ism: [^\?]*? ))/((?x-ism: [^/\?]*? ))$'
size 21 first at 2
   1: BOL(2)
   2: OPEN1(4)
   4: MINMOD(5)
   5: STAR(8)
   6: ANYOFUTF8[^?...003f 003f](0)
   8: CLOSE1(10)
  10: EXACT </>(12)
  12: OPEN2(14)
  14: MINMOD(15)
  15: STAR(18)
  16: ANYOFUTF8[^/?...002f 002f 003f 003f](0)
  18: CLOSE2(20)
  20: EOL(21)
  21: END(0)
floating `/' at 0..2147483647 (checking floating) anchored(BOL) minlen 1
Guessing start of match, REx `^((?x-ism: [^\?]*? ))/((?x-ism: [^/\?]*? ))$' against `pf1ad1%/pf2%2Fad2/pdad3/pfad4/filename'..
Found floating substr `/' at offset 7...
Guessed: match at offset 0
Matching REx `^((?x-ism: [^\?]*? ))/((?x-ism: [^/\?]*? ))$' against `pf1ad1%/pf2%2Fad2/pdad3/pfad4/filename'
  Setting an EVAL scope, savestack=9
   0 <> <pf1ad1%/pf2%>    |  1: BOL
   0 <> <pf1ad1%/pf2%>    |  2: OPEN1
   0 <> <pf1ad1%/pf2%>    |  4: MINMOD
   0 <> <pf1ad1%/pf2%>    |  5: STAR
  Setting an EVAL scope, savestack=9
   0 <> <pf1ad1%/pf2%>    |  8: CLOSE1
   0 <> <pf1ad1%/pf2%>    | 10: EXACT </>
                            failed...
  ANYOFUTF8[^?...003f 003f] can match 38 times out of 1...
  38 <ad4/filename> <>    |  8: CLOSE1
  38 <ad4/filename> <>    | 10: EXACT </>
                            failed...
  ANYOFUTF8[^?...003f 003f] can match 0 times out of 1...
  failed...
Match failed
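
For anyone who wants to try this, here is a minimal sketch of a script that exercises the same pattern against the same string. The sub-patterns are reconstructed from the "Compiling REx" line in the debug output, so the variable names ($dir, $file), the helper split_path and the qr// layout are assumptions, not the original source:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # the pragma under which the match was reported to fail

# Assumed reconstruction of the two sub-patterns (names are hypothetical).
my $dir  = qr{ [^\?]*?  }x;    # anything except '?', non-greedy
my $file = qr{ [^/\?]*? }x;    # anything except '/' and '?', non-greedy

# Returns the two captures on success, or an empty list on failure.
sub split_path {
    my ($str) = @_;
    return $str =~ m{^($dir)/($file)$} ? ($1, $2) : ();
}

my $path  = 'pf1ad1%/pf2%2Fad2/pdad3/pfad4/filename';
my @parts = split_path($path);

print @parts
    ? "matched: dir='$parts[0]' file='$parts[1]'\n"
    : "no match\n";
```

If the engine behaves as intended, the first capture should grow lazily up to the last "/" and the second capture should hold the final path component. Adding use re 'debug'; at the top shows the same kind of trace as above.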
Thank you...

In reply to regex won't match with utf8 enabled... by Beechbone
