Um, how about encoding the data into ASCII somehow and running the regex on that? It should work. I've had enough weird things crop up with binary data in the past that I'm paranoid about this (unless it's in C, in which case I'm both more at ease and more paranoid, if that makes sense).

I haven't used UTF-8 regexes before, but when I want to compare Japanese strings I first convert them to a single known encoding (EUC-JP for Japanese, an 8-bit encoding that uses two bytes for most characters and preserves ASCII codes), so maybe MIME::Base64 or some other packing method that yields plain ASCII for the general case?
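Here is a minimal sketch of the "encode to ASCII, then regex" idea. It uses hex via `unpack 'H*'` rather than MIME::Base64, because Base64 packs bytes in groups of three, so a given byte sequence does not always encode to the same substring. With hex, every byte becomes exactly two characters, and the only extra care needed is to reject matches that start in the middle of a byte. The helper name `find_bytes_via_hex` and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the byte offset of the first occurrence of
# $needle in $haystack (or -1 if absent) by searching hex-encoded copies.
sub find_bytes_via_hex {
    my ($haystack, $needle) = @_;
    my $hay_hex    = unpack 'H*', $haystack;   # two hex digits per byte
    my $needle_hex = unpack 'H*', $needle;

    while ($hay_hex =~ /\Q$needle_hex\E/g) {
        my $start = $-[0];                     # offset of the match in hex digits
        # Accept only matches that begin on a byte boundary (even offset).
        return $start / 2 if $start % 2 == 0;
        pos($hay_hex) = $start + 1;            # misaligned: retry one digit later
    }
    return -1;
}

# Sample data chosen so that "ff10" also appears misaligned in the hex
# string (across the bytes 1f f1 0a) before the real match.
my $data = "\x00\x1f\xf1\x0a\xff\x10\xaf";
print find_bytes_via_hex($data, "\xff\x10"), "\n";
```

The alignment check is the part that is easy to forget: without it, the search above would report a hit that straddles three different bytes.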

I believe that JPerl, a Japanized Perl, actually lets you do tr// and the like on binary, but this all seems pretty iffy. It seems best to me to do something you know must work, even if it isn't so elegant. Then test anyway. :)

By the way, I don't know where your binary is coming from, but watch out for endianness: a resource file from a Mac probably stores a two- or four-byte value in the reverse byte order from a PC. If the data is just single bytes in an 8-bit ISO encoding, that problem shouldn't arise.
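To make the byte-order point concrete, here is a small sketch using Perl's `unpack` with explicit-endian templates (`n`/`N` for big-endian, `v`/`V` for little-endian). The four sample bytes are made up; the point is that the same bytes give different numbers depending on which order you assume.

```perl
use strict;
use warnings;

# Four made-up bytes, as they might appear at the start of a binary file.
my $bytes = "\x12\x34\x56\x78";

my $be16 = unpack 'n', $bytes;   # first two bytes, big-endian 16-bit
my $le16 = unpack 'v', $bytes;   # first two bytes, little-endian 16-bit
my $be32 = unpack 'N', $bytes;   # all four bytes, big-endian 32-bit
my $le32 = unpack 'V', $bytes;   # all four bytes, little-endian 32-bit

printf "%04x %04x %08x %08x\n", $be16, $le16, $be32, $le32;
# prints: 1234 3412 12345678 78563412
```

Naming the byte order in the template, instead of relying on the native order of whatever machine runs the script, keeps the result the same on both kinds of hardware.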


In reply to Re: How to perfrom a 'byte sequence' RE? by mattr
in thread How to perfrom a 'byte sequence' RE? by nysus
