in reply to Extend regex legibility within character classes

Why not build the pieces in a variable and interpolate that? Interpolating code inline can sometimes be a useful technique, but to my eye the following is more readable and more obvious to whoever maintains the code later.

my $valid_XML_BaseChars = join('',
    "\x{0041}-\x{005A}",    # Uppercase A-Z
    "\x{0100}-\x{0131}",    # Extended Latin A subset
    # Skipping ligatures 0132, 0133
    "\x{0134}-\x{013E}",    # Continuing Ext. Latin A
    # Skipping middle dots 013F, 0140
    "\x{0141}-\x{0148}",    # Finishing Ext. Latin A
    "\x{01FA}-\x{0217}",    # Extended Latin B subset
    "\x{0250}-\x{02A8}",    # IPA Extensions
);
my $XML_BaseChar = qr/[$valid_XML_BaseChars]/o;

-ben

Re: Re: Extend regex legibility within character classes
by John M. Dlugosz (Monsignor) on Jun 11, 2001 at 22:06 UTC
    That's what I started with, but wanted to get rid of the extra named variable. Using the @{[]} trick lets me write it in one statement.
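For readers who have not seen it, here is a minimal sketch of the @{[ ]} trick, using only two of the ranges from above. The anonymous array is built from whatever the code inside returns, and dereferencing it within the pattern interpolates the result. The name $XML_BaseChar_demo is illustrative only.

```perl
use strict;
use warnings;

# @{[ ... ]} evaluates the code inside, wraps the result in an
# anonymous array, and interpolates it, so the join() output lands
# between the literal [ and ] of the character class.
my $XML_BaseChar_demo = qr/[@{[ join '',
    "\x{0041}-\x{005A}",    # Uppercase A-Z
    "\x{0100}-\x{0131}",    # Extended Latin A subset
]}]/;

print "matches\n" if "Q" =~ /^$XML_BaseChar_demo$/;
```

No helper variable is declared, which is the point of the trick; the cost is the extra punctuation around the join.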

    The extra variable would be less objectionable if I could hide its scope, but if the real one I'm declaring is also a "my", I can't put braces around the whole thing. It would take a third line, outside of the braces, to declare it first.
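The scoping problem can be sketched as follows, with the ranges shortened for brevity. The first form is the three-line version: the outer variable is declared on its own line, then the helper is hidden in a bare block. The second form, a do block, is offered only as a possible alternative and is not something proposed in this thread; the names $valid and $XML_BaseChar2 are illustrative.

```perl
use strict;
use warnings;

# Three-line version: declare first, then hide the helper in a bare block.
my $XML_BaseChar;
{
    my $valid = join '', "\x{0041}-\x{005A}", "\x{0100}-\x{0131}";
    $XML_BaseChar = qr/[$valid]/;
}

# Possible alternative (an assumption, not from the thread): a do block
# returns its last expression, so the declaration and the scope hiding
# fit in a single statement.
my $XML_BaseChar2 = do {
    my $valid = join '', "\x{0041}-\x{005A}", "\x{0100}-\x{0131}";
    qr/[$valid]/;
};

print "same pattern\n" if "$XML_BaseChar" eq "$XML_BaseChar2";
```

In both forms $valid is out of scope once the closing brace is reached, so only the compiled pattern remains visible.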

    —John

      my $XML_BaseChar= qr/$_/ for join "",
          "[", # a character class...
          "\x{0041}-\x{005A}", # the first range
          "\x{0100}-\x{0131}\x{0134}-\x{013E}\x{0141}-\x{0148}", # the next range
          "\x{01FA}-\x{0217}\x{0250}-\x{02A8}", # etc.
          "]"; #here's how to close it up
              - tye (but my friends call me "Tye")