in reply to Regex matching on grid alignment

> All of my other patterns work efficiently with the unmodified string (several of them can even be combined into one expression),

I'd include an end-of-line separator like \0; that would simplify things a lot.
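
To illustrate what I mean, here is a minimal sketch. The grid contents and variable names are made up for the example, and I'm assuming the rows are available as a list before they get joined:

```perl
use strict;
use warnings;

# Hypothetical 4x3 grid, stored row by row.
my @rows = ('xbbb', 'ocda', 'aaxy');

# Unmodified string: the trailing "a" of row 2 and the leading "aa" of row 3
# look like a horizontal run of three.
my $flat = join '', @rows;      # "xbbbocdaaaxy"

# Same grid with "\0" between rows: a run can no longer span a row boundary,
# as long as the captured character is not the separator itself.
my $sep = join "\0", @rows;

my @flat_hits = $flat =~ /(.)\1\1/g;        # captures: b, a (the "a" is spurious)
my @sep_hits  = $sep  =~ /([^\0])\1\1/g;    # captures: b

print "flat: @flat_hits\n";     # flat: b a
print "sep:  @sep_hits\n";      # sep:  b
```

The only change to the pattern is `[^\0]` in place of `.`, so that the separator is never captured.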

> so if you do suggest an alternate storage format for the string, please also consider the cost of the conversion when thinking about efficiency, here.

Yep, but without knowing more about your operations we can't say anything about efficiency.

Please help us understand why your existing patterns couldn't be adjusted to a new separator.

I doubt you'll accept this answer, but it had to be given.

Cheers Rolf

(addicted to the Perl Programming Language)

Re^2: Regex matching on grid alignment
by Anonymous Monk on Sep 09, 2013 at 00:10 UTC

    Thank you, LanX. I'll try to explain a little more. The "match 3 of the same character in a row" is the ONLY pattern I have that must not break row boundaries. The other patterns are more complex, but here are a few examples (from memory, please forgive mistakes):

    /(.).{$width}\1.{$width}\1/;     # 3 in a line vertically
    /^(.)(?!\1)(.)(?:\1\2)+\1?$/;    # XoXoX
                                     # oXoXo
                                     # XoXoX
                                     # oXoXo
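
    For anyone who wants to try these against a flat string, here is a rough sketch. The grids are invented for the example. I'm also assuming $width is the number of columns, in which case the run between two vertically adjacent cells is $width - 1 characters, so the sketch uses a separate $gap for that (the pattern above is from memory and may have meant $width to be that gap already):

```perl
use strict;
use warnings;

my $width = 5;             # assumed number of columns
my $gap   = $width - 1;    # characters between two vertically adjacent cells

# Hypothetical 5x3 grid, rows concatenated with no separator;
# "X" sits in the third column of every row.
my $grid = join '', 'abXcd', 'efXgh', 'ijXkl';

# 3 in a line vertically: same character, one row apart, twice.
my $vertical = $grid =~ /(.).{$gap}\1.{$gap}\1/ ? 1 : 0;

# Checkerboard: with an odd width, the flat string is strictly alternating.
my $board  = join '', 'XoXoX', 'oXoXo', 'XoXoX', 'oXoXo';
my $broken = 'XoXoXXoXoX';    # alternation breaks in the middle

my $re = qr/^(.)(?!\1)(.)(?:\1\2)+\1?$/;
my $board_ok  = $board  =~ $re ? 1 : 0;
my $broken_ok = $broken =~ $re ? 1 : 0;

print "vertical=$vertical board=$board_ok broken=$broken_ok\n";
# vertical=1 board=1 broken=0
```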

    Sorry, these are about the only two I had any hope of getting right from memory. :-) There are others that look for dynamic "walking" patterns of symbols, stairstep patterns, various symmetries, and Jimmy Hoffa (if our grant approval ever goes through). Several of them share parts with others, so they combine quite nicely into a single expression.

    I thought of using a row separator as well, but patterns not much more complex than the first get a lot less readable and slower once you need a lot of +1's and -1's to account for the different width, plus exceptions so that it doesn't match three \0's in a row at the end of the lines. For patterns like the second, frankly, I think it would be easier (for both man and machine) to copy the string and tr/\0//d. That wouldn't be horrible, but I'd definitely prefer to have just one more regexp to optimize with the others than to have two data representations.
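
    To make the trade-off concrete, here is roughly what the separator version looks like for the vertical pattern, plus the copy-and-strip step for whole-grid patterns. This is only a sketch with a made-up grid; it assumes $width is the number of columns and that "\0" never occurs in the data:

```perl
use strict;
use warnings;

my $width = 5;                                # assumed number of columns
my @rows  = ('abXcd', 'efXgh', 'ijXkl');      # hypothetical grid
my $sep   = join "\0", @rows;                 # rows separated by "\0"

# Each stored row is now one character longer, so the run between
# vertically adjacent cells grows from $width - 1 to $width, and the
# capture has to exclude the separator itself.
my $vertical = $sep =~ /([^\0]).{$width}\1.{$width}\1/ ? 1 : 0;

# For patterns that want the grid as one unbroken string,
# copy it and delete the separators from the copy.
(my $flat = $sep) =~ tr/\0//d;

print "vertical=$vertical\n";                             # vertical=1
print length($sep), " -> ", length($flat), " chars\n";    # 17 -> 15 chars
```

    So every offset in every pattern shifts by one, and the copy is a second representation to keep in sync.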