I think you'll find this will do the trick:

my $rem = $W - 3;   # highest 0-based column at which a run of 3 can start
qr/^ (?:.{$W})* .{0,$rem} (.)\1\1/x

The idea here is to minimize the branching the RE engine has to do. The logic is much like what you might do if you had the string split into lines: skip 0..$H¹ whole rows of $W characters each, at which point we know we're at the beginning of a row. From there, match 0..$W-3 characters, followed by a run of 3 identical characters using your original (.)\1\1.
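To make the row alignment concrete, here is a minimal sketch, assuming the grid is stored as one flat string with no newlines and that $W is at least 3. The helper name aligned_triple_re and the sample grids are made up for illustration; they are not from the original thread.

```perl
use strict;
use warnings;

# Hypothetical helper: build the row-aligned regex for a grid of width $W.
# Assumes $W >= 3 and that the grid string contains no newlines.
sub aligned_triple_re {
    my ($W) = @_;
    my $rem = $W - 3;    # a run of 3 may start at columns 0..$W-3
    return qr/^ (?:.{$W})* .{0,$rem} (.)\1\1/x;
}

my $re = aligned_triple_re(5);

# Rows: ABCDD / DABCD. The only "DDD" straddles the row boundary.
my $wraps = 'ABCDD' . 'DABCD';

# Rows: ABCDA / BCCCD. "CCC" sits entirely inside the second row.
my $inside = 'ABCDA' . 'BCCCD';

my $plain_hit   = $wraps  =~ /(.)\1\1/ ? 1  : 0;        # plain regex is fooled by the wrap
my $wraps_hit   = $wraps  =~ $re       ? 1  : 0;        # aligned regex is not
my $inside_char = $inside =~ $re       ? $1 : undef;    # repeated character, if any

print "plain on wrapping grid:   $plain_hit\n";
print "aligned on wrapping grid: $wraps_hit\n";
print "aligned on in-row grid:   ", ( defined $inside_char ? $inside_char : 'no match' ), "\n";
```

The plain (.)\1\1 reports the wrapped DDD, while the aligned version only accepts runs that start within the first $W-3 columns of a row.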

Performance is on par with (actually a few % better than) the plain /(.)\1\1/, and several times faster than anything I tried with unpack or split.

Edit: You can get another ~25% or so if your character set really is small like [ABCD] by unrolling (.)\1\1 into (?:AAA|BBB|CCC|DDD). If you're not just using this as a boolean test and still need the character in $1, use (AAA|BBB|CCC|DDD) instead and use substr($1,0,1) to grab the first character if you get a match. The idea here is to push the more expensive operations out of the hot loop that's called millions of times.
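A rough sketch of the unrolled variant, assuming the grid really does contain only A, B, C and D. The alternation is built from a list here just to avoid typing it out; @chars, $alt and the sample grids are illustrative names and data, not from the original post.

```perl
use strict;
use warnings;

my $W   = 5;
my $rem = $W - 3;

# Assumption: these are the only characters that ever appear in the grid.
my @chars = qw(A B C D);

# Builds "AAA|BBB|CCC|DDD"; quotemeta keeps it safe if a character is a metachar.
my $alt = join '|', map { quotemeta($_) x 3 } @chars;

# Same row-aligned prefix as before, with the backreference unrolled.
my $unrolled = qr/^ (?:.{$W})* .{0,$rem} ($alt)/x;

my $grid  = 'ABCDA' . 'BCCCD';    # "CCC" lies inside the second row
my $wraps = 'ABCDD' . 'DABCD';    # "DDD" straddles the row boundary

my $char;
if ( $grid =~ $unrolled ) {
    $char = substr( $1, 0, 1 );   # recover the single repeated character
}

my $wraps_hit = $wraps =~ $unrolled ? 1 : 0;

print "repeated char: ", ( defined $char ? $char : 'none' ), "\n";
print "wrapping grid matched: $wraps_hit\n";
```

The substr call only runs after a successful match, so the extra work stays out of the part that gets executed millions of times.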

___________
1. ~~$H-1, actually, but it makes no measurable difference to the efficiency, or correctness.~~ Edit: Changed {0,$H} to *, to shave a few keystrokes. Thanks LanX!

use strict; use warnings; omitted for brevity.

In reply to Re: Regex matching on grid alignment by rjt
in thread Regex matching on grid alignment by Anonymous Monk
