
Extraordinary! It even handles multiple float candidates in a string (/g), the point made by hv. You've sent me off on a new tangent :-) However, I hate to admit this, but I have never used bitwise operations on strings, and so far the net is not a good source of examples, and perldoc List::AllUtils is a source of frustration. I've been stepping through a subset of your code in the debugger to get a handle on ' |. ', which I think ORs two test strings together to get the longest for the printf "%*s" width expression; 36 is definitely the longest in that array. If you would, please explain this expression for someone naive with respect to bitwise string operations?

my $leftside = length reduce {$a |. $b->[0]} @floats; # auto-adjust

I think that if the next string is longer than the accumulated one, |. pads the shorter operand out to the longer length, so the char values themselves, which are Unicode, aren't relevant. When the iteration finishes, the reduced string is as long as the longest string in the set, and $leftside holds that length. I finally gave up noodling on what the significance of ORing two chars might be, other than that non-ASCII Unicode characters can be multi-byte, and settled on the notion of building up the longest 'dingus' from the set of 'dinguses'. Is that the idea?

I would not have thought to use a bitwise operation to calculate the longest length of a set of strings though I guess that is one of the reasons why List::Util exists. My first thought was to use:

my $leftside = length reduce { length($a) >= length($b->[0]) ? $a : $b->[0] } @floats;

Or, if I got the purpose of that expression wrong, please point me to a reading assignment, other than perldoc List::AllUtils?
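To check my reading, here is a minimal sketch that runs both expressions against a made-up @floats and compares them with a plain max of lengths. The sample rows are hypothetical, not the data from the thread, and the '' seed passed to reduce is my own addition so that the first $a is a plain string rather than an array reference. It assumes Perl 5.28 or later, where "use v5.28" turns on the 'bitwise' feature.

```perl
use v5.28;        # enables strict and the 'bitwise' feature (|. is the string OR)
use warnings;
use List::Util qw(reduce max);

# Hypothetical rows: [ candidate string, note ]. Not the thread's data.
my @floats = (
    [ '1.5',            'float' ],
    [ '113.35.120.255', 'IP, not a float' ],
    [ '-2.75e10',       'float' ],
);

# String OR works character by character; the shorter operand is treated
# as if padded with NUL characters, so the result is as long as the longer one.
my $or = 'AB' |. 'abcd';    # 'abcd'

# OR all first fields together and take the length of the result.
my $by_or = length( reduce { $a |. $b->[0] } '', @floats );

# Keep whichever string is longer at each step, then take its length.
my $by_length = length(
    reduce { length($a) >= length( $b->[0] ) ? $a : $b->[0] } '', @floats
);

# The plain way: longest length directly.
my $by_max = max map { length $_->[0] } @floats;

print "$or $by_or $by_length $by_max\n";    # abcd 14 14 14
```

All three agree here, which fits the idea that only the length of the ORed string matters and its contents are throwaway.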

Thanks tybalt89 for a very interesting example.

Will

U P D A T E 10/18/2023

Thank you dasgar for the insights. I used the term 'Extraordinary'; tybalt89 is brilliant, and of course, Perl is the eighth wonder of the known world. "use feature 'bitwise'" introduces |., which always treats its operands as strings and ORs them character by character; the result is as long as the longer operand. Why is this useful? Because length, sprintf and printf all measure strings in code points, not graphemes (the user-visible notion of a character), so the length of the ORed string is exactly the width that "%*s" needs. Both my length comparison and reduce with |. yield the same answer. I expect tybalt89's method to be the more 'efficient' way of finding the longest length among the test strings in the array, but I have not benchmarked his version against mine, so that is an expectation rather than a measurement.
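To make the code point versus grapheme distinction concrete, here is a small sketch using a string I made up for the purpose: an 'e' followed by a combining acute accent, which displays as one character but is two code points.

```perl
use strict;
use warnings;

# Hypothetical example string: 'e' + U+0301 COMBINING ACUTE ACCENT.
my $s = "e\x{301}";

my $codepoints = length $s;               # length counts code points
my $graphemes  = () = $s =~ /\X/g;        # \X matches one grapheme cluster
my $padded     = sprintf '%*s', 3, $s;    # width is counted in code points too

print "code points: $codepoints, graphemes: $graphemes, padded length: ",
    length($padded), "\n";
# code points: 2, graphemes: 1, padded length: 3
```

So a width computed with length lines up with what printf "%*s" pads to, even though the on-screen column count can differ when combining characters are present.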

The really interesting example of brilliance is that regular expression. It uses a branch reset, (?|...), and as you all probably know, a branch reset ensures that whichever alternative inside it matches, its capture groups are numbered from the same starting point, so the result lands in the same $n variable. There are two sets of naked parens in that regex within the alternation, and whichever matches will be saved to $1. Now here is a piece of brilliant regex coding that blows my mind. This alternation fragment:

  (?:(?:\d+\.){2,}\d+)()

is looking for invalid decimal expressions, such as IPs, as in 113.35.120.255, which are not floats but look like floats. These are to be excluded if present: the fragment consumes the whole dotted sequence, and because of the branch reset, the empty () puts only an empty string into $1. That is effectively the logical equivalent of a negative look-ahead, and I would expect it to be cheaper than using (?!...), though I have not measured that.
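A minimal sketch of how I understand the trick, using a cut-down pattern of my own rather than tybalt89's actual regex: the first alternative swallows dotted sequences and captures an empty string, and the second (a deliberately simplistic float pattern, purely for illustration) captures a float.

```perl
use strict;
use warnings;

# Hypothetical input and pattern, not the ones from the thread.
my $text = 'pi is 3.14, host 113.35.120.255, e is 2.718';

my $re = qr/
    (?|                          # branch reset: each alternative's group is $1
        (?:(?:\d+\.){2,}\d+) ()  # dotted sequence like an IP: consume it, capture ''
      |
        (\d+\.\d+)               # simple float: capture it
    )
/x;

# In list context with a global match, each match contributes its $1.
my @captures = $text =~ /$re/g;          # ('3.14', '', '2.718')

# Throw away the empty captures left behind by the IP-like matches.
my @floats = grep { length } @captures;

print join( ', ', @floats ), "\n";       # 3.14, 2.718
```

Because the IP-like text is consumed by the first alternative, the float alternative never gets a chance to match pieces of it such as 113.35.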

I posted this question in hopes of generating discussion and I got more than I bargained for: an elegant lesson in Perl magic from a master. Thank you tybalt89 for a welcome dose of enlightenment.

Will

In reply to Re^2: Best practice validating numerics with regex? by perlboy_emeritus
in thread Best practice validating numerics with regex? by perlboy_emeritus
