Syntax Perl Version support $c = () = $a =~ /\./g

h2 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Syntax Perl Version support $c = () = $a =~ /\./g (updated) by AnomalousMonk (Archbishop) on Jul 17, 2018 at 18:34 UTC
`my $c = () = $a =~ /\./g;` This statement evaluates a regex in list context (imposed by the empty parens): `() = $a =~ /\./g` and then evaluates that list assignment in scalar context (imposed by the scalar assignment): `my $c = ()` which yields the number of elements on the right-hand side of the list assignment, i.e. the number of matches. Some variants of this syntax (in which the regex matches are captured instead of being thrown away): `c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "print 'perl version: ', $]; ;; my $s = '3.5.9'; ;; my @ra; my $c = @ra = $s =~ m{ \d }xmsg; dd $c, \@ra; ;; $s = '8.6.5.1'; ;; $c = my @rb = $s =~ m{ \d }xmsg; dd $c, \@rb; " perl version: 5.008009 (3, [3, 5, 9]) (4, [8, 6, 5, 1])` "... when did Perl begin supporting this type of structure?" ~~AFAIK, Perl 5.x has always supported this, and I believe 4.x did as well, but I'm too lazy to do the research.~~ (Update: See this for pertinent info on Perl 5 support for the `=()=` "operator," and this for Perl 4 support (none).) This is a well-known and safe Perl idiom. In general, look for discussions of "context" and "context dependence." Update: See also Context tutorial in the Tutorials section of the Monastery. Give a man a fish: `<%-{-{-{-<`
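To make the two contexts described above concrete, here is a minimal sketch. The variable names and the sample string are made up for illustration; the only thing taken from the thread is the `=()=` idiom itself.

```perl
use strict;
use warnings;

my $s = 'f.o.o.bar';   # sample input, chosen for illustration

# Scalar context: a plain match only reports success (1) or failure.
my $matched = $s =~ /\./;

# List context (forced by the empty parens): m//g returns every match.
# The list assignment, evaluated in scalar context by the outer scalar
# assignment, yields the number of elements on its right-hand side.
my $count = () = $s =~ /\./g;

print "matched=$matched count=$count\n";   # prints "matched=1 count=3"
```

The empty list on the left discards the matches themselves; only the count survives.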
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by haukex (Archbishop) on Jul 17, 2018 at 19:13 UTC
"AFAIK, Perl 5.x has always supported this" I'm having some trouble running a bisect at the moment, but as a tentative result, it appears that Perl versions roughly before 5.004 rejected the test program `-e '$a="f.o.o.bar"; $b=()=$a=~/\./g; $b==3 or die $b'` with the error `Can't modify stub in list assignment at -e line 1, near "/\./g;"`. AFAICT it was not possible to assign to an empty list back then (a workaround seems to be `$b = @{[]} = $a =~ /\./g;`, which works on 5.000).
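A small sketch comparing the two spellings from the post above on a modern perl. Note the harness itself (`use warnings`, the variable names) is an assumption for illustration and would not run on the old versions being discussed; it only shows that both forms yield the same count today.

```perl
use strict;
use warnings;

my $s = 'f.o.o.bar';

# Modern spelling: list-assign to an empty list.
my $modern = () = $s =~ /\./g;

# Workaround from the post: list-assign to a dereferenced anonymous
# array instead of the empty list.
my $workaround = @{[]} = $s =~ /\./g;

print "$modern $workaround\n";
```

Both are list assignments evaluated in scalar context, so both return the number of elements on the right-hand side.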
Re^3: Syntax Perl Version support $c = () = $a =~ /\./g by h2 (Beadle) on Jul 17, 2018 at 19:16 UTC
haukex, thanks for digging in to find the lower limit, that was a key thing I needed to know. I'll note this in the docs. I doubt I will ever need to go older than Perl 5.008, but I have learned painfully to never say never in this area.
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by shmem (Chancellor) on Jul 17, 2018 at 22:47 UTC
"and I believe 4.x did as well" No, it didn't and doesn't yet without patches ;-) perl 4 patchlevel 36: `qwurx [shmem] ~> perl4 -e '$s="3.4.5";$r=()=$s=~/\./g;print$r,"\n"' Illegal item (LEXPR) as lvalue in file /tmp/perl-eEdymqE at line 1, next 2 tokens "/\./g;" Execution of /tmp/perl-eEdymqE aborted due to compilation errors.` However, chaining assignments and evaluating ARRAY in scalar context did work: `qwurx [shmem] ~> perl4 -e '$s="3.4.5";$r=@r=$s=~/\./g;print$r,"\n"' 2` Applying semantics of ARRAY to LEXPR (list expression) happened in perl5. perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: thanks all by h2 (Beadle) on Jul 17, 2018 at 19:14 UTC
Thank you very much, I'm glad to see I can start digging into these more arcane operator collections. Thanks for the links as well, now I at least know what to be looking for. Your responses are exactly what I needed to know.
Re: Syntax Perl Version support $c = () = $a =~ /\./g by jdporter (Paladin) on Jul 17, 2018 at 18:41 UTC
Read: perlsecret - Perl secret operators and constants (Hint: It's the one right after the "space station" operator)
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by AnomalousMonk (Archbishop) on Jul 17, 2018 at 19:04 UTC
It's also called the "goatse" operator, just don't ask why!
Re^3: Syntax Perl Version support $c = () = $a =~ /\./g by Marshall (Canon) on Jul 18, 2018 at 00:25 UTC
I think this is supposed to look like an ASCII emoticon for a type of beard called a goatee. But yeah, that is what this is called.
Re^4: Syntax Perl Version support $c = () = $a =~ /\./g by Your Mother (Archbishop) on Jul 18, 2018 at 14:33 UTC
Re^5: Syntax Perl Version support $c = () = $a =~ /\./g by Marshall (Canon) on Jul 24, 2018 at 17:07 UTC
Re^4: Syntax Perl Version support $c = () = $a =~ /\./g by AnomalousMonk (Archbishop) on Jul 18, 2018 at 04:49 UTC
Re^4: Syntax Perl Version support $c = () = $a =~ /\./g by tobyink (Canon) on Jul 18, 2018 at 08:20 UTC
Re: Syntax Perl Version support $c = () = $a =~ /\./g by jwkrahn (Abbot) on Jul 17, 2018 at 21:28 UTC
`$c = () = $a =~ /\./g` That counts the number of '.' characters in the `$a` string. A more efficient way to do that is: `$c = $a =~ tr/.//` But note that while the first version will work with strings of any length, the second version will only work with characters.
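The "strings versus characters" distinction above can be seen with a two-character pattern. This is only an illustrative sketch; the sample string and variable names are invented.

```perl
use strict;
use warnings;

my $s = 'abcabxab';   # sample input, chosen for illustration

# Regex in list context: counts occurrences of the substring "ab".
my $substring_count = () = $s =~ /ab/g;

# tr/// with an empty replacement list: counts every character that is
# in the set {a, b}, and leaves $s unmodified.
my $character_count = ($s =~ tr/ab//);

print "substring: $substring_count, characters: $character_count\n";
# prints "substring: 3, characters: 6"
```

For a single character such as '.', the two approaches give the same answer, which is why `tr/.//` is a drop-in replacement in the original question.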
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by eyepopslikeamosquito (Archbishop) on Jul 17, 2018 at 22:40 UTC
See also: perlfaq4 (How can I count the number of occurrences of a substring within a string?) and Count things in a string (Effective Perl)
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by h2 (Beadle) on Jul 17, 2018 at 22:02 UTC
jwkrahn, I had to test this, and lo, as you said, `tr/.//` is about 80% or so faster. I mean, of course, I'd have to run it thousands of times, but there it is, much faster. Thanks for that tip. I'm not clear on what you meant by this: "But note that while the first version will work with strings of any length the second version will only work with characters." The data is a string of varying lengths, and usually is a decimal number.
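For anyone wanting to reproduce this kind of timing, a minimal sketch using the core `Benchmark` module might look like the following. The sample input and the labels `regex` and `tr` are assumptions for illustration; this is not h2's actual test, and the ratio will vary by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $s = '123.456';   # hypothetical sample: a decimal number as a string

my %candidates = (
    # countof idiom: regex match in list context
    regex => sub { my $c = () = $s =~ /\./g; return $c },
    # transliteration used purely as a character counter
    tr    => sub { my $c = ($s =~ tr/.//);   return $c },
);

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

Both subs return the same count, so the comparison is like for like.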
Re^3: Syntax Perl Version support $c = () = $a =~ /\./g by mr_mischief (Monsignor) on Jul 17, 2018 at 23:12 UTC
`tr///` (sometimes spelled `y///`, especially in code golf) works on characters specifically. `m//` (sometimes spelled `//`) works on regular expression matches, which may concern one or more characters (or in special cases zero, such as `split //, $foo;`). Perl takes text very seriously. There is a load to know about processing text in Perl, but the basics are pretty quick to grasp. The full story is not complete without at least these manual pages, although for this specific topic the first few should suffice: m// s/// tr/// split index substr length lc uc lcfirst ucfirst reverse chr ord pack unpack perlop (especially eq ne gt lt ge le cmp . x =~ .. ) perlrequick perlretut perlreref perlre perlfaq6 perlrebackslash perlrecharclass perluniintro perlunitut perlunifaq perlunicode perluniprops perlvar (especially $_ $1 $a $b $| $" $` $& $' $, $. $/ and $\ but especially especially $_ and $1) perllocale You might hope you never need to read perlebcdic, but there's that too.
Re^4: Syntax Perl Version support $c = () = $a =~ /\./g by morgon (Priest) on Jul 18, 2018 at 00:09 UTC
Re^5: Syntax Perl Version support $c = () = $a =~ /\./g by choroba (Cardinal) on Jul 18, 2018 at 00:13 UTC
4x faster now! by h2 (Beadle) on Jul 18, 2018 at 19:04 UTC
While testing this again with `tr///` in mind, I decided to see how much faster it would be if I got rid of all the regex in the tests, and replaced them with `tr///` with the numeric values and count, and ended up with a 4x (!) speed improvement over the pure regex sequence of tests. I also discovered that there is no obvious difference between `(my $c = $_[0] =~ tr/.// ) <= 1` and `($_[0] =~ tr/.// ) <= 1` which I assume means I stumbled onto another secret operator (the wrapping parentheses), which suggests to me that I should study these more to become more aware of this area of Perl.
Re: 4x faster now! (updated) by AnomalousMonk (Archbishop) on Jul 18, 2018 at 19:26 UTC
"... another secret operator (the wrapping parentheses) ..." You have stumbled upon the secret operator known as operator precedence. In this case, there's no need for the disambiguation of grouping parentheses because the `=~` binding operator is of higher precedence than `<=` or any other comparator. `c:\@Work\Perl\monks>perl -wMstrict -le "sub not_too_many_dots { return $_[0] =~ tr/.// <= 1; } ;; for my $s ('', qw(. .. ... ....)) { printf qq{'$s' %stoo many dots \n}, not_too_many_dots($s) ? 'NOT ' : '' ; } " '' NOT too many dots '.' NOT too many dots '..' too many dots '...' too many dots '....' too many dots` See perlop (update: in particular Operator Precedence and Associativity). Of course, parenthetic grouping disambiguation never hurts, and many recommend it as a general best practice to support readability/maintainability. Update 1: I suspect the speedup you're seeing is due to operating directly upon an element of the aliased `@_` subroutine argument array rather than burning the computrons needed to create lexical variables. See perlsub. (This would be in addition to using `tr///` rather than `m//` for counting individual characters.) Update 2: If you want to know what Perl thinks about the precedence and associativity of the operators you're using, use the O and B::Deparse modules. The `-p` flag produces full, explicit parenthetic grouping. (The useless assignments just produce some more grouping examples.) `c:\@Work\Perl\monks>perl -wMstrict -MO=Deparse,-p -le "sub not_too_many_dots { return $_[0] =~ tr/.// <= 1; } ;; for my $s ('', qw(. .. ... ....)) { my $g = my $f = not_too_many_dots($s); printf qq{'$s' %stoo many dots \n}, $f ? 'NOT ' : ''; print $g; } " BEGIN { $^W = 1; } BEGIN { $/ = "\n"; $\ = "\n"; } sub not_too_many_dots { use strict 'refs'; return((($_[0] =~ tr/.//) <= 1)); } use strict 'refs'; foreach my($s) (('', ('.', '..', '...', '....'))) { (my $g = (my $f = not_too_many_dots($s))); printf("'${s}' %stoo many dots \n", ($f ? 'NOT ' : '')); print($g); } -e syntax OK`
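The `@_` aliasing mentioned in Update 1 can be sketched as follows. The sub names are hypothetical and chosen only for this illustration; the point is that `$_[0]` refers to the caller's own variable, while `my ($str) = @_;` makes a copy first.

```perl
use strict;
use warnings;

# Works directly on the alias: no copy of the argument is made.
sub dots_via_alias { return $_[0] =~ tr/.//; }

# Copies the argument into a lexical first, then counts.
sub dots_via_copy  { my ($str) = @_; return $str =~ tr/.//; }

# Shows that $_[0] really is the caller's variable: deleting the dots
# through the alias changes the variable that was passed in.
sub strip_dots { $_[0] =~ tr/.//d; return; }

my $version = '5.8.9';
print dots_via_alias($version), ' ', dots_via_copy($version), "\n";  # prints "2 2"

strip_dots($version);
print "$version\n";                                                   # prints "589"
```

Whether skipping the copy accounts for a measurable part of the speedup would need a benchmark; the sketch only demonstrates the aliasing behavior itself.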
Re^2: 4x faster now! (updated) by h2 (Beadle) on Jul 19, 2018 at 20:29 UTC
Re^3: 4x faster now! (updated) by AnomalousMonk (Archbishop) on Jul 19, 2018 at 21:50 UTC
Re: 4x faster now! by kcott (Archbishop) on Jul 19, 2018 at 12:36 UTC
G'day h2, "... and ended up with a 4x (!) speed improvement ..." You might like to take a look at "perlperf - Perl Performance and Optimization Techniques" which discusses this, amongst other things, and includes benchmarks. While `y///` and `s///` are not always interchangeable, when they can provide the same functionality, I've generally found `y///` to be measurably faster than `s///`. - Ken
Re^2: 4x faster now! by h2 (Beadle) on Jul 19, 2018 at 20:37 UTC
Re^3: 4x faster now! by kcott (Archbishop) on Jul 20, 2018 at 10:07 UTC
Re: Syntax Perl Version support $c = () = $a =~ /\./g by Marshall (Canon) on Jul 18, 2018 at 01:12 UTC
That is a rather odd statement, but completely valid since at least Perl 5.6 (released in 2000). `my @anyname = some match regex global expression; my $c = @anyname; #scalar value of number of elements in @anyname # The Goatse means the same thing but you don't have to have the array @anyname # Updated spelling of Goatse, one "e", not two - Ooops thanks Your Mother` In general my code usually processes each match from a "match global". I think I've used this construct before, but this is definitely not common. More common is whether or not a match exists - not how many.
Re: Syntax Perl Version support $c = () = $a =~ /\./g by ikegami (Patriarch) on Jul 18, 2018 at 09:32 UTC
At least since 5.6. See Mini-Tutorial: Scalar vs List Assignment Operator and more specifically Perl Idioms Explained - my $count = () = /.../g.
Re: Syntax Perl Version support $c = () = $a =~ /\./g by Anonymous Monk on Jul 17, 2018 at 19:42 UTC
And wouldn't it be the perfect thing to put in a descriptively-named one line subroutine that hides the Perl-voodoo in only one place, and tells everyone else exactly what it's supposed to do? Mmmmm....? Perl has lots of ways to write something such that you don't know at a glance what it's supposed to do, and it doesn't actually do what you intended for it to do or thought that it did, either. Sure would be nice to be able to fix any problems in one place.
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by Your Mother (Archbishop) on Jul 17, 2018 at 20:50 UTC
"Perl has lots of ways to write something such that you don't know at a glance what it's supposed to do" Well… Perl programmers do. It's weird… it's like this site is almost dedicated to those people. Maybe a name change would help combat the lack of clarity perpetually befuddling you. `s/Perl Monks/Perl Programmers Who Like Perl and Also Enjoy Learning, Exploring Problems, and Helping Their Community; Unemployable PMs Are Welcome to Participate Providing They Remain Respectful and Don't Become Founts of Acrimony or Wasted Time for Years on End/g;`
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by h2 (Beadle) on Jul 17, 2018 at 20:03 UTC
Anonymous Monk, that is precisely where this is being placed: inside a one-line utility subroutine that is documented as to what it tests. In this case it returns 1 (true) if the match is correct and undef (false) if not, as part of a larger test, but as a one-liner. I have used arcane syntax like this internally in utility tools, and try to keep it out of the main logic that would be likely to get patches or pull requests. Documented, with comments, etc. Now that this is confirmed safe, I'll be extending its use for all numeric tests (which is what this particular case is doing), and look into more secret type constructions for various utilities that are not meant to be end user readable or serviceable, but are meant to be very very fast.
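A rough sketch of what such a documented utility subroutine could look like. The name `has_at_most_one_dot` and the exact test are assumptions made for illustration, not h2's actual code; only the "return 1 or undef" convention is taken from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the string contains at most one '.',
# undef otherwise. The countof idiom ( = () = ) lives only in here.
sub has_at_most_one_dot {
    my $dots = () = $_[0] =~ /\./g;   # number of '.' matches
    return $dots <= 1 ? 1 : undef;
}

for my $candidate ('42', '3.14', '1.2.3') {
    printf "%-6s %s\n", $candidate,
        has_at_most_one_dot($candidate) ? 'ok' : 'too many dots';
}
```

Callers only ever see the descriptive name, so the idiom can later be swapped for `tr/.//` in one place without touching the main logic.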
Re^3: Syntax Perl Version support $c = () = $a =~ /\./g by eyepopslikeamosquito (Archbishop) on Jul 17, 2018 at 22:20 UTC
Note that the content of the "anonymous" reply you replied to indicates it was authored by our Worst Nodes champion - who's recently introduced an annoying new tactic of replying "anonymously" ... and then sometimes agreeing with himself by replying to the "anonymous" post as sundialsvc4! I don't know why he's started doing this only recently. FWIW, I've often used the goatse operator in production code at work without wrapping it in a sub - though I always provide a one-line comment pointing to the goatse operator documentation. Having discussed the use of this operator during code reviews with a number of serious C++ programmers (but occasional Perl programmers) at work, I can report they've all been happy with this approach because, as skilled programmers, once they became aware it's a standard Perl idiom, they found the perlsecret documentation clear and easy to understand.
Re^4: Syntax Perl Version support $c = () = $a =~ /\./g by h2 (Beadle) on Jul 17, 2018 at 22:36 UTC
Re^2: Syntax Perl Version support $c = () = $a =~ /\./g by Anonymous Monk on Jul 24, 2018 at 18:21 UTC
"and tells everyone else exactly what it's supposed to do?" It does. You big dummy.	[reply]