If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
Bizarre copy of ARRAY etc - solved
No replies — Read more | Post response
by etj
on May 13, 2024 at 17:07
    A simple XS function in PDL, firstvals_nophys, was giving "panic: attempt to copy freed scalar", but only if called on a complex-valued ndarray. The aim of this post is to appear when a despairing XS programmer googles that message, and give them another thing to check. When constructing a test to capture this, another message that appeared was "Bizarre copy of ARRAY". This is the old text of the function:
    void
    firstvals_nophys(x)
      pdl *x
      PPCODE:
        if (!(x->state & PDL_ALLOCATED))
          barf("firstvals_nophys called on non-ALLOCATED %p", x);
        PDL_Indx i, maxvals = PDLMIN(10, x->nvals);
        EXTEND(SP, maxvals);
        for(i=0; i<maxvals; i++) {
          PDL_Anyval anyval = pdl_get_offs(x, i);
          if (anyval.type < 0) barf("Error getting value, type=%d", anyval.type);
          SV *sv = sv_newmortal();
          ANYVAL_TO_SV(sv, anyval);
          PUSHs(sv);
        }
    The problem was that the ANYVAL_TO_SV macro was, only for complex-valued data, calling a Perl function to create a Math::Complex object (well, a subclass thereof because the overloads were wrong). That obviously uses the top of the stack, including writing values into it, and reading values out of it, including mortal ones that then got freed because they were done with. Therefore, the function was returning with some garbage on the stack, but the last value was correct.

    The solution was simply to do a PUTBACK after the PUSH, which moves the top of the stack above data we care about. The new text of the function with that:

    void
    firstvals_nophys(x)
      pdl *x
      PPCODE:
        if (!(x->state & PDL_ALLOCATED))
          barf("firstvals_nophys called on non-ALLOCATED %p", x);
        PDL_Indx i, maxvals = PDLMIN(10, x->nvals);
        EXTEND(SP, maxvals);
        for(i=0; i<maxvals; i++) {
          PDL_Anyval anyval = pdl_get_offs(x, i);
          if (anyval.type < 0) barf("Error getting value, type=%d", anyval.type);
          SV *sv = sv_newmortal();
          ANYVAL_TO_SV(sv, anyval);
          PUSHs(sv);
          PUTBACK;
        }
    The commit that fixed this is at https://github.com/PDLPorters/pdl/commit/68389413537c6ea7ed85f121a580b3b008ba82a6. The docs on how to call a Perl function from C are at https://perldoc.perl.org/perlcall, with a full explanation of PUSHMARK, PUSH*, PUTBACK, and (after the call) SPAGAIN, and maybe POP* and maybe then PUTBACK.
Debugger issue solved (two years ago)
2 direct replies — Read more / Contribute
by talexb
on May 06, 2024 at 20:57

    I upgraded a machine to the latest Ubuntu and got Perl 5.34.0, which included a problem with the debugger:

    DB<4> v
    Undefined subroutine &DB::cmd_l called at /usr/share/perl/5.34/perl5db.pl line 6034.
     at /usr/share/perl/5.34/perl5db.pl line 6034.
            DB::cmd_v("v", "", 56) called at /usr/share/perl/5.34/perl5db.pl line 4798
            DB::cmd_wrapper("v", "", 56) called at /usr/share/perl/5.34/perl5db.pl line 4311
            DB::Obj::_handle_cmd_wrapper_commands(DB::Obj=HASH(0x55e85838c150)) called at /usr/share/perl/5.34/perl5db.pl line 3200
            DB::DB called at report_warnings.pl line 56
    Debugged program terminated.  Use q to quit or R to restart,
    use o inhibit_exit to avoid stopping after program termination,
    h q, h R or h o to get additional info.
    DB<4>
    My friend Google brought me to this page that had links to the necessary patch to perl5db.pl on github, bringing the script from version 1.60 to 1.60_01.

    And, of course, the patch worked just fine. What a great community. Thanks for the patch!

    PS: Ugh, sorry -- the problem was that the v command that I use a lot (Where am I? Oh, there I am!) crashed the debugger. The patch solves that problem.

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

The Virtue of Laziness
4 direct replies — Read more / Contribute
by jo37
on May 04, 2024 at 17:57

    Dear Monks and Nuns,

    Lately I came across an issue with dereferencing array refs. It looks like there is some hidden "lazy deref" when an array ref is used in a foreach loop, compared to its use as a sub argument. Consider these two subs that do nothing but die. The main difference is whether the array dereference is an argument to foreach or to map.

    use experimental 'signatures';

    sub map_die ($ar) {
        eval {map {die} @$ar};
    }

    sub for_die ($ar) {
        eval {
            for (@$ar) {
                die;
            }
        }
    }
    Benchmarking is amazing:
    use Benchmark 'cmpthese';

    my @arr = ((0) x 1e6);

    cmpthese(0, {
        map => sub {map_die(\@arr)},
        for => sub {for_die(\@arr)},
    });

    __DATA__
             Rate     map     for
    map    1257/s      --   -100%
    for 1664823/s 132352%      --

    Then I remembered the "lazy generators" from List::Gen and gave it a try. There is some progress, but it cannot come up to foreach.

    use List::Gen 'array';

    sub gen_die ($ar) {
        eval {
            &array($ar)->map(sub {die});
        }
    }

    cmpthese(0, {
        map => sub {map_die(\@arr)},
        for => sub {for_die(\@arr)},
        gen => sub {gen_die(\@arr)},
    });

    __DATA__
             Rate     map     gen     for
    map    1316/s      --    -93%   -100%
    gen   18831/s   1330%      --    -99%
    for 1662271/s 126174%   8727%      --

    Wouldn't it be nice to have some kind of "explicit lazy dereferencing" in Perl?
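
    Until something like that exists, one possible workaround (a minimal sketch, not from the original post) is to loop over indices instead of over the dereferenced list. It relies on the behaviour described in perlop that a range used directly in foreach is counted through without building a temporary list. The helper name first_match and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first element for which $test is true,
# touching the elements one index at a time instead of flattening @$ar.
sub first_match {
    my ($ar, $test) = @_;
    for my $i (0 .. $#$ar) {
        return $ar->[$i] if $test->($ar->[$i]);
    }
    return;
}

my @arr     = ((0) x 10, 7, (0) x 10);
my $visited = 0;
my $found   = first_match(\@arr, sub { $visited++; $_[0] > 0 });

print "found $found after visiting $visited elements\n";
```

    The callback counts how many elements were actually inspected, so the early exit is visible without any timing.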

    Update May 10, 2024: Added the signatures feature as suggested by Danny in Re: The Virtue of Laziness.

    Greetings,
    🐻

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
Imager support for PNG, JPEG and GIF on macOS Sonoma
1 direct reply — Read more / Contribute
by Anonymous Monk
on Apr 20, 2024 at 22:51
    Installing our beloved Imager on macOS Sonoma with support for PNG, JPEG and GIF involves some pain. Here are the results of my struggle so others won't have to. Do this after installing Imager (requires Homebrew):
    brew install pkg-config
    
    brew install libpng
    pkg-config --cflags libpng
    cpan Imager::File::PNG
    
    brew install jpeg
    pkg-config --cflags libjpeg
    cpan Imager::File::JPEG
    
    brew install giflib
    cpan Imager::File::GIF
    
    GIF: Test code failed: Can't link/include 'gif_lib.h', 'stdio.h', 'errno.h', 'string.h', 'gif'...
    ! Configure failed for Imager-File-GIF-0.98. See /Users/you/.cpanm/work/1713652269.80239/build.log for details.
    
    cd /Users/you/.cpanm/work/1713652269.80239/Imager-File-GIF-0.98
    
    perl Makefile.PL -v --incpath=/opt/homebrew/include --libpath=/opt/homebrew/var/homebrew/linked/giflib/lib
    
    make
    make test
    make install
    
    For some reason the following didn't work with the cpan client:
    o conf makepl_arg "LIBS=-L/opt/homebrew/var/homebrew/linked/giflib/lib INC=-I/opt/homebrew/include"
    
    PS - pkg-config can't find giflib so the paths were found like this:
    sudo /usr/libexec/locate.updatedb
    
    locate giflib
    /opt/homebrew/var/homebrew/linked/giflib
    
    locate gif_lib.h        
    /opt/homebrew/Cellar/giflib/5.2.1/include/gif_lib.h
    /opt/homebrew/include/gif_lib.h
    
CPAN autobundle fail
2 direct replies — Read more / Contribute
by Anonymous Monk
on Apr 10, 2024 at 13:50
    I was trying to autobundle an old perl setup, but cpan just sat there failing to contact mirrors (I'm a cpanm user, so that mirror list was probably ancient). This inspired me to visit https://www.cpan.org/SITES.html, where it says www.cpan.org doesn't do mirrors anymore. I created the autobundle like so:
    cpan -M https://www.cpan.org -a
UK tax system uses Perl
1 direct reply — Read more / Contribute
by Bod
on Apr 08, 2024 at 10:47

    I've just found out the HMRC (the UK's taxation department) uses Perl for at least some of its operations...

    The system is troubled today and I received this error revealing the language

    Ref: /home/ewf/MODULES/Common/PaymentApiService.pm Error Code 401 at line 30

    Update: corrected typo in title

[OT] The Long List is Long resurrected
3 direct replies — Read more / Contribute
by marioroy
on Apr 06, 2024 at 01:23

    Recently, I tried NVIDIA's nvc++ compiler and noticed a compile time regression (> 40 seconds). That's a long time. I reached out to NVIDIA. Though, no resolution from their side. Some time later, I saw another regression upgrading GCC from version 13 to 14. It seems that std::mutex regressed.

    That spawned a chain reaction. I appended the March and April 2024 events to my summary page.

    A spinlock mutex class resolved both issues, further improving performance. I reached out to Greg, author of the phmap C++ library. The LLiL challenge is interesting. He shared tips plus the string_cnt struct, accommodating fixed-length and variable-length keys.

    Clang performance regression due to GCC 14
    phmap: Ability to reset the inner sub map
    string_cnt struct commit by Greg

    eyepopslikeamosquito's llil3vec.cpp (scroll down the page) was the inspiration for making the vector version llil4vec fast (though it gobbles memory for unlimited-length strings), and it is the reason for the effort to make the map variants catch up while keeping memory consumption low. There are several map variants. The llil4hmap, llil4emh, and llil4umap demonstrations compute the key hash_value only once and store it with the key.

    llil4map sub maps managed by the C++ library, using phmap::parallel_flat_hash_map
    llil4map2 memory efficient version, vector of pointers to the map key-value pairs
    llil4hmap sub maps managed by the application, using phmap::flat_hash_map
    llil4emh sub maps managed by the application, using emhash7::HashMap
    llil4umap sub maps managed by the application, using std::unordered_map

    Greg Popovitch added a clever new type string_cnt, supporting fixed-length and long strings without recompiling. See enhanced llil4map found in his examples folder, also memory efficient. Note: I beautified the include directives in my demonstrations, matching your version for consistency.

    I glanced through the long thread by eyepopslikeamosquito. At the time, we were thrilled to complete in less than 10 seconds, and more so below 6 seconds. Fast forward to early 2024: the llil4map demonstration completes in 0.3 seconds, due to linear scaling capabilities (all levels parallel). The llil4vec example may run faster, but stops improving as more CPU cores are added.

    We reached the sub-second territory, processing three input files.

    Limit 12 CPU Cores

    $ NUM_THREADS=12 ./llil4map big{1,2,3}.txt | cksum
    llil4map (fixed string length=12) start
    use OpenMP
    use boost sort
    get properties      0.311 secs
    map to vector       0.055 secs
    vector stable sort  0.095 secs
    write stdout        0.036 secs
    total time          0.500 secs
        count lines     10545600
        count unique    10367603
    2956888413 93308427

    $ NUM_THREADS=12 ./llil4vec big{1,2,3}.txt | cksum
    llil4vec (fixed string length=12) start
    use OpenMP
    use boost sort
    get properties      0.170 secs
    sort properties     0.061 secs
    vector reduce       0.017 secs
    vector stable sort  0.065 secs
    write stdout        0.047 secs
    total time          0.362 secs
        count lines     10545600
        count unique    10367603
    2956888413 93308427

    No Limit

    $ ./llil4map big{1,2,3}.txt | cksum
    llil4map (fixed string length=12) start
    use OpenMP
    use boost sort
    get properties      0.101 secs
    map to vector       0.052 secs
    vector stable sort  0.115 secs
    write stdout        0.029 secs
    total time          0.298 secs
        count lines     10545600
        count unique    10367603
    2956888413 93308427

    $ ./llil4vec big{1,2,3}.txt | cksum
    llil4vec (fixed string length=12) start
    use OpenMP
    use boost sort
    get properties      0.203 secs
    sort properties     0.088 secs
    vector reduce       0.024 secs
    vector stable sort  0.103 secs
    write stdout        0.029 secs
    total time          0.449 secs
        count lines     10545600
        count unique    10367603
    2956888413 93308427

    Many thanks eyepopslikeamosquito for being there. I promise no more messages for a while. You inspired C++ in me. Next is Taichi Lang.

    Greg Popovitch shared a tip for releasing memory immediately, during "map to vector". Thank you. That was helpful also, for llil4umap.

    Blessings and grace,

Google Research releases... PERL!
3 direct replies — Read more / Contribute
by salva
on Mar 20, 2024 at 10:32
Changes in MooX::Role::Parameterized
No replies — Read more | Post response
by choroba
on Mar 17, 2024 at 16:12

    What is it good for?

    If you’ve never worked with MooX::Role::Parameterized or MooseX::Role::Parameterized, you might wonder what a parameterized role is at all.

    Roles are used when you need to share behaviour among several classes that don’t have to be related by inheritance. Normally, a role just adds a bunch of methods to the class that consumes it (there’s more, you can for example specify which other methods the role expects to already exist).

    A parameterized role makes it possible to provide parameters for the consumed role. This way, you can adjust the behaviour for each consuming class.
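
    To make the idea concrete without any CPAN dependency, here is a minimal hand-rolled sketch in core Perl. It is not how MooX::Role::Parameterized is implemented; the package names Sketch::Role::Attribute, Sketch::Box and Sketch::Person and the method apply_to are invented for illustration. The same "role" is consumed by two unrelated classes, and the name parameter decides which accessor each class receives.

```perl
use strict;
use warnings;

# Hypothetical hand-rolled "role": installs a read-only accessor whose
# name is taken from the parameters supplied by the consuming class.
package Sketch::Role::Attribute {
    sub apply_to {
        my ($role, $target, %params) = @_;
        my $name = $params{name};
        no strict 'refs';
        *{"${target}::${name}"} = sub { my ($self) = @_; return $self->{$name} };
        return;
    }
}

package Sketch::Box {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    Sketch::Role::Attribute->apply_to(__PACKAGE__, name => 'size');
}

package Sketch::Person {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    Sketch::Role::Attribute->apply_to(__PACKAGE__, name => 'age');
}

package main;

my $box    = Sketch::Box->new(size => 2);
my $person = Sketch::Person->new(age => 42);

print 'box size: ',   $box->size,   "\n";
print 'person age: ', $person->age, "\n";
```

    Each consuming class only gets the accessor it asked for: Sketch::Box has size but no age, and vice versa.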

    The old syntax

    The standard syntax to apply a role to a class before version 0.100 of the module was to use the apply class method:

    # My/Role.pm
    package My::Role;
    use Moo::Role;
    use MooX::Role::Parameterized;

    role {
        my ($params, $mop) = @_;
        $mop->has($params->{name} => is => $params->{is});
    }

    # My/Obj.pm
    package My::Obj;
    use Moo;
    use My::Role;

    'My::Role'->apply({
        name => 'size',
        is   => 'ro',
    });

    If we now created an object $o using my $o = 'My::Obj'->new(size => 2), we could get the value of the attribute size using the $o->size getter: the role created a new read-only attribute size for us.

    The old experimental syntax

    What I didn’t like about applying a role to a class the old standard way was it wasn’t declarative. You could easily overlook it as a block of code happening at runtime, while the meaning of the code was This is how a role is consumed. Therefore, I used the alternative experimental syntax:

    package My::Obj;
    use Moo;
    use MooX::Role::Parameterized::With 'My::Role' => {
        name => 'size',
        is   => 'ro',
    };

    It's part of a use clause, so it’s clear that it’s happening at compile time.

    The new syntax

    I promoted one of my side-jobs to a full-time job recently. They gave me a new computer where I had to install all my code base to start working on it 8 hours a day instead of a couple a month.

    Imagine my surprise when the code stopped with an error:

    Can't locate object method "size" via package "My::Obj" at ./run.pl line 37.

    Line 37 was where I called $o->size!

    When installing the dependencies for my code, the most recent version of MooX::Role::Parameterized was installed from CPAN (0.501). The experimental syntax is no longer documented and as I found out, doesn’t work anymore.

    The old non-experimental syntax still works, but there’s a new syntax, too. It uses the with keyword that looks like the one that can be used to consume a Moo::Role, but if we first use MooX::Role::Parameterized::With, it can also accept parameters for the role application.

    package My::Obj;
    use Moo;
    use MooX::Role::Parameterized::With 0.501;

    with 'My::Role' => {
        name => 'size',
        is   => 'ro',
    };

    Moreover, we should change the definition of the role, too. Parameters should be predeclared using the parameter keyword (similarly to MooseX::Role::Parameterized), and they can be then accessed via getters instead of peeking inside a parameter hash reference.

    package My::Role;
    use Moo::Role;
    use MooX::Role::Parameterized 0.501;

    parameter name => (is => 'ro');
    parameter is   => (is => 'ro');

    role {
        my ($params, $mop) = @_;
        $mop->has($params->name => is => $params->is);
    }

    Note: Published to my blog, too.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Moved from 'Today I Learned' (2024-02-14 02:31:10)
No replies — Read more | Post response
by Discipulus
on Feb 13, 2024 at 21:31
    perl -MData::ICal -MData::Dump -e "my $cal = Data::ICal->new(filename => $ARGV[0]); dd @{$cal->entries};" sample.ics
Call for Speakers for the 2024 Carolina Code Conference is open until April 15th
1 direct reply — Read more / Contribute
by brightball
on Jan 09, 2024 at 14:35

    A polyglot conference for all who code!

    Just wanted to drop by and let everyone know that our Call for Speakers has opened for 2024. We're growing this year, doubling from 150 to 300 attendees, with a 2-day event in beautiful Greenville, SC.

    https://blog.carolina.codes/p/happy-new-year-call-for-speakers

    If you'd like a look at our speakers from last year, we had a solid line up headlined by keynotes from Charles Nutter (jRuby) and Bruce Tate (7 Languages in 7 Weeks, active with Elixir) for a total of 15 speakers on a single track.

    https://blog.carolina.codes/p/announcing-our-2023-speakers

    We utilize a staggered schedule throughout the day with alternating 30 minute (moderate) and 10 minute (lightning) talks between keynotes to showcase a variety of topics to the audience while keeping everyone engaged. We're committed to maintaining the single track to ensure that our speakers have the entire audience available. There's nothing worse than being selected for a conference and then ending up in a room where nobody shows up to your talk.

    All talks are professionally recorded and published.

    Please let me know if you have any questions. It's an excellent opportunity to showcase cool Perl things to a broad audience.

Have you ever lost your work?
9 direct replies — Read more / Contribute
by harangzsolt33
on Jan 08, 2024 at 12:22
    Recently I was writing a simple script (maybe 200 lines total), and I quickly did something, though I forget what. Whatever it was, it deleted my script completely, without a trace!!! I may have accidentally pressed a bad key combination in Notepad2, or I don't know what happened. I ran the script. It ran without errors. It was finally working the way I wanted it to. So I closed the editor. I had originally saved it on my desktop, and now it was gone. It wasn't in the Trash bin either. It was completely gone! How did it disappear in a flash? Have you ever had a similar experience where you worked on something and it mysteriously disappeared without a trace?

    I have a program called Recuva which runs on Windows and looks for deleted files and tries to restore them. But it did not find the file I was looking for. I was really surprised, because usually if you just delete something and then immediately go to Recuva, it will find that file. And chances are it may still be able to restore it. But it couldn't even locate the file. So, my next thought was I'm going to open HxD which is a hex editor which can view files, memory, or disks. I selected the main hard drive, which is 2TB. And I thought, how am I going to find this script? I thought of a unique line which is something that appears in my script that likely does not appear anywhere else, and I typed that into search. I thought, this will take forever. But no! It found it within 1 minute, and I was able to salvage my script! It's a miracle!!! This is one reason why you don't want to encrypt your hard drive. Lol

    ( I guess, this post could be formatted as a question for vote, but I'm not sure. It's just something I thought of. Not really Perl related or PerlMonks website related, so I wasn't sure where to write this. )

Speed of simple pattern count. A comparison
1 direct reply — Read more / Contribute
by rsFalse
on Jan 06, 2024 at 21:53
    Hello,

    I've just played with various ways to count 1-2 letter patterns in a longer string, and compared their speed. If the task is to count the exact occurrences of a substring in a non-overlapping manner, there are many ways to do it. And if the task is to count a single ASCII character, there are even more.
    Some ways (e.g. chop and chomp) destroy the target string, so I made a copy of it every time.
    I used several perl versions, up to 5.38.2 (year 2023).

    UPDATE. After jwkrahn's comment (Re: Speed of simple pattern count. A comparison) I added variants with index and rindex. I also added a substr variant, which simply takes one character at a time (I did not include this variant for the 2-character pattern, because it would also need to advance the position when searching for non-overlapping matches).

    Count a single character variations and speed:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark 'cmpthese';

    my $target = 'abc' x 1e4;

    cmpthese(-1,{
        'tr / y' => sub {
            my $m = 0;
            $m = $target =~ y/a//;
        },
        '=()=' => sub {
            my $m = 0;
            $m = () = $target =~ m/a/g;
        },
        'while' => sub {
            my $m = 0;
            $m ++ while $target =~ m/a/g;
        },
        '(?{})' => sub {
            my $m = 0;
            $target =~ m/a(?{ $m ++ })(*F)/g;
        },
        'split_grep' => sub {
            my $m = 0;
            $m = grep $_ eq 'a', split '', $target;
        },
        'split_by' => sub {
            my $m = 0;
            $m = -1 + split 'a', 'x' . $target . 'x';
        },
        'chop' => sub {
            my $m = 0;
            my $target2 = $target;
            $m += ( 'a' eq chop $target2 ) while length $target2;
        },
        'chomp' => sub {
            my $m = 0;
            my $target2 = $target;
            local $/ = 'a';
            0 while chomp $target2 and ++ $m or chop $target2;
        },
        'index' => sub {
            my $m = 0;
            my $pat_len = length 'a';
            my $pos = -$pat_len;
            $m ++ while -1 < ( $pos = index $target, 'a', $pos + $pat_len );
        },
        'rindex' => sub {
            my $m = 0;
            my $pat_len = length 'a';
            # Start past the end so the first search covers the last possible match.
            my $pos = length $target;
            $m ++ while -1 < ( $pos = rindex $target, 'a', $pos - $pat_len );
        },
        'substr' => sub {
            my $m = 0;
            # Visit every index, including the last character.
            $m += ( 'a' eq substr $target, $_, 1 ) for 0 .. -1 + length $target;
        },
    });
    OUTPUT:
    perl-5.38.2
    ==========
                  Rate split_grep  (?{}) substr   chop  chomp   =()=  while  index rindex split_by tr / y
    split_grep   214/s         --    -4%   -23%   -36%   -47%   -68%   -70%   -77%   -79%     -99%   -99%
    (?{})        222/s         4%     --   -20%   -34%   -45%   -67%   -69%   -77%   -78%     -98%   -99%
    substr       279/s        30%    26%     --   -17%   -31%   -59%   -61%   -71%   -73%     -98%   -99%
    chop         336/s        57%    51%    20%     --   -16%   -50%   -53%   -65%   -67%     -98%   -99%
    chomp        403/s        88%    81%    44%    20%     --   -40%   -44%   -58%   -61%     -97%   -99%
    =()=         673/s       214%   203%   141%   100%    67%     --    -6%   -29%   -34%     -95%   -98%
    while        718/s       235%   223%   157%   114%    78%     7%     --   -24%   -30%     -95%   -98%
    index        948/s       342%   326%   240%   182%   135%    41%    32%     --    -8%     -93%   -97%
    rindex      1028/s       380%   362%   268%   206%   155%    53%    43%     8%     --     -93%   -97%
    split_by   14354/s      6599%  6359%  5045%  4170%  3466%  2032%  1899%  1415%  1297%       --   -59%
    tr / y     34600/s     16047% 15470% 12301% 10192%  8496%  5039%  4718%  3551%  3267%     141%     --

    perl-5.32.0
    ==========
                  Rate split_grep  (?{}) substr  chomp   chop   =()=  while rindex  index split_by tr / y
    split_grep   163/s         --   -16%   -45%   -46%   -53%   -68%   -78%   -83%   -83%     -99%  -100%
    (?{})        194/s        19%     --   -34%   -36%   -44%   -62%   -73%   -79%   -79%     -99%   -99%
    substr       296/s        82%    52%     --    -2%   -14%   -42%   -59%   -68%   -68%     -98%   -99%
    chomp        302/s        85%    55%     2%     --   -13%   -41%   -59%   -68%   -68%     -98%   -99%
    chop         345/s       112%    78%    17%    14%     --   -32%   -53%   -63%   -63%     -98%   -99%
    =()=         511/s       213%   163%    72%    69%    48%     --   -30%   -46%   -46%     -97%   -99%
    while        731/s       348%   276%   147%   142%   112%    43%     --   -22%   -22%     -96%   -98%
    rindex       939/s       476%   383%   217%   211%   172%    84%    28%     --    -0%     -95%   -97%
    index        939/s       476%   383%   217%   211%   172%    84%    28%     0%     --     -95%   -97%
    split_by   18553/s     11271%  9442%  6162%  6048%  5272%  3532%  2436%  1876%  1875%       --   -46%
    tr / y     34462/s     21022% 17623% 11531% 11319%  9878%  6645%  4611%  3570%  3568%      86%     --

    perl-5.20.1
    ==========
                  Rate split_grep substr   chop   =()=  chomp  (?{})  while  index rindex split_by tr / y
    split_grep   181/s         --   -25%   -38%   -41%   -41%   -54%   -73%   -74%   -74%     -99%   -99%
    substr       241/s        33%     --   -17%   -21%   -21%   -38%   -64%   -65%   -66%     -99%   -99%
    chop         290/s        61%    20%     --    -5%    -5%   -26%   -57%   -58%   -59%     -98%   -99%
    =()=         304/s        68%    26%     5%     --    -1%   -22%   -55%   -56%   -57%     -98%   -99%
    chomp        307/s        70%    27%     6%     1%     --   -22%   -54%   -56%   -56%     -98%   -99%
    (?{})        392/s       117%    62%    35%    29%    28%     --   -42%   -44%   -44%     -98%   -99%
    while        673/s       273%   179%   132%   121%   120%    72%     --    -3%    -4%     -96%   -98%
    index        697/s       286%   189%   140%   129%   127%    78%     4%     --    -1%     -96%   -98%
    rindex       704/s       289%   192%   142%   131%   129%    80%     5%     1%     --     -96%   -98%
    split_by   16747/s      9166%  6845%  5665%  5403%  5360%  4177%  2387%  2302%  2280%       --   -51%
    tr / y     34296/s     18876% 14124% 11707% 11170% 11081%  8658%  4994%  4819%  4773%     105%     --
    Transliteration (tr/// or y///) is by far the fastest.
    Splitting by the search value is the second fastest, although it counts inversely, i.e. it counts the chunks that did not match. It is important to note the edge cases here: the pattern may match right at the beginning and/or right at the end, so I add 'x' on both sides, and the added 'x' should not be a substring of the pattern.
    The other ways are much slower.
    Looking across versions, I can see that (?{}) became about 2x slower between 5.20 and 5.32. The results for perl 5.14 (not shown here) are similar to 5.20, except that I need to use an 'our $m' variable with the (?{}) variant.
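
    The split edge case mentioned above can be checked with a small sketch. The test string 'abca' is made up for illustration; it has the pattern at both the start and the end, which is exactly where an unpadded split miscounts.

```perl
use strict;
use warnings;

my $s = 'abca';    # pattern 'a' sits at both the start and the end

# Reference count via transliteration (works for single characters only).
my $by_tr = ($s =~ tr/a//);

# Unpadded split: trailing empty fields are dropped, so the final 'a' is lost.
my @naive          = split /a/, $s;
my $by_naive_split = @naive - 1;

# Padded split: the sentinels keep every separator between two non-empty fields.
my @padded          = split /a/, "x${s}x";
my $by_padded_split = @padded - 1;

print "tr: $by_tr, naive split: $by_naive_split, padded split: $by_padded_split\n";
```

    With this input the unpadded split reports one occurrence fewer than tr, while the padded split agrees with tr.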

    And here are the variations (fewer than for the single-character search) with a pattern of two characters:
    cmpthese(-1,{
        '=()=' => sub {
            my $m = 0;
            $m = () = $target =~ m/ab/g;
        },
        'while' => sub {
            my $m = 0;
            $m ++ while $target =~ m/ab/g;
        },
        '(?{})' => sub {
            my $m = 0;
            $target =~ m/ab(?{ $m ++ })(*F)/g;
        },
        'split_by' => sub {
            my $m = 0;
            $m = -1 + split 'ab', 'x' . $target . 'x';
        },
        'chomp' => sub {
            my $m = 0;
            my $target2 = $target;
            local $/ = 'ab';
            0 while chomp $target2 and ++ $m or chop $target2;
        },
        'index' => sub {
            my $m = 0;
            my $pat_len = length 'ab';
            my $pos = -$pat_len;
            $m ++ while -1 < ( $pos = index $target, 'ab', $pos + $pat_len );
        },
        'rindex' => sub {
            my $m = 0;
            my $pat_len = length 'ab';
            # Start past the end so the first search covers the last possible match.
            my $pos = length $target;
            $m ++ while -1 < ( $pos = rindex $target, 'ab', $pos - $pat_len );
        },
    });
    OUTPUT:
    perl-5.38.2
    ==========
               Rate  (?{})  chomp   =()=  while rindex  index split_by
    (?{})     216/s     --   -64%   -70%   -75%   -78%   -81%     -96%
    chomp     599/s   177%     --   -17%   -31%   -40%   -47%     -90%
    =()=      718/s   232%    20%     --   -18%   -28%   -36%     -88%
    while     872/s   303%    46%    21%     --   -13%   -22%     -85%
    rindex    999/s   362%    67%    39%    15%     --   -11%     -83%
    index    1120/s   418%    87%    56%    28%    12%     --     -81%
    split_by 5749/s  2559%   860%   701%   559%   475%   413%       --

    perl-5.32.0
    ==========
               Rate  (?{})  chomp   =()=  while  index rindex split_by
    (?{})     228/s     --   -59%   -64%   -75%   -77%   -78%     -95%
    chomp     555/s   143%     --   -12%   -39%   -44%   -47%     -89%
    =()=      627/s   175%    13%     --   -31%   -37%   -41%     -88%
    while     913/s   300%    65%    46%     --    -9%   -13%     -82%
    index     999/s   338%    80%    59%     9%     --    -5%     -80%
    rindex   1056/s   362%    90%    68%    16%     6%     --     -79%
    split_by 5046/s  2110%   810%   705%   452%   405%   378%       --

    perl-5.20.1
    ==========
               Rate   =()=  (?{})  chomp rindex  index  while split_by
    =()=      372/s     --   -19%   -22%   -54%   -54%   -56%     -96%
    (?{})     462/s    24%     --    -3%   -43%   -43%   -45%     -95%
    chomp     478/s    29%     4%     --   -41%   -41%   -43%     -95%
    rindex    807/s   117%    75%    69%     --    -1%    -3%     -92%
    index     814/s   119%    76%    70%     1%     --    -3%     -92%
    while     836/s   125%    81%    75%     4%     3%     --     -91%
    split_by 9752/s  2524%  2013%  1941%  1108%  1099%  1066%       --
    Splitting by the search pattern is much faster than the other variants. But to use it I need to think about edge cases and about possible overlaps created by appending or prepending the additional symbols. Update. However, splitting by the search pattern seems to have become almost 2x slower somewhere between perl-5.20 and 5.32.

    Update-2. When using while chop and while chomp, it is important to note that the loop condition terminates as soon as the pattern or the chopped character is 0. To overcome this limitation I should have added length, e.g. while length cho(m)p.
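
    A small sketch of that limitation and of the length guard, using the chomp variant. The sub names count_unguarded and count_guarded and the input 'a0a' are made up for illustration; the input contains a literal 0 between two matches.

```perl
use strict;
use warnings;

# Count trailing-'a' removals; anything else is chopped off one char at a time.
# Without a guard, chopping the character '0' yields a false value and ends the loop.
sub count_unguarded {
    my ($text) = @_;
    my $m = 0;
    local $/ = 'a';
    0 while chomp($text) and ++$m or chop($text);
    return $m;
}

# Same loop, but the chop branch is tested by length, so a chopped '0' still
# counts as progress and the loop only ends once the string is empty.
sub count_guarded {
    my ($text) = @_;
    my $m = 0;
    local $/ = 'a';
    0 while chomp($text) and ++$m or length(chop($text));
    return $m;
}

my $input = 'a0a';
printf "unguarded: %d, guarded: %d\n",
    count_unguarded($input), count_guarded($input);
```

    The unguarded loop stops at the 0 and misses the first 'a', while the guarded loop counts both.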
Tip: Create a "working" perl lib for your personal use
2 direct replies — Read more / Contribute
by nysus
on Jan 05, 2024 at 14:38

    This will sound silly, but it only recently dawned on me that a code library isn't just for finished, polished code. It can also hold works in progress that may not be pretty and are far from perfect, but are still useful in helping you get actual work done. This post summarizes a basic recipe for setting up a "working" library of modules and my workflow for developing them.

    But first, what exactly do I mean by a "working" library? It's a library for utilitarian modules that you have written to help you in your day-to-day coding or work. The modules are not polished enough, or are too specific, to release as a formal distribution to CPAN. But they still need tests to make sure they work. For example, I have a collection of modules I use to help me set up and configure WordPress sites on my local machine in Docker. Since these are for my own use, I don't bother setting up a git repo for these modules, tracking issues with them, or versioning them. I just write a test, update the module to pass the test, and then make use of the module right away. Except for comments, I don't bother documenting the modules, and most of them are simple enough that documentation isn't really needed. The goal is to keep the admin overhead of these modules to a minimum while still ensuring the modules work well enough.

    So here's what I did to set up my working library:

    1. Created a new directory for my modules. I have it set to ~/perl5/working_modules
    2. Added this path to the $PERL5LIB environment variable
    3. Added my modules to this directory
    4. Added a test directory to working_modules for holding tests: ~/perl5/working_modules/t. This directory can be anywhere on your hard drive, though. It does not have to be in working_modules.
    5. Placed any tests into the t directory. I organized the tests using the same structure as my modules in the working_modules directory. So a module called Some::Module has its tests in the working_modules/t/Some/Module/ directory.
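
    The steps above can be sketched as a single runnable script. Everything specific here is an assumption for illustration: a temporary directory stands in for ~/perl5/working_modules, the slugify function is an invented example of a rough utility, and unshift @INC plays the part of having the directory in $PERL5LIB.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Stand-in for ~/perl5/working_modules (a temp dir so the sketch runs anywhere).
my $lib = tempdir(CLEANUP => 1);
make_path(File::Spec->catdir($lib, 'Some'));

# A hypothetical work-in-progress module, Some::Module.
my $file = File::Spec->catfile($lib, 'Some', 'Module.pm');
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'END_MODULE';
package Some::Module;
use strict;
use warnings;

# Rough-and-ready helper: lower-case a title and join its words with dashes.
sub slugify {
    my ($title) = @_;
    (my $slug = lc $title) =~ s/[^a-z0-9]+/-/g;
    $slug =~ s/^-|-$//g;
    return $slug;
}
1;
END_MODULE
close $fh or die "Cannot close $file: $!";

# Same effect as listing the directory in $PERL5LIB.
unshift @INC, $lib;
require Some::Module;

my $slug = Some::Module::slugify('Hello, Working Library!');
print "$slug\n";
```

    In the real setup the check at the end would live in a test file under t/Some/Module/, typically written with Test::More, and the module would be picked up through $PERL5LIB without touching @INC.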

    The workflow is simple test driven development: First, write a test in the appropriate test file for the module you are going to add a new feature to and then write the code and run the test. Then get right back to work. I no longer have to worry about whether my tests are including the right path to my modules. Since all my modules are in the $PERL5LIB, I don't have that headache anymore. Before using a working library, I had modules spread out all over my hard drive for different projects I was working on making things very difficult and my code not very reusable or easy to find. Now I just throw a use statement into the code and I'm done.

    This obviously isn't anything groundbreaking or new. But if you're anything like me, it took a while before it dawns on you that useful, utilitarian code can dispense with a lot of formalities and you don't have to wait for it to be perfect or even good before you put it into a library.

    $PM = "Perl Monk's";
    $MC = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar Parson";
    $nysus = $PM . ' ' . $MC;
    Click here if you love Perl Monks

Using regex as an alternative to usual loops on 1D data (Using surrogate string)
No replies — Read more | Post response
by rsFalse
on Dec 30, 2023 at 14:28
    Hello,

    Here I describe an idea and share several examples of using a surrogate string to loop over its characters with a regex, mimicking traditional loops. Such regexes contain evaluation blocks (?{}) or (??{}). In these blocks we can manipulate array elements, and the indexing comes not from the traditional variables i, j, k (C-style or foreach loops), but rather from the regex variables for matching positions: pos, $-[0], $+[0] (see the perlvar entries for @- and @+).

    This short essay is a sister of Using regex as an alternative to usual loops on 1D data. The difference: there, instead of a surrogate string, the array elements are joined by a particular separator, which requires checking that the separator does not occur in the data.

    These ideas are offered for comparison purposes, in the spirit of TMTOWTDI; I do not expect the regex code to be faster, and its readability is hardly better.
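
    Anyone curious about the speed question can measure it with the core Benchmark module. The sketch below is only a harness: the sub names, the array size and the random data are my assumptions, and no timings are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

my @A = map { int rand 100 } 1 .. 1000;
my $acc;    # file-scoped accumulator shared by both variants

# Traditional loop over index pairs.
sub sum_for {
    $acc = 0;
    $acc += abs( $A[ $_ ] - $A[ $_ + 1 ] ) for 0 .. @A - 2;
    return $acc;
}

# Regex "loop" over a surrogate string, one character per index pair.
sub sum_regex {
    $acc = 0;
    ( ',' x ( @A - 1 ) ) =~ / . (?{ $acc += abs( $A[ $-[0] ] - $A[ $-[0] + 1 ] ); }) (*FAIL) /x;
    return $acc;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, { for_loop => \&sum_for, regex_loop => \&sum_regex } );
```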

    The target (surrogate) string is generated by:
    ',' x scalar @array;

    ...and the backbone of regex is:
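    m/.(?{ $array[ pos() - 1 ] = do_smth(); })(*FAIL)/;
    ...which we can expand. We match one or more characters and manipulate array elements, accessing them with $array[ pos() - 1 ], $array[ $-[0] ] or similar. (Inside the block, pos is already one past the character just matched by the dot, hence the - 1.)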
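
    As a minimal runnable instance of this backbone, the following sketch squares every element in place. The array contents and the squaring step are arbitrary choices for illustration. Since pos is one past the character just matched by the dot, the sketch indexes with pos() - 1.

```perl
use strict;
use warnings;

my @array = ( 1, 2, 3, 4 );

# One surrogate character per element. After the dot matches, pos() is one
# past that character, so pos() - 1 is the index of the current element.
( ',' x scalar @array ) =~ m/ . (?{ $array[ pos() - 1 ] **= 2 }) (*FAIL) /x;

print "@array\n";    # 1 4 9 16
```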

    Here is an example program. It calculates the sum of absolute differences between consecutive array elements:
    #!/usr/bin/perl -wl
    use strict;

    my @A = ( -5, 3, 1, -2 );
    my $acc = 0;

    for my $i ( 0 .. @A - 2 ){
        $acc += abs( $A[ $i ] - $A[ $i + 1 ] );
    }
    print $acc;

    $acc = 0;

    ( ',' x ( @A - 1 ) ) =~ /
        .
        (?{ $acc += abs( $A[ $-[0] ] - $A[ $-[0] + 1 ] ); })
        (*FAIL)
    /x;
    print $acc;
    OUTPUT:
    13
    13
    The next example is a 'triangle' loop (a loop within a loop). Here I use greedy .* to compare pairs of consecutive elements backwards, which acts as a bubble sort. The program starts with a traditional 'for' loop, followed by the alternative: a regex "loop" on a surrogate string.
    #!/usr/bin/perl -wl
    use strict;

    my @A = qw( d c b a );
    print "@A";

    for my $i ( 0 .. @A - 2 ){
        for my $j ( reverse $i .. @A - 2 ){
            print ' ' x $i . "<$A[ $j ]> cmp <$A[ $j + 1 ]>";
        }
    }

    print "-" x 5;

    ( ',' x ( @A - 1 ) ) =~ m/
        . .*
        (?{ print ' ' x $-[0] . "<$A[ $+[0] - 1 ]> cmp <$A[ $+[0] ]>"; })
        (*FAIL)
    /x;

    print "-" x 5;

    ( ',' x ( @A - 1 ) ) =~ m/
        . .*
        (?{
            $A[ $+[0] - 1 ] gt $A[ $+[0] ] and
                ( $A[ $+[0] - 1 ], $A[ $+[0] ] ) = reverse ( $A[ $+[0] - 1 ], $A[ $+[0] ] );
            print "--@A";
        })
        (*FAIL)
    /x;

    print "@A";
    OUTPUT:
    d c b a
    <b> cmp <a>
    <c> cmp <b>
    <d> cmp <c>
     <b> cmp <a>
     <c> cmp <b>
      <b> cmp <a>
    -----
    <b> cmp <a>
    <c> cmp <b>
    <d> cmp <c>
     <b> cmp <a>
     <c> cmp <b>
      <b> cmp <a>
    -----
    --d c a b
    --d a c b
    --a d c b
    --a d b c
    --a b d c
    --a b c d
    a b c d
    A similar example: selection sort (with non-greedy .*?, which scans in the forward direction):
    #!/usr/bin/perl -wl
    use strict;

    my @A = qw( d c b a );
    print "@A";

    for my $i ( 0 .. @A - 2 ){
        for my $j ( $i .. @A - 2 ){
            print ' ' x $i . "<$A[ $j ]> cmp <$A[ $j + 1 ]>";
        }
    }

    print "-" x 5;

    ( ',' x ( @A - 1 ) ) =~ m/
        . .*?
        (?{ print ' ' x $-[0] . "<$A[ $+[0] - 1 ]> cmp <$A[ $+[0] ]>"; })
        (*FAIL)
    /x;

    print "-" x 5;

    my $jmin;
    ( ',' x ( @A - 1 ) ) =~ m/
        .
        (?{ $jmin = $+[0] - 1; })
        .*?
        (?{ $A[ $jmin ] gt $A[ $+[0] ] and $jmin = $+[0]; })
        $ (?{
            $jmin != $-[0] and
                ( $A[ $-[0] ], $A[ $jmin ] ) = reverse ( $A[ $-[0] ], $A[ $jmin ] );
            print "--@A";
        })
        (*FAIL)
    /x;

    print "@A";
    OUTPUT:
    d c b a
    <d> cmp <c>
    <c> cmp <b>
    <b> cmp <a>
     <c> cmp <b>
     <b> cmp <a>
      <b> cmp <a>
    -----
    <d> cmp <c>
    <c> cmp <b>
    <b> cmp <a>
     <c> cmp <b>
     <b> cmp <a>
      <b> cmp <a>
    -----
    --a c b d
    --a b c d
    --a b c d
    a b c d
    The next example traverses the array by taking not 1 or 2 but 3 consecutive elements at a time. Moreover, it moves with a doubled step size. The first two sub-examples are written as ordinary C-style for and foreach loops, and the third is a regex. I use the (*SKIP) verb to skip some matching positions, in this case one. There is no manipulation here, only printing of array values.
    This example is analogous to the examples in the sister node.
    #!/usr/bin/perl -w
    use strict;

    my @A = ( 1 .. 3, 'abc', 'zz', 79, 444, 5 );

    for( my $i = 0; $i < @A - 2; $i += 2 ){
        print "[$A[ $i ]-$A[ $i + 1 ]-$A[ $i + 2 ]]";
    }
    print "\n";

    for my $i ( grep $_ % 2 == 0, 0 .. @A - 3 ){
        print "[$A[ $i ]-$A[ $i + 1 ]-$A[ $i + 2 ]]";
    }
    print "\n";

    ( ',' x ( @A - 1 ) ) =~ m/
        (,) (,)(*SKIP) (,)
        (?{ print "[$A[ $-[ 1 ] ]-$A[ $-[ 2 ] ]-$A[ $-[ 3 ] ]]" })
        # (?{ local $" = '-'; print "[@A[ @-[ 1 .. 3 ] ]]" }) # same output
        # (?{ local $" = '-'; print "[@A[ $-[ 0 ] .. ( pos ) - 1 ]]" }) # same output
        (*FAIL)
    /x;
    print "\n";
    OUTPUT:
    [1-2-3][3-abc-zz][zz-79-444]
    [1-2-3][3-abc-zz][zz-79-444]
    [1-2-3][3-abc-zz][zz-79-444]
    If we manipulate two or more elements, we can choose how many surrogate symbols to match. If we match only the first symbol, we use a single variable, e.g. $-[ 0 ], and the indexes of the consecutive elements are $-[ 0 ] + 1, $-[ 0 ] + 2, ... Otherwise we match all the symbols and take the indexes from the @- or @+ arrays.
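
    The two styles can be put side by side. The sketch below (array contents and variable names are my own, for illustration) computes the sums of every three consecutive elements both ways: once matching a single symbol and offsetting from $-[0], and once matching three symbols and reading the indexes from @-.

```perl
use strict;
use warnings;

my @A = ( 1, 2, 3, 4, 5 );
my ( @by_offset, @by_groups );

# Style 1: match only the first symbol; the other indexes are offsets
# from $-[0]. The surrogate string has one symbol per valid start index.
( ',' x ( @A - 2 ) ) =~ m/
    .
    (?{ push @by_offset, $A[ $-[0] ] + $A[ $-[0] + 1 ] + $A[ $-[0] + 2 ] })
    (*FAIL)
/x;

# Style 2: match one symbol per element and read each index from @-.
( ',' x scalar @A ) =~ m/
    (.) (.) (.)
    (?{ push @by_groups, $A[ $-[1] ] + $A[ $-[2] ] + $A[ $-[3] ] })
    (*FAIL)
/x;

print "@by_offset\n";    # 6 9 12
print "@by_groups\n";    # 6 9 12
```

    Note that the length of the surrogate string differs between the two styles: in the first it counts the valid starting indexes, in the second it counts the elements.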

    Thanks for reading!
