PerlMonks  

Meditations


If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding board: a place to post initial drafts of Perl tutorials, code modules, book reviews, articles, quizzes, etc., so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
Rosetta Code: Long List is Long
7 direct replies — Read more / Contribute
by eyepopslikeamosquito
on Nov 30, 2022 at 17:27

    I've long found it fun to implement the same algorithm in different languages, especially Perl and C++ ... and then sit back and reflect on the lessons learned ... so when Long list is long appeared recently, I felt it was short and interesting enough to make an excellent Rosetta code node.

    Solutions to this problem must read a number of input LLiL-format files (given as command line arguments) and write a single merged LLiL-format file to stdout. The LLiL-format is described in the comments at the top of llil.pl below.

    In the interests of keeping the code as short and fast as possible, you may assume the input LLiL files are well-formed. For example, you don't need to check for and remove leading and trailing whitespace on each line. The sample solutions given below in Perl and C++ should clarify program requirements.

    Please feel free to respond away with solutions to this problem in your favourite programming language and to offer suggested improvements to my sample Perl and C++ solutions below.

    Perl Solution

    Here's my Perl solution, heavily influenced by responses to Long list is long, especially kcott's concise and clear solution:

    # llil.pl
    # Example run:  perl llil.pl tt1.txt tt2.txt >oo1.tmp

    use strict;
    use warnings;

    # ----------------------------------------------------------------------
    # LLiL specification
    # ------------------
    # A LLiL-format file is a text file.
    # Each line consists of a lowercase name a TAB character and a non-negative integer count.
    # That is, each line must match : ^[a-z]+\t\d+$
    # For example, reading the LLiL-format files, tt1.txt containing:
    #    camel\t42
    #    pearl\t94
    #    dromedary\t69
    # and tt2.txt containing:
    #    camel\t8
    #    hello\t12345
    #    dromedary\t1
    # returns this hashref:
    #    $hash_ret{"camel"} = 50
    #    $hash_ret{"dromedary"} = 70
    #    $hash_ret{"hello"} = 12345
    #    $hash_ret{"pearl"} = 94
    # That is, values are added for items with the same key.
    #
    # To get the required LLiL text, you must sort the returned hashref
    # descending by value and insert a TAB separator:
    #    hello\t12345
    #    pearl\t94
    #    dromedary\t70
    #    camel\t50
    # To make testing via diff easier, we further sort ascending by name
    # for lines with the same value.
    # ----------------------------------------------------------------------

    # Function get_properties
    # Read a list of LLiL-format files
    # Return a reference to a hash of properties
    sub get_properties
    {
        my $files = shift;   # in: reference to a list of LLiL-format files
        my %hash_ret;        # out: reference to a hash of properties
        for my $fname ( @{$files} ) {
            open( my $fh, '<', $fname ) or die "error: open '$fname': $!";
            while (<$fh>) {
                chomp;
                my ($word, $count) = split /\t/;
                $hash_ret{$word} += $count;
            }
            close($fh) or die "error: close '$fname': $!";
        }
        return \%hash_ret;
    }

    # ----------------- mainline -------------------------------------------
    @ARGV or die "usage: $0 file...\n";
    my @llil_files = @ARGV;

    warn "llil start\n";
    my $tstart1 = time;
    my $href = get_properties( \@llil_files );
    my $tend1 = time;
    my $taken1 = $tend1 - $tstart1;
    warn "get_properties : $taken1 secs\n";

    my $tstart2 = time;
    for my $key ( sort { $href->{$b} <=> $href->{$a} || $a cmp $b } keys %{$href} ) {
        print "$key\t$href->{$key}\n";
    }
    my $tend2 = time;
    my $taken2 = $tend2 - $tstart2;
    my $taken = $tend2 - $tstart1;
    warn "sort + output : $taken2 secs\n";
    warn "total : $taken secs\n";

    What makes this problem interesting to me is the requirement to sort the hash in descending order by value:

    sort { $href->{$b} <=> $href->{$a} || $a cmp $b } keys %{$href}
    because the performance of such a sort may suffer when dealing with huge files (after all, performance was the reason for the OP's question in the first place).
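
    As a small illustration of that comparator, here is a minimal sketch that applies it to the merged counts from the tt1.txt/tt2.txt example in the LLiL specification. The helper name sorted_keys is an assumption made for this sketch and does not appear in llil.pl.

```perl
use strict;
use warnings;

# Merged counts from the tt1.txt/tt2.txt example in the LLiL specification.
my %count = ( camel => 50, dromedary => 70, hello => 12345, pearl => 94 );

# Hypothetical helper (not in llil.pl): keys ordered by descending
# count, with ties broken by ascending name.
sub sorted_keys {
    my ($href) = @_;
    return sort { $href->{$b} <=> $href->{$a} || $a cmp $b } keys %{$href};
}

print "$_\t$count{$_}\n" for sorted_keys( \%count );
```

    This prints hello, pearl, dromedary, camel in that order, matching the expected LLiL text in the specification. The tie-break only matters when two names share a count.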

    I'm hoping solving this problem in multiple languages will be fun and instructive -- and perhaps give us insight into how performance changes as the number of items increases.

Code brewing for the upcoming MCE 10 year anniversary
4 direct replies — Read more / Contribute
by marioroy
on Oct 23, 2022 at 09:12

    Greetings, all

    The following is a glimpse of what's coming for MCE. There are two new modules: MCE::Semaphore and MCE::Simple. I completed the code tonight. Now I need to finish the docs and do more testing before releasing on MetaCPAN.

    MCE Simple

    use MCE::Simple -strict, max_workers => 4;

    MCE::Simple->init(
        # mce options
        user_begin => sub {
            MCE->say("hello from ", MCE->wid);
        },

        # spawn options
        on_finish => sub {
            my ( $pid, $exit, $ident, $signal, $error, @ret ) = @_;
            say "@_";
        },
    );

    mce_foreach my $i ( 1..10 ) {
        MCE->say(MCE->wid, ": $i * 2 = ", $i * 2);
    }

    spawn "Hello", sub { "one" };
    spawn "There", sub { "two" };

    foreach my $ident (qw/foo baz/) {
        spawn $ident, sub {
            my $text = "something from $$";
        };
    }

    sync;

    # clear or set options
    MCE::Simple->init();

    sub fib {
        my $n = shift;
        return $n if $n < 2;

        spawn my $x = fib($n - 1);
        spawn my $y = fib($n - 2);

        sync $x;
        sync $y;

        return $x + $y;
    }

    say "fib(20) = ", fib(20);

    Output

    $ perl demo.pl
    hello from 1
    hello from 2
    hello from 3
    hello from 4
    3: 2 * 2 = 4
    4: 1 * 2 = 2
    2: 3 * 2 = 6
    3: 5 * 2 = 10
    4: 6 * 2 = 12
    1: 4 * 2 = 8
    4: 7 * 2 = 14
    3: 8 * 2 = 16
    2: 9 * 2 = 18
    1: 10 * 2 = 20
    10792 0 Hello 0 one
    10793 0 There 0 two
    10794 0 foo 0 something from 10794
    10795 0 baz 0 something from 10795
    fib(20) = 6765

    MCE Semaphore

    A fast pure-Perl implementation. The test below uses a large iteration count. Notice the mce_foreach_s keyword, for processing a sequence of numbers.

    use MCE::Simple -strict, max_workers => 16;
    use MCE::Semaphore;
    use Time::HiRes 'time';

    my $start = time;
    my $sem = MCE::Semaphore->new(8);

    mce_foreach_s ( 1 .. 1_000_000 ) {
        $sem->down;
        $sem->up;
    }

    printf "%0.3f seconds\n", time - $start;

    Input file

    mce_foreach_f ("/path/to/file") {
        MCE->print($_);
    }

    # what about file handles? no problem...
    open my $fh, "<", "/tmp/infile.txt" or die "open error: $!";

    mce_foreach_f my $line ( $fh ) {
        MCE->print($line);
    }

    close $fh;
How not to implement updaters
2 direct replies — Read more / Contribute
by afoken
on Sep 30, 2022 at 17:25

    Every two weeks, I switch from embedded developer to network and server administrator to keep our network and servers at work up and running. Today, updating our issue/requirement/test tracking software was on the plan. We have four virtual machines, each running one instance of the software. I won't state its name, and I will neither confirm nor deny any guess. But let's say the manufacturer has recently demonstrated that their idea of forcing their clients to use the cloud variant of their software instead of local servers might not be the best idea. Users don't like having years of work deleted from the cloud servers, without a way to undo that quickly and completely.

    Experience from previous updates has taught me to make a full backup of the entire VM before updating. So the day started with shutting down all four VMs and creating copies of their harddisk image files. Just to be sure. The VMs are relatively small, just a bare-bones installation of Debian plus a database plus the bugtracker software, so that four extra copies of the HDD images don't matter much.

    I planned the entire day for the update, expecting some trouble with the first VM to learn about the new issues during the update, and then be able to update the three other VMs much faster, knowing what issues to expect. So I was absolutely not surprised that the first update went bonkers.

    Act One

    The update installer did create some zip files of the existing installation (don't hope to be able to recover from a broken update using those zip files), then removed the entire old version of the bugtracker software and unpacked the new version. "Do you want me to overwrite some.freaking.dll in the program directory?" Sure, why not? If the installer wants to overwrite what was unpacked seconds ago, let it do so. I have a good HDD image. A few moments later, it started the web server and pointed me to http://localhost:someport/. No, that web interface does not work in lynx or links, we are running a server, not a point-and-shoot adventure game. But the web server is really listening on all interfaces, so I can connect using Firefox on my PC. After several minutes of the old "don't blink, you might miss the progress bar moving another pixel" game, the browser shows the well-known "oops, something went wrong" page.

    "We can't talk to the database." Well, the old version could. The old version had a database config file stating that we use a really exotic database. You probably never heard of it. It is called MySQL. Right out of the Debian package (so it is actually MariaDB). After some clicking on the error page, you end at a wiki page of the manufacturer, which tells you to download a MySQL driver from a third party page. Yes, I really know that issue, and I should have thought about it, because it happened with every single update so far. It must be incredibly hard to parse the database config file from the updater and instruct the admin right from the updater to download and install that driver BEFORE playing the waiting game. And it must be absolutely impossible just to bundle the driver like the tons of other crap that come with the software.

    So, copy the driver file (it really is just a single file!) to /opt/crap/crap/crap/lib/, restart the server, play the waiting game again. "Oops, something went wrong." Yes, sure. "We can't detect the database version." I could not care less. "We just discarded your old startup configuration, here is a link to our wiki how to fix that." Oh well, it's just fine-tuning of how much memory the bugtracker wastes. Defaults are fine for now. "There is an expired license installed, you are only allowed to update to versions that were released before that license expired." What? "Click here to buy a new license, click here to enter the new license code." There is no way to bypass that.

    I share the administration of the bugtracker with a coworker. She does the high-level stuff (workflow, addons and so on), I care about OS, database, network, backup, and basic installation. She told me that we don't actually use that license. The license is not for the bugtracker itself, it is for a component that wasn't even installed in the old version. The expired license is just garbage data, we don't use that component, we don't need that component. It once was installed, but nobody bothered to delete the license code.

    To make matters worse, there is no way to delete the expired license, or just tell the installer that we are willing not to be able to use the unlicensed component. At this point, you can either pay a lot of money to renew a license for a component that you don't want and don't need, just to get past that error screen, or shut down the VM and copy the backup copy of the HDD image over the actual HDD image. I did the latter.

    Act Two

    Restart the VM, remove that left-over license code, redo the update installation, this time copying the database driver before starting the webserver. "You were updated". No, the updater managed to do its job of updating the bugtracker. "Oh, and by the way, we are just rebuilding our search index. Because, you know, we can't search in the database." Actually, the last sentence was not displayed. But you have to wait until the index rebuild job has finished before you can continue.

    Well, the updater did not manage to do its job. "There are these 20+ apps that won't work for whatever reason." Good, let's see if the bugtracker does work at all. The personalized overview page displays fine, but where is the navigation bar? It's gone. You can't log out. You can't gain admin privileges. You can't navigate anywhere. Let's open an existing issue. "500 Internal Server Error - click here to see a long, useless stack trace and a random number that will identify this problem". Some other attempts at navigating elsewhere also ended in that 500 page. Well, that did not go well.

    Half a day has passed, and we just managed to kill the first bugtracker VM twice. Or, to be precise, watch it commit suicide. Guess what? Shutdown, copy the backup once more over the actual HDD image, and retry a third time.

    Act Three

    My coworker thought that one of the many add-ons that she installed might be responsible for the trouble. (I don't know why we need 20+ addons, we use the core functions, plus an add-on for requirements, plus one add-on for tests, plus one add-on for making the search function work properly.) So she decided to clean up the mess, uninstall everything not needed, including that expired license.

    It turns out that not everything uninstalled cleanly. "Something went wrong that you don't need to know. But if you really want to know, here is a link to an assistant that will tell you that we wrote some stack traces to one of the many log files." A 16 MByte log file. 390 kByte of which were created during the hour or so she tried to get rid of some garbage.

    Well, shut down the VM, make a second copy of the HDD image just to have a slightly cleaner state to work from. Redo the update, again copying the database driver. After the waiting game, I'm greeted by the same "You were updated" screen, and only three add-ons are inoperable now. A few clicks later, I once again get the overview page. Almost any click gets me either to a much uglier 500 page than before, or to the pretty 500 page. "Click here to download an archive with all relevant data you can mail to our support." Click - "500 Internal Server Error". Yes, you can't even download the crash report archive.

    VM suicide number three. After a short discussion, we decided to roll back to the very first backup I created in the morning. Copy the backup once again over the actual HDD image, start the VM again. Nearly eight hours have passed. We did not even try to update the three other VMs, we just started them in their old state.

    Epilog

    We wasted an entire day trying to update the software. It should be so simple. Run the update installer, add the database driver that the manufacturer does not bundle, watch the system update itself, run the new version. Or, if something is critical and might cause trouble, get a good error message from the update installer BEFORE f-ing up the entire system.

    It is possible. I know it, because my main job is software development. It takes testing, and during testing and development, you (as the developer) expect things to go horribly wrong. That's why VMs are so great. One click and you are back to a known state that you can fail to update again, and again, and again, until the updater just works or stops before damaging the system. With embedded systems, reverting to a known state is not always that easy, but even there, it is possible to make updates just work or abort before things go wrong.

    I don't really want to know why the updater managed to kill our system three times, I just want it to do its job. Luckily, I'm on vacation now, and my coworker will contact the manufacturer of this crappy software. After my vacation, we will see how far she got.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
CSS mods for the new metacpan layout
1 direct reply — Read more / Contribute
by hippo
on Sep 30, 2022 at 11:14

    You've probably seen the new styling of MetaCPAN by now. One anonymous monk is less than enthralled. It's not all bad, IMHO, and will hopefully improve over time. Meanwhile, here is the little snippet of userContent.css which I've put together today to restore a little sanity.

    Update 6th Oct 2022 with a fix for the search results and similar one-pane pages:

    @-moz-document url-prefix(https://metacpan.org/) {
        div.page-content { grid-template-columns: 200px calc(100vw - 200px) !important; }
        .no-sidebar div.page-content { grid-template-columns: 1fr !important; }
        ul.nav-list { padding: 10px !important; width: 200px !important }
        ul.nav-list>li a, ul.dependencies>li>a { color: #337ab7 !important }
        div.content { padding: 20px !important }
        #index-container { margin-left: 20px !important }
    }

    Original attempt was:

    @-moz-document url-prefix(https://metacpan.org/) {
        div.page-content { grid-template-columns: 200px calc(100vw - 200px) !important; }
        ul.nav-list { padding: 10px !important; width: 200px !important }
        ul.nav-list>li a, ul.dependencies>li>a { color: #337ab7 !important }
        div.content { padding: 20px !important }
        #index-container { margin-left: 20px !important }
    }

    This will:

    • Reduce the left nav from 300 to 200 pixels in width and reduce the padding on the main content so there is not so much wasted space (Don't ask me why they've gone with a fixed pixel width here to begin with)
    • Re-enable the different styling (colour) of links in the nav so you can tell what is a link and what is just info once again

    We'll see how much this needs tweaking over the next little while but at least if you are interested in this it saves us all reverse-engineering it independently.

    The new layout proposal and discussion is in the issues here.


    🦛

The new black metacpan (meta::cpan throws away brand)
No replies — Read more | Post response
by Anonymous Monk
on Sep 30, 2022 at 06:15
LWP::UserAgent Client-Warning 500 against HTTP standards?
4 direct replies — Read more / Contribute
by Discipulus
on Sep 30, 2022 at 03:35
    Hello community,

    Our halls being so quiet these days, I'm lazily inviting you to meditate on LWP::UserAgent's behaviour of returning 500 when LWP can't connect to some URL or when other failures in protocol handlers occur.

    Does this break the HTTP specification? Whether or not you have ever glanced at the current RFC, you should know that all 5** status codes are server side.

    The LWP documentation is very clear on this:

    > There will still be a response object returned when LWP can't connect to the server specified in the URL or when other failures in protocol handlers occur. These internal responses use the standard HTTP status codes, so the responses can't be differentiated by testing the response status code alone. Error responses that LWP generates internally will have the "Client-Warning" header set to the value "Internal response". If you need to differentiate these internal responses from responses that a remote server actually generates, you need to test this header value.

    In fact...

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new();

    for my $url ( qw( https://perlmonks.org https://perlmonks.roma.it) ){
        print "\nGET $url\n";
        my $res = $ua->get( $url );
        # ..yes you can $res->status_line to have both combined
        print "code :\t", $res->code, "\n";
        print "message :\t", $res->message, "\n";
        print "Client-Warning header:\t", $res->header( "Client-Warning" ), "\n";
    }

    __END__

    GET https://perlmonks.org
    code : 200
    message : OK
    Client-Warning header:

    GET https://perlmonks.roma.it
    code : 500
    message : Can't connect to perlmonks.roma.it:443
    Client-Warning header: Internal response
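
    The header test described in the documentation excerpt can be wrapped in a small predicate. The following is a minimal sketch, not LWP's own API: the name is_internal_response is an assumption, and the two responses are built by hand with HTTP::Response so that the example runs without network access.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: true if the response carries the Client-Warning
# header that LWP sets on responses it generates internally.
sub is_internal_response {
    my ($res) = @_;
    my $warning = $res->header('Client-Warning') // '';
    return $warning eq 'Internal response' ? 1 : 0;
}

# Hand-built stand-ins for what $ua->get() might return.
my $from_server = HTTP::Response->new( 500, 'Internal Server Error' );
my $from_lwp    = HTTP::Response->new(
    500, "Can't connect to perlmonks.roma.it:443",
    [ 'Client-Warning' => 'Internal response' ],
);

for my $res ( $from_server, $from_lwp ) {
    printf "%s => %s\n", $res->status_line,
        is_internal_response($res) ? 'generated by LWP' : 'sent by a server';
}
```

    Both responses carry status 500, so only the header tells them apart, which is exactly the ambiguity under discussion.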

    The message returned is already very clear: "Can't connect.." is obviously client side. So why choose an error of the 5** class?

    In the chat, LanX suggested 418 I'm a teapot, which is fun and new to me, but not usable: teapots are reserved by IANA :)

    The 4** class defines status codes 401-418 plus 421, 422 and 426, so there is room for something like: 419 - Can't connect

    See also other status numbers used to craft a HTTP::Response

    So (and I don't want to blame the LWP authors) why did they choose to return 500, setting a header internally to disambiguate it?

    What do other frameworks do? Quickly trying Mojo::UserAgent, I see it uses its own Mojo::Message::Response and does not return any status code for nonexistent URLs:

    use strict;
    use warnings;
    use Mojo::UserAgent;

    my $ua = Mojo::UserAgent->new;

    for my $url ( qw( https://perlmonks.org https://perlmonks.roma.it) ){
        print "\nGET $url\n";
        my $res = $ua->get( $url )->result;
        print "code :\t", $res->code, "\n";
        print "message :\t", $res->message, "\n";
        #print "Client-Warning header:\t", $res->header( "Client-Warning" ), "\n";
    }

    __END__

    GET https://perlmonks.org
    code : 200
    message : OK

    GET https://perlmonks.roma.it
    Can't connect: Host unknown. at testLWP500.pl line 10.

    ..and this error is defined in Mojo::IOLoop::Client. It seems to me a better design, but... wait, this is a die behaviour! If you switch the URLs in the above code, you never reach the second GET.

    On the other hand, curl tells us it is unable to resolve the host:

    curl -I https://perlmonks.roma.it
    curl: (6) Could not resolve host: perlmonks.roma.it

    ..and it is right.

    What do you think about this? What do other frameworks that I missed do?

    Is 200 if you post 203 but no 204 will be accepted! :)

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Types, objects, and systems, oh my!
No replies — Read more | Post response
by awncorp
on Sep 19, 2022 at 17:01
What if Perl had an OO standard library?
8 direct replies — Read more / Contribute
by awncorp
on Aug 23, 2022 at 07:39

    Programming in Perl is choices all the way down. An OO standard library would make it a lot easier to write Perl in a way that avoids having to come up with similar or the same solutions to common computing tasks, but, ... sustained object-orientation in Perl is difficult because the concept and mechanisms were bolted onto the language as an afterthought, and because it's optional, so one has to oscillate between this paradigm and others, i.e. some things are objects, most things are not, so when using Perl you have to constantly reaffirm your object-orientation. What do you think?

    http://blogs.perl.org/users/al_newkirk/2022/08/introducing-venus-a-new-world-for-perl-5.html

    "I am inevitable." - Thanos
Problems with String::ShellQuote
3 direct replies — Read more / Contribute
by afoken
on Aug 18, 2022 at 13:53

    I have bashed String::ShellQuote several times:

    Most times, it was because the module promises to solve a problem that simply disappears completely if you avoid the shell. See The problem of "the" default shell.

    Now, ovedpo15 came up with a problem that looks like a good reason to have a module like String::ShellQuote, and choroba proposed String::ShellQuote.

    The problem boils down to generating a shell script from Perl that will be run by a different user, perhaps on a different computer:

    My Perl utility generates a bash script that consists of mkdir/rsync/cp commands. This bash script is later used by users (this means that I don't want to actually run those commands when my utility runs, rather just to generate the script).

    And, in an answer to a previous bashing, ikegami stated:

    You seem to allege some problem with shell_quote, but none of the linked post identify one. The all seemed centered around the idea of avoiding the shell is better. While true, that's not a problem with shell_quote.

    So, let's have a look at the source code of String::ShellQuote version 1.04, dated 2010-06-11.


    The module clearly states in "Bugs" that ...

    Only Bourne shell quoting is supported.

    Bourne is a large family of shells, but not every shell is a bourne shell. Also, not every default shell is a bourne shell. See https://www.in-ulm.de/~mascheck/various/shells/. Quite obviously, neither command.com from DOS and Windows nor cmd.exe from Windows are even vaguely similar to a bourne shell. The Almquist shell variants are very similar to bourne, but not exactly: https://www.in-ulm.de/~mascheck/various/ash/ The korn shells obviously aren't bourne shells, either.

    So, as stated by the author, you should not expect the result values of the various functions to be compatible with anything but bourne shells.


    With that out of the way, let's assume some bourne shell.

    A 7th edition Bourne shell surely is a bourne shell, right?

    There is a script that tries to find the version of your bourne compatible shell: https://www.in-ulm.de/~mascheck/various/whatshell/whatshell.sh.html. Did you notice something? There is also a commented version of that script at https://www.in-ulm.de/~mascheck/various/whatshell/whatshell.sh.comments.html. The very first explaining comment is this:

    : '7th edition Bourne shell aka the V7 shell did not know # as comment sign, yet.'
    : 'Workaround: the argument to the : null command can be considered a comment,'
    : 'protect it, because the shell would have to parse it otherwise.'

    So, shell_comment_quote() would do better to use the null command followed by a single-quoted string, so that the output also works with the 7th edition Bourne shell.

    This is the documentation:

    shell_comment_quote quotes the string so that it can safely be included in a shell-style comment (the current algorithm is that a sharp character is placed after any newlines in the string).

    And this is the code:

    sub shell_comment_quote {
        return '' unless @_;
        unless (@_ == 1) {
            croak "Too many arguments to shell_comment_quote "
                . "(got " . @_ . " expected 1)";
        }
        local $_ = shift;
        s/\n/\n#/g;
        return $_;
    }

    It does what is documented, but not every bourne shell will accept the output as comment. Oops #1.
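
    To make the null-command idea concrete, here is a minimal sketch of what such a comment quoter might look like. The function name shell_null_comment is hypothetical and not part of String::ShellQuote. It assumes that one ": '...'" line per input line is acceptable, and that single quotes inside the text are escaped with the usual '\'' sequence.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of String::ShellQuote: turn a text into
# shell "comments" built from the : null command and a single-quoted
# argument, one line of output per line of input.
sub shell_null_comment {
    my ($text) = @_;
    my @lines = split /\n/, $text, -1;
    return join "\n", map {
        ( my $quoted = $_ ) =~ s/'/'\\''/g;    # ' -> '\''
        ": '$quoted'";
    } @lines;
}

print shell_null_comment("don't panic\nsecond line"), "\n";
```

    Unlike a # comment, the argument is still parsed by the shell, so the single-quoting is what keeps it inert.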


    There are two similar functions wrapping the quoting backend function:

    sub shell_quote {
        my ($rerr, $s) = _shell_quote_backend @_;

        if (@$rerr) {
            my %seen;
            @$rerr = grep { !$seen{$_}++ } @$rerr;
            my $s = join '', map { "shell_quote(): $_\n" } @$rerr;
            chomp $s;
            croak $s;
        }
        return $s;
    }

    and

    sub shell_quote_best_effort {
        my ($rerr, $s) = _shell_quote_backend @_;

        return $s;
    }

    The backend function returns a reference to an error array and the quoted string. shell_quote() removes repeated error messages, and finally croak()s. The only reason for this overhead I can think of is to get a list of all errors at once instead of getting just the first error. shell_quote_best_effort() just ignores all errors and returns whatever survived the backend function. If errors occurred, that may be plain wrong. At least, this behaviour is documented:

    This is like shell_quote, excpet [sic!] if the string can't be safely quoted it does the best it can and returns the result, instead of dying.

    Now, what errors may be returned by the backend function?

    sub _shell_quote_backend {
        my @in = @_;
        my @err = ();

        # ...

        return \@err, '' unless @in;

        # ...

        if (s/\x00//g) {
            push @err, "No way to quote string containing null (\\000) bytes";
        }

        # ...

        return \@err, $ret;
    }

    Yes, that's all. There is only one possible error. It does not like ASCII NUL, because ASCII NUL can not be passed as argument to programs. And because it does not like them, they are simply removed.

    Whenever shell_quote() throws an error, at least one of its arguments contained at least one NUL character. shell_quote_best_effort(), in the same situation, just silently damages your data. Oops #2.

    In all other cases, shell_quote_best_effort() behaves exactly like shell_quote().
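
    The difference between stripping and rejecting NUL bytes can be shown in a few lines. This is a minimal sketch, not the module's code: strip_nul mimics the NUL handling quoted above for a single string, and reject_nul is a hypothetical stricter variant that dies instead of altering the data.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Mimics the NUL handling quoted above for one string:
# NUL bytes are removed and a single error message is recorded.
sub strip_nul {
    my ($s) = @_;
    my @err;
    push @err, "No way to quote string containing null (\\000) bytes"
        if $s =~ s/\x00//g;
    return ( \@err, $s );
}

# Hypothetical stricter variant: refuse the input instead of changing it.
sub reject_nul {
    my ($s) = @_;
    croak "argument contains a NUL byte" if $s =~ /\x00/;
    return $s;
}

my ( $err, $out ) = strip_nul("foo\x00bar");
printf "stripped to '%s', %d error recorded\n", $out, scalar @$err;

my $survived = eval { reject_nul("foo\x00bar"); 1 };
print "strict variant ", ( $survived ? "accepted" : "rejected" ), " the input\n";
```

    A caller that ignores the error list, as shell_quote_best_effort() does, ends up with "foobar" and no hint that the input was changed.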


    Now, let's look at the quoting:

    shell_quote(), which calls _shell_quote_backend(), is documented as follows:

    shell_quote quotes strings so they can be passed through the shell. Each string is quoted so that the shell will pass it along as a single argument and without further interpretation. If no strings are given an empty string is returned.

    This is the code:

    sub _shell_quote_backend {
        my @in = @_;
        my @err = ();

        if (0) {
            require RS::Handy;
            print RS::Handy::data_dump(\@in);
        }

        return \@err, '' unless @in;

        my $ret = '';
        my $saw_non_equal = 0;
        foreach (@in) {
            if (!defined $_ or $_ eq '') {
                $_ = "''";
                next;
            }

            if (s/\x00//g) {
                push @err, "No way to quote string containing null (\\000) bytes";
            }

            my $escape = 0;

            # = needs quoting when it's the first element (or part of a
            # series of such elements), as in command position it's a
            # program-local environment setting

            if (/=/) {
                if (!$saw_non_equal) {
                    $escape = 1;
                }
            }
            else {
                $saw_non_equal = 1;
            }

            if (m|[^\w!%+,\-./:=@^]|) {
                $escape = 1;
            }

            if ($escape || (!$saw_non_equal && /=/)) {

                # ' -> '\''
                s/'/'\\''/g;

                # make multiple ' in a row look simpler
                # '\'''\'''\'' -> '"'''"'
                s|((?:'\\''){2,})|q{'"} . (q{'} x (length($1) / 4)) . q{"'}|ge;

                $_ = "'$_'";
                s/^''//;
                s/''$//;
            }
        }
        continue {
            $ret .= "$_ ";
        }

        chop $ret;
        return \@err, $ret;
    }

    Right at the start of the foreach loop, an undefined parameter is treated like an empty string: both are simply replaced by ''. Note that next jumps to the continue block at the end of the foreach loop. Personally, I would not accept an undefined value, because probably something went wrong in the caller if we get undefined parameters.

    Following that, NUL characters are killed, and data is damaged at the same time. See above. Personally, I would throw an error right here, NUL characters are a sure sign that something went wrong in the caller, and it makes no sense to continue.

    The next step following the first comment in the foreach loop is explained in that comment. A feature rarely known to beginners is that you can set environment variables for just the invoked program by prefixing the program with a list of key-value pairs. FOO=1 BAR=2 baz answer=42 invokes baz with the environment variables FOO and BAR set to 1 and 2, and a single argument answer=42. If you want to invoke a program named FOO=1 instead, and pass it the arguments BAR=2, baz, and answer=42, you need to quote at least the first equal sign.

    The flag variables in the first if-then-else: $escape is reset for each parameter, $saw_non_equal is set as soon as a parameter does not contain an equal sign, and stays set. If an equal sign is found, and all previous parameters (if any) also contained equal signs, $escape is set, which forces quoting. This is not strictly needed: If the first parameter contains an equal sign and is quoted, it is taken as program name, and everything following will be read as arguments. So it would be sufficient to check the first parameter for an equal sign. On the other hand, it also does not hurt to quote every string that contains an equal sign, and it would make the code much simpler.
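
    The two rules can be compared side by side. This is a minimal sketch of the equal-sign logic only: flag_equals_stateful reproduces the flag handling from the backend quoted above, while flag_equals_simple is a hypothetical implementation of the simpler "quote every string containing an equal sign" alternative. Both function names are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Stateful rule as in the quoted backend: a string containing '=' is
# flagged only as long as every earlier string also contained '='.
sub flag_equals_stateful {
    my $saw_non_equal = 0;
    return map {
        my $escape = 0;
        if (/=/) { $escape = 1 unless $saw_non_equal }
        else     { $saw_non_equal = 1 }
        $escape;
    } @_;
}

# Hypothetical simpler rule: flag every string containing '='.
sub flag_equals_simple {
    return map { /=/ ? 1 : 0 } @_;
}

my @args = ( 'FOO=1', 'BAR=2', 'baz', 'answer=42' );
print "stateful: @{[ flag_equals_stateful(@args) ]}\n";
print "simple:   @{[ flag_equals_simple(@args) ]}\n";
```

    For the FOO=1 BAR=2 baz answer=42 example, the stateful rule flags only the first two strings, while the simple rule also flags answer=42. The extra quotes are harmless, and the state variable disappears.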

    The whitelist matching if: If the parameter contains a character that is (quoting the output of YAPE::Regex::Explain) any character except: word characters (a-z, A-Z, 0-9, _), '!', '%', '+', ',', '\-', '.', '/', ':', '=', '@', '^', the $escape flag is set. The intention seems to be to avoid quoting if not strictly needed. I'm not sure if all of those characters in the whitelist are harmless. At least in bash (which is a bourne shell), at least the '!' does have a special meaning in the first position:

    >bash --version
    GNU bash, version 4.3.48(1)-release (x86_64-slackware-linux-gnu)
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

    This is free software; you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    >echo with some arguments !
    with some arguments !
    >foo with some arguments !
    -bash: foo: command not found
    >'!' foo
    -bash: !: command not found
    >! foo
    -bash: foo: command not found
    >

    Note: the last example fails to find the "foo" command, not the "!" command. So "!" had better not be in that whitelist. Oops #3.
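A whitelist without the '!' could be sketched as follows. The character class is the one quoted above from YAPE::Regex::Explain with only '!' removed; needs_quoting is a hypothetical name, and the sketch makes no claim that the remaining characters are safe in every shell.

```perl
use strict;
use warnings;

# Whitelist as quoted above, minus '!'. Anything outside this set forces
# quoting. (Assumption: the remaining characters are left as they were.)
my $unsafe = qr{[^\w%+,\-./:=\@^]};

# Hypothetical helper: true if the parameter must be quoted.
sub needs_quoting {
    my ($param) = @_;
    return ($param =~ $unsafe) ? 1 : 0;
}

for my $p ('foo', '!', 'a b', 'FOO=1') {
    printf "%-6s %s\n", $p, needs_quoting($p) ? 'quote' : 'bare';
}
```

With this class, a lone ! is quoted like any other special character, so bash no longer sees it as the negation keyword.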

    The last if in the foreach loop: You want to escape if the $escape flag is set. Sure. But you also want to escape if the $saw_non_equal flag is not set, i.e. all previous parameters, if any, contained an equal sign, and at the same time, the current parameter also contains an equal sign. Do you remember this condition? A few lines above, the $escape flag was already set, depending on exactly this condition. This second condition is completely redundant. Belts and braces, or lost in code?


    The escaping: Single quotes are replaced with the sequence '\'', which ends a single-quoted string, then adds a single quote (quoted by the backslash), and finally begins a new single-quoted string. Ignore the next, long substitution for now. $_ = "'$_'"; puts the partly-escaped string in a pair of single quotes. The next two substitutions s/^''//; and s/''$//; remove a leading or trailing empty single-quoted string, respectively. This may happen if the original parameter begins or ends with a single quote.

    The long substitution replaces a sequence of at least two escaped single quotes ('\'') by '", followed by a bare single quote for each original single quote, followed by "'. This works almost like '\'', ending a single-quoted string, then adding a double-quoted string of single quotes, and finally starting a new single-quoted string. For an original parameter of two single quotes, this finally results in "''" instead of \'\'; with every further single quote, the double-quoted string will be shorter than the backslashed string ("'''" instead of \'\'\').
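The basic escaping steps, without the long substitution, can be sketched like this. quote_single is a hypothetical name; the explicit handling of the empty string is an assumption added here so that the two trimming substitutions cannot reduce it to nothing.

```perl
use strict;
use warnings;

# Minimal sketch of the '\'' technique described above. The optimisation
# that turns runs of quotes into a double-quoted string is left out.
sub quote_single {
    my ($s) = @_;
    return "''" if $s eq '';          # assumption: empty string stays visible
    (my $q = $s) =~ s/'/'\\''/g;      # ' becomes '\''
    $q = "'$q'";                      # wrap in single quotes
    $q =~ s/^''//;                    # drop a leading empty quoted string
    $q =~ s/''$//;                    # drop a trailing empty quoted string
    return $q;
}

print quote_single("it's"), "\n";     # 'it'\''s'
print quote_single("'a"),   "\n";     # \''a'
```

The second call shows why the trimming exists: without it, the result would start with a pointless empty pair of quotes.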


    Joining the quoted strings: The foreach loop replaces the elements of @in with the quoted string in $_ ($_ is aliased to each element of @in). The continue block appends each quoted string and a space to $ret. Finally, chop $ret removes the last space. Is a simple join(' ',@in) too obvious?
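For comparison, here are both ways of building the result string, as a small sketch with hypothetical sub names. The first mirrors the append-and-chop approach described above, the second is the plain join.

```perl
use strict;
use warnings;

# Append each element plus a space, then chop the last space off again.
sub join_with_chop {
    my @quoted = @_;
    my $ret = '';
    $ret .= "$_ " for @quoted;
    chop $ret;
    return $ret;
}

# The same result in one step.
sub join_simple {
    return join ' ', @_;
}

my @quoted = ("'a b'", 'c', "'d e'");
print join_with_chop(@quoted), "\n";
print join_simple(@quoted),    "\n";
```

Both subs return the same string for any list, including the empty list.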


    Combining Oops #3 and the suppressed quoting of equal signs:

    >bash --version
    GNU bash, version 4.3.48(1)-release (x86_64-slackware-linux-gnu)
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

    This is free software; you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    >perl -MString::ShellQuote=shell_quote -E 'say shell_quote("!","FOO=BAR","baz")'
    ! FOO=BAR baz
    >! FOO=BAR baz
    -bash: baz: command not found
    >'!' FOO=BAR baz
    -bash: !: command not found
    >! 'FOO=BAR' baz
    -bash: FOO=BAR: command not found
    >

    In ! FOO=BAR baz, bash treats baz as the executable, as indicated by the error message, and FOO=BAR as extra environment for the executable.

    In shell_quote("!","FOO=BAR","baz"), ! should be the executable, simply because it is the first argument. Oops #3 prevents it from being quoted. Because the first parameter to shell_quote() does not contain an equal sign, escaping of equal signs is disabled for the remaining parameters. Oops #4.


    Summary:

    String::ShellQuote does not even work properly for bourne shells.

    Oops #1: Assumes every bourne shell accepts # for comments. Most of them do, but the ancient V7 bourne shell does not. Oh well. Document it as limitation and ship it.

    Oops #2: Silent damaging of data. IMHO not acceptable.

    Oops #3: Not quoting a character that will be interpreted by at least one bourne shell (bash) if not quoted. IMHO not acceptable.

    Oops #4: Oops #3 may cause more missing quoting due to an overly complex escaping decision. IMHO not acceptable.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
My Perl journey begins
7 direct replies — Read more / Contribute
by oldB51
on Aug 17, 2022 at 19:53

    My Perl journey began 48 hours ago. My Mac now hosts v 5.36.0 in HOME/localperl. I have discovered cpanm and used it to effortlessly install perltidy and perlcritic into HOME/perl5. What I thought was an easy system halted with attempts to install Padre and ptkdb debugger. Both installations appeared to go well with countless OKs on the way…then at the last test fail.

    I think I’m right in saying neither will in fact install on 64bit Macs. But - if this is the case - why does the installation begin? Surely cpanm knows what system it is trying to install into and should stop the process immediately with a polite message.

    Padre is unlikely to be a loss - I now have vscode set up for perl and it recognises v 5.36.0, perltidy and perlcritic. Red squiggles appear when I forget - so far deliberately - to end a line with a ‘;’. It is likely that my debug tactic will be to print variables at various stages until the problem is found. This is usually easier than a formal debugger anyway.

    The next stage of my journey will be working through Beginning Perl and Beginner Perl Maven. So far I’ve only dipped into them. The associated vids on the Perl Maven course are excellent introductions.

    I suspect I will soon be seeking advice from Perl Monks - so many thanks in advance.

Reading Perl documentation in Emacs
2 direct replies — Read more / Contribute
by haj
on Aug 08, 2022 at 13:53

    I recently got round to packaging an Emacs command which I have been using for some time now: <M-x> perl-doc. It is a viewer for POD in Emacs, and it has been accepted in GNU ELPA, so it can be installed with <M-x> package-install.

    But why? There are already plenty of ways to read POD. They work (sort of), but I was not totally happy with any of them:
    • The perldoc command in a shell is nice. It knows what I've installed in my Perl, and I can add my own projects to its search path by adding to PERL5LIB. It is not so useful with documents like the Moose::Manual which contain many cross-references.
    • Pod::Webserver is a nice way to get the same information in your browser. But it needs you to run two extra programs and does not provide an equivalent of perldoc -f split.
    • Pod::Perldoc::ToTk is supposed to display POD in a GUI, but the command in the synopsis perldoc -o tk Some::Modulename & fails with Undefined subroutine &Pod::Perldoc::ToTk::MainLoop called. I have Tk::Pod and Pod::Perldoc installed and don't want to chase that error.
    • Emacs has <M-x>cperl-perldoc which per default tries to get information for the thing where your cursor is, and displays the document in another window. But internally it uses man which isn't available on Windows, and until now I was too lazy to install any of the replacements (man for Windows, or woman.el). Also, it needs man pages installed, so I need to build those when I want to read documentation from sources I'm working on.
    • Emacs::PodMode (available via CPAN, not ELPA) is targeted for writing POD, it shows all the markup.
    • The Perl menu in CPerl mode still refers to Perl documentation in info format, which is no longer shipped with Perl and never was available for CPAN modules. Eventually these items should be deleted from the menu.
    • Edited to add (2022-08-09): In the meantime I found perl-pod-preview.el which also provides a man formatted view of POD. I like the fact that it works on (unsaved) buffers and might add a similar feature to perl-doc.el.

    So, trying to combine the good parts, <M-x>perl-doc defaults to the thing where the cursor is, respects PERL5LIB to find your POD (also accepts file names), does not need extra programs and has decent formatting which lets you follow links between your documents.

How has your coding style changed over the years?
10 direct replies — Read more / Contribute
by stevieb
on Aug 06, 2022 at 21:42

    Since I started coding C and C++ in 2000, and Perl very shortly afterwards, my style hasn't fundamentally changed. If anything, I've simply become more pedantic about certain things. I'm bored so I thought I'd put a list together off the top of my head. What's your list look like?

    - Four space tabs!

    - I like no space between function parens:

    function($param1, \%param2);

    - I'm very much a K&R style person who likes 'else' type statements on their own line:

    sub function {
        if ($_[0] == 1) {
            do_something();
        }
        else {
            do_something_else();
        }
    }

    - When dereferencing a list, I always use whitespace:

    my @list = keys %{ $href->{thing} };

    - I always use the deref indicator where it is needed:

    my $thing = $hash{one}->{two};
    my $other = $href->{one};

    - In my personal Perl code (60+ CPAN distributions), I always put the POD at the bottom of the file after all code (at $work, it's inline):

    package Blah;

    ...

    1;

    =head1 NAME

    Blah - The thing with the guy and the place

    ...

    - I *ALWAYS* write unit tests before or at least during writing each function/method. I use the test scripts as test runs to prototype the code I'm writing instead of having a separate script. The test scripts are inherently part of the test suite. I also *ALWAYS* review my unit test suite for each piece of functionality and update it if necessary if patching/updating subs days, weeks or years later.

    - I (almost) *ALWAYS* write POD documentation as I'm writing the code (rarely before, but sometimes I do that too).

    - I frequently peruse the documentation of a random publication of my own software (regardless of language), and make fixes or produce relevant updates I may have missed.

    - I use the same editor all the time (vi/Vim) when I don't have my IDE handy, IntelliJ IDEA (with vim support, of course). (VSCode for some of my work projects).

    - I rarely have to use the perl debugger, because I almost always find the root cause of an issue through Data::Dumper statements. If I do use a debugger, it's more often for C code than it is for Perl code.

    - One of my favourite topics for new development is writing code to help other developers (including me). Introspection and unit test software is a big thing for me.

    - I love PPI, and am fluent enough with it that I rarely need to refer to the documentation when I want to use some of its functionality.

    - For my Perl code, I strive with great effort to achieve 95%+ unit test coverage, and that coverage generally covers that 95% 16 ways from Sunday. I often write additional code just so that tests can test code they otherwise just can't cover. This includes special constants, env vars etc. Some of my software has complete routines as add-ons just to get a single statement covered that otherwise couldn't have been.

    - I use Continuous Integration testing (CI) for almost everything. Mostly Github Actions (formerly Travis CI until they pissed me off tremendously), but some of my code can't run there, so I use my own Test::BrewBuild for such software.

    - I used to, but not so much anymore, review CPAN latest releases to try to find bugs to fix, even if it's just documentation.

    - I am very specific about honouring other artists' copyright information. Further, I regard and honour the license of projects I've taken over from other authors. I'm a published author, poet, lyricist and music producer so the copyright thing is ingrained and imprinted. Appreciating others' art ownership isn't a desire to me, it's a core instinct.

    - I am diligent in acknowledging contributors to my software. My Changes files and commits generally have the person's name and/or CVS username embedded.

    - I take criticism very well; that said, I *ALWAYS* give credit where it is due, and *NEVER* claim credit for things I did not myself do.

    - I take bug/issue/feature requests very seriously, and do my utmost to respond in as timely a manner as I humanly can (sometimes I don't, but that's very rare).

    - I use a bug tracker for almost everything I find myself; new features, real life bugs, security issues or even POD typos. If I'm perusing a random distribution of my own and I see a typo in the SYNOPSIS, I create a ticket for it.

    - I never use shift in an OOP module, I always use my ($self, ...) = @_;

    - I *ALWAYS* do parameter validation in each and every function/method.

    - I use pure perl OOP; very, very rarely do I ever use any of the helpers. The only time that happens is if I'm requiring a distribution that has that required already.

    - My POD format for subs is typically:

    =head2 method($param)

    Contemplates the reason for life, using common sense as a modifier.

        my $explanation = My::Continuity->new;
        my $thing = 'Reason for living';
        my $reasoning = $explanation->method($thing);

    I<Parameters>:

        $param

    I<Mandatory, String>: The explanation of the formation of humanity in a single string.

    I<Return>: Hash reference, where each key is a continent, and its value is a colour. C<croak>s on failure.

    - I use comments very sparingly. Almost always my comments within code refer to *why* something is happening. They very rarely (if ever anymore) refer to 'what' is happening. The code says what is happening. Said comments when I make them are generally one to two lines only, and I use them only when I, myself will need to be reminded why in the fsck I did something so bizarre.
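Several of the items above (no shift in methods, parameter validation everywhere, pure Perl OOP, croak on failure) fit into one short sketch. My::Counter and its methods are hypothetical and exist only for illustration; this is one reading of the list, not code from any of the distributions mentioned.

```perl
package My::Counter;    # hypothetical class, for illustration only

use strict;
use warnings;
use Carp qw(croak);

sub new {
    my ($class, %args) = @_;                 # unpack @_ in one go, no shift
    my $start = exists $args{start} ? $args{start} : 0;
    croak "start must be an integer"
        unless defined $start && $start =~ /^-?\d+$/;
    return bless { count => $start }, $class;
}

sub add {
    my ($self, $n) = @_;
    croak "add() requires an integer"        # validate every parameter
        unless defined $n && $n =~ /^-?\d+$/;
    $self->{count} += $n;
    return $self->{count};
}

package main;

my $counter = My::Counter->new(start => 5);
print $counter->add(3), "\n";                # 8
```

Nothing beyond core Perl and Carp is needed here, which is the point of staying away from the OOP helper modules.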

    I'm sure I can add a hundred other particulars I've formed over the years, but that's a start. How about you?

    Edit: Oh dear, I completely forgot. If it isn't blaringly obvious, Perl is my language of choice. Always has been, and I'm sure always will be. I'm decently fluent in C, C++, C#, wrote code in Python for four years as part of a job, can dabble my way through Javascript/JS, but I always lean back to Perl. Need an API for something (eg. Raspberry Pi)?, I'm making it available in Perl! New unofficial API for a new toy (eg. Tesla)? I'm ensuring it can be accessed with Perl! My priorities in life: My health, contentedness and happiness, my sobriety, my wife and children, Perl, everything else :)

RFC: A guide to installing modules for Win32 (2022 Edition)
9 direct replies — Read more / Contribute
by pryrt
on Jul 25, 2022 at 16:05

    Pursuant to Re^3: Can't locate Convert/BER.pm , here is a suggested version for "A guide to installing modules for Win32 (2022 Edition)"

    After any suggested updates (I don't pretend to be an expert, so feel free to correct and nitpick), would this go better as a reply to holli's original A guide to installing modules for Win32, or as a new top-level post in the Tutorials section?

    I included the case-sensitivity section because that was an issue in the recent thread Can't locate Convert/BER.pm, even though I'm not sure how general-use that note really is, or whether it really belongs in this tutorial.


    Updates: Here is a history of the edits made as a result of suggestions.

    1. Fix cpan arguments (per brian_d_foy's reply)
    2. Add perl Makefile.PL to the "standard recipe" (per syphilis's reply)
    3. Switch to a table, to avoid & (avoiding single & per many recommendations)
    4. Fix formatting syntax (per Discipulus's reply)
    5. Add a note about alternative build recipes (to cover Build.PL and possibly others)
    6. Add a note about incompatible modules
    7. s/skill/system/ in the last paragraph (per hippo's comment)

    A guide to installing modules for Win32 (2022 Edition)

    Nearly two decades later, holli's excellent A guide to installing modules for Win32 could use some updated information.

    ActiveState phased out using PPM in 2021 (¹), so starting with ActivePerl 5.28, PPM is no longer included.

    And since 2008 (²), there's been an alternative distribution, Strawberry Perl, which comes with its own gcc/g++ compiler and build environment. Modern Strawberry Perl versions do not require using PPM for installation either (though it still ships with a PPM client, if you can find PPM repositories to use it with), so the original Win32 Guide's PPM instructions are not as useful as they once were.

    ActiveState's ActivePerl

    The modern method of "installing" modules on ActivePerl, as announced in The ActiveState Platform and Perl 5.32, is to make a binary build with the State Tool, where you tell it all the modules you need, and it will provide a binary build with all of those modules (and their dependencies) already installed.

    Strawberry Perl

    One of the benefits of Strawberry Perl is that it includes a working gcc/g++ compiler and build environment, complete with a variant of make (dmake for older versions, gmake for newer versions; see perl -V:make to find out which your copy of Strawberry Perl uses). That means you can easily build and install modules similarly to how it's done on Linux and other operating systems.

    The default CPAN.pm client comes with Strawberry Perl, so installing Some::Module is as easy as cpan Some::Module. Strawberry Perl also comes pre-installed with cpanm, an alternative CPAN client that handles dependencies, allowing installation with cpanm Some::Module. You can install cpanplus or other advanced CPAN clients on Strawberry Perl as well.

    Finally, if you are a traditionalist and want to manually build using the traditional recipe, you can look at the output of perl -V:make and then pick the appropriate variant of the recipe:

    traditional        dmake              gmake
    perl Makefile.PL   perl Makefile.PL   perl Makefile.PL
    make               dmake              gmake
    make test          dmake test         gmake test
    make install       dmake install      gmake install
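If you would rather let Perl pick the right column for you, the following sketch prints the recipe for whatever make your perl was configured with. recipe_for is a hypothetical helper; $Config{make} comes from the core Config module and holds the value that perl -V:make reports.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: build the four-step recipe for a given make program.
sub recipe_for {
    my ($make) = @_;
    return ('perl Makefile.PL', $make, "$make test", "$make install");
}

# Fall back to plain "make" if the configuration does not name one.
my $make = $Config{make} || 'make';
print "$_\n" for recipe_for($make);
```

Run it with the same perl you intend to install modules for, since each installation has its own configuration.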

    Some modules may specify their own recipe for building and installing. If so, then try following their directions; the Strawberry build environment is pretty good. But if they specify a different recipe, it doesn't work, and the CPAN-client options don't work, you should file a bug report with the author, because any distributed module should be installable using cpan, cpanm, cpanplus and the like.

    For any of these installation techniques: If you have an installed copy of Strawberry Perl, your path should point to the Perl and C binaries already; if you have a portable copy of Strawberry Perl, you may need to run portableshell.bat to get the environment set up correctly.

    Other Alternatives

    If you have built your own Perl, or for Windows Subsystem for Linux, or cygwin, or other Windows tools that provide bash or bash-like environments, you should still be able to follow the instructions in the original A Guide to Installing Modules for installing, or use cpanm or cpanplus or other clients not mentioned in the original guide.

    Win32 Caveat: Module Case Sensitivity

    File names on Windows are not case-sensitive, so some Windows users are used to typing PATHS IN ALL CAPS. Do not type module names in all caps, even when using a CPAN client from the Windows command line (cmd.exe or powershell), as Perl and the CPAN tools will not treat SOME::MODULE and Some::Module the same, even if they resolve to the same ...\Some\Module.pm file.

    Windows Incompatibility

    Please note that some modules have been created in such a way that they are incompatible with the Windows operating system. This guide cannot help you install and use a module that is not compatible with your OS. (You can check the CPAN Testers reports linked from the metacpan.org page for each module: if it doesn't show any passing results on the MSWin32 platform, you may have difficulty installing the module.)

I don't like annotation syntax
3 direct replies — Read more / Contribute
by Liebranca
on Jul 13, 2022 at 06:25

    Hello,

    By 'annotations' I mean 'attributes'. And by 'do not like', 'hate' and 'utter disgust that makes me wish for my own unborn demise' I actually mean it just doesn't click right (hey, that's a way to put it).

    And I wonder if anyone actually likes the :annotation(:attribute) syntax because I seldom see it outside of a *certain* OOP proposal that at least to me smells an awful lot like weird C++ fetish roleplay during which someone must've forgotten the safe word.

    Alright, hey, listen. I know that's too harsh. But if I want weird octopus dog for dinner, then I'll just have that. A 'class' keyword and all derived farts incumbent on the ever philosophizing theorymongers of objectification, I really do not care for.

    So why am I going on about this? I'm writing a preprocessor. It doesn't really introduce new syntax, just fake attributes that are removed before the compiler even gets to see them.

    The purpose of these attributes is simple: marking subroutines for inlining and variables as blessed references to one or another package -- just because I need to know where to pull a definition from before I get to inlining anything. It's a pinky-promise, not a typecheck.

    But this syntax...

    my $eyes :bleed = '<-whenever I see :this and :that';

    ... is frankly not my cup of tea. But I *do* wonder who likes it, if anyone, and why. I mean, I get it. I'm an outcast, a widower and I'm pissed. To say I'm out of touch is an understatement. So alright, I'm willing to listen to reason. Where is reason?

    I can transform an entire file, and therefore am entirely in possession of the arcane power to change the language itself into hexspeak spaghetti; I'd much rather work on something other people can read. Or more like, I'd much rather work on something that doesn't require you to basically relearn the language in order to use it.

    It must FEEL like it's still the same thing. That is why I care to do this:

    use inline;
    MyPackage::MySub(); # contents of sub will be inlined

    no inline;
    MyPackage::MySub(); # will not be inlined

    ^because it's what feels more natural, more Perl-esque, far as I'm able to tell. It's the same to perl the binary because the preprocessor in question strips those two use/no lines anyhoo, so they could just be whatever I say. I could just enforce Turbo C on 16-bit Windows rules because that's how I was taught way back when and it makes me feel nostalgic to #include <conio.h> and define functions inside a macro just because I can. Wouldn't that be silly of me?
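The stripping of the use/no lines could be sketched along these lines. This is only a guess at the general shape: strip_inline_pragmas is a hypothetical name, and the assumption is that each pragma sits on a line of its own and toggles a flag for the lines that follow.

```perl
use strict;
use warnings;

# Hypothetical sketch: remove "use inline;" / "no inline;" lines and record,
# for every remaining line, whether inlining was switched on at that point.
sub strip_inline_pragmas {
    my ($source) = @_;
    my $inline = 0;
    my @out;
    for my $line (split /\n/, $source) {
        if ($line =~ /^\s*(use|no)\s+inline\s*;\s*$/) {
            $inline = $1 eq 'use' ? 1 : 0;
            next;                     # the pragma line never reaches perl
        }
        push @out, { text => $line, inline => $inline };
    }
    return @out;
}

my $src = join "\n",
    'use inline;',
    'MyPackage::MySub();',
    'no inline;',
    'MyPackage::MySub();';

for my $l (strip_inline_pragmas($src)) {
    printf "%-8s %s\n", $l->{inline} ? 'inline' : 'call', $l->{text};
}
```

A real preprocessor would then act on the flagged lines; the sketch stops at the bookkeeping.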

    Yes, it would be silly. Sorry, this is a lot of ranting. I'm trying to think of a better way to make the pinky-promise to the preprocessor that a given variable is a blessed ref that isn't just simply adding the pinky-promise keyword and calling it a day. This is driving me crazy.

    If people feel annotations for this are alright then I must defer to their judgement. Mostly because I can't think of anything better that doesn't look like C. And that's that, you know.

    Have a good day.

Good day fellow monks. Back after a while
3 direct replies — Read more / Contribute
by Charles_okhumode
on Jul 13, 2022 at 05:31

    I have been a member of PerlMonks for a while, though my participation here has been on and off. I guess the administrators will have picked up on my inability to recall the username, password, and email of my previous account.

    This is because I have been in and out of hospital, and I misplaced the mobile number I used for the email account, along with the SIM pack, which makes it difficult to retrieve the number.

    Now I am slowly recovering, and sooner rather than later I'll be back on my feet and able to be more active here.

    I consider myself an intermediate Perl programmer and have a working knowledge of Unix/Linux. I am also comfortable with systems administration. My goal now is to focus on web application development using Perl.

    I have begun my search for the right MVC framework for this. I know there are quite a few. Your guidance and mentoring will be gladly welcomed.

    I hope my time in here will be fun and relatively peaceful. I wish to have a friend whom i can grow and share my learning experience with.

    Thanks and God bless.

