
RFC: A Primer on Writing Portable Perl Programs

by yumpy (Sexton)
on Nov 01, 2006 at 08:09 UTC

NOTE: I felt the need to start a meditation on this topic to collect ideas from the community, because:

A) I see a strong need for specific guidelines on how to write Perl for portability, as an alternative to hoping that "Perl will just do the right thing" (which ain't always so), and

B) Although I've been writing Perl programs for Unix (Linux, etc.) systems for nearly a decade, my non-Unix experience with Perl is very limited--so I'm eager to tap into the Wisdom of the Monastery for the benefit of improving this Meditation to the point where it becomes a Tutorial.

Patches welcome! 8-}

A Primer on Writing Portable Perl Programs

 

Tim Maher, Consultix

tim@TeachMePerl.com

 

Perl is rightfully famous for being Operating System (OS) “portable”. This means that the Perl language itself can run on a wide variety of OSs, which is certainly a nice feature. But more importantly, it also means that Perl programs written with OS-independence in mind can generally be transported from one OS to another, and successfully run there, without any changes—and that's a fantastic feature.

In this tutorial, you'll learn the basic principles for converting Perl scripts and one-liner commands written for Unix[1] systems into forms usable on other systems. Although the general portability issues we'll discuss apply to all OSs, we'll concentrate on Windows as our example of a non-Unix OS, because of its widespread availability.

On Windows, individual Perl scripts can be run by clicking their associated icons on the graphical display, but Perl one-liners need to be submitted to the Windows “shell” (cmd.exe). For simplicity, we'll restrict our focus to this shell environment for running both scripts and commands. It's accessed by clicking on Start, followed by Run, and then typing cmd, which causes a "DOS"-style terminal window to appear, with a typical prompt of C:\>.

Specific instructions for running Perl programs on systems other than Unix and Windows (VMS, Mac, etc.) may be found in the reference documents cited in section 3, “Additional resources”.

Next we'll discuss the major techniques used to make programs OS portable.

1. Programming for OS independence

In order to reap the benefits of Perl's capacity for OS portability, you must avoid writing code that depends on OS-specific resources or conventions. We'll consider the two most important cases here, having to do with OS-specific commands and pathnames. Additional problem areas are discussed in the resources listed in section 3.

1.1 Avoiding OS-specific commands

Although Perl is a very clever language, it's understandably incapable of executing a Unix-specific statement like this one on non-Unix OSs:

! system "who | grep '$ENV{LOGNAME}' > /dev/null" or
        warn "HELP! I'm not here!";

What's non-portable about this statement? It requires its host OS[2] to have Unix-like who and grep commands, a LOGNAME environment variable, the > redirection symbol, and the /dev/null special device, not to mention command exit codes whose True/False values are opposite to Perl's (which accounts for the !).

But the above example is a rather extreme case; in practice, most of the statements in an average Perl program would likely run on other OSs without change, and many of the remainder could easily be rewritten for OS portability.

Consider this call to system, which is OS specific indeed, yet still amenable to rehabilitation:[3]

# Sample output from "date": Wed Mar 22 16:59:38 PST 2006
# Print date stripped of trailing "<SP>TIMEZONE<SP>YEAR"
# (each \\ reaches sed as a single \, giving its \{3\} interval syntax)

! system "date | sed 's/ [A-Z]\\{3\\} [0-9]\\{4\\}\$//'" or
        warn "$0: command failed!";

There are two major strategies for avoiding OS-specific statements like this one in order to make your code OS portable. The first is to use OS-independent resources, such as features built into Perl or available from modules, in preference to OS-specific ones.

Considering the above example from this perspective, Perl's built-in localtime function could be used to generate a string that's very similar to date's output. The only difference is that it lacks the timezone information, which isn't an issue because it's being discarded anyway. In addition, all of sed's functionality (and more) can be obtained from the use of Perl's built-in substitution operator. These observations allow us to replace the original Unix pipeline with this native Perl code:

$date=localtime;
$date =~ s/ \d{4}$//; # delete "<SP>year"
print $date;            # newline provided by -l invocation option

This produces output identical to that of the date | sed pipeline, but in an OS-portable manner. As an added bonus, the elimination of the Shell-level command pipeline obviates the need for handling its exit code in an OS-independent manner.

The second major strategy for writing OS-independent code is one you should strive to avoid using. This technique involves writing separate chunks of OS-specific code in different branches to handle the OS differences, using Perl's special host-OS-reporting variable, $^O, to select the appropriate branch for execution.

This approach leads to code that takes this form:[4]

if ($^O eq 'MSWin32') {
    # Do Windows-OS specific stuff
}
elsif ($^O eq 'darwin') {
    # Do Mac OS X specific stuff
}
elsif ($^O eq 'linux' or $^O eq 'solaris') {
    # Do Linux/Solaris specific stuff
}
else {
    warn "$0: WARNING: Program might not work on '$^O'\n";
}

For example, the following branches use OS-specific commands to display a long-listing of the file named in the first argument:

if ($^O eq 'MSWin32') {
    system "dir $ARGV[0]";
}
elsif ($^O =~ /ix$/) { # matches our Unix-like OSs: AIX & IRIX
    system "ls -l $ARGV[0]";
}
else {
    warn "$0: WARNING: Program might not work on '$^O'\n";
}
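
Where OS-specific branches like these truly can't be avoided, they can at least be confined to one place in the program. The following is only a minimal sketch of that idea, using a hash keyed on $^O values; the names %list_command and list_command_for are hypothetical, and the table covers just a few of the OS names that perlport documents.

```perl
use strict;
use warnings;

# Hypothetical table mapping $^O values to the command used for a
# long listing; every OS-specific detail lives here and nowhere else.
my %list_command = (
    MSWin32 => 'dir',
    linux   => 'ls -l',
    solaris => 'ls -l',
    darwin  => 'ls -l',
);

# Return the listing command for the given OS name (default: the host OS),
# or undef if the table has no entry for it.
sub list_command_for {
    my ($os) = @_;
    $os = $^O unless defined $os;
    return $list_command{$os};
}

my $command = list_command_for();
if (defined $command) {
    print "On $^O, a long listing uses: $command\n";
}
else {
    warn "$0: WARNING: no listing command known for '$^O'\n";
}
```

With this arrangement, supporting another OS means adding one line to the table rather than another branch at every place the command is needed.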

Techniques based on use of the built-in stat function could provide an OS-portable alternative to the use of these OS-specific commands. But even more conveniently, you could probably find an OS-portable CPAN module that has already solved this problem, and use its resources instead of your own (see http://search.cpan.org).
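
As a rough illustration of the stat-based approach, here is a minimal sketch that reports a file's size and modification time using only Perl built-ins. The helper name long_listing and the output layout are assumptions made for this example; it does not try to reproduce everything that ls -l or dir would show.

```perl
use strict;
use warnings;

# Hypothetical helper: return a one-line "long listing" for a file,
# built only from Perl built-ins (stat, localtime), so that no
# external ls or dir command is needed.
sub long_listing {
    my ($path) = @_;
    my @st = stat $path
        or return;                      # stat returns an empty list on failure
    my ($size, $mtime) = @st[7, 9];     # size in bytes, last-modified time
    return sprintf '%10d  %s  %s', $size, scalar localtime($mtime), $path;
}

# List the file named in the first argument, if one was given.
if (@ARGV) {
    my $line = long_listing($ARGV[0]);
    if (defined $line) {
        print "$line\n";
    }
    else {
        warn "$0: can't stat '$ARGV[0]': $!\n";
    }
}
```

Note that perlport documents several stat fields as being less meaningful on some non-Unix systems, so a portable program should rely only on the fields it really needs.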

Another major source of portability problems is programmers making certain kinds of assumptions about other OSs that may be invalid. We'll see how to avoid making unnecessary assumptions about file-system related differences next.

1.2 Avoiding OS-specific pathnames

There are contexts where Perl expects to find pathnames, such as within @ARGV in programs using the -n or -p options, and as arguments to the stat function. In such contexts, Perl automatically converts slash separators in pathnames into backslashes, if that's appropriate for the host OS. This means you don't have to code separate branches of execution for different OSs (as shown earlier) just to handle that chore.

For cases where Perl can't know in advance that a pathname will be present, such as within the argument for the system function, it's your responsibility to arrange for the slash-to-backslash conversion—along with any other OS-required changes.

For programmer convenience, Perl provides a standard module to help you perform OS-specific pathname conversions, called File::Spec::Functions.

You also must avoid making unfounded assumptions about other OSs, such as whether a particular directory (e.g., /tmp) will necessarily exist there. File::Spec::Functions helps with this task too, by providing a tmpdir function that returns the name of the counterpart for /tmp on the host OS.
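
Here's a short sketch of both uses of File::Spec::Functions. The catfile and tmpdir functions are part of the module; the directory and file names (data, reports, summary.txt, and the scratch file) are made up for illustration.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile tmpdir);

# Build a pathname from its components; catfile joins them with
# whatever separator the host OS uses.
my $report = catfile('data', 'reports', 'summary.txt');

# Ask for the host OS's temporary directory instead of assuming /tmp,
# then build the name of a scratch file inside it.
my $scratch = catfile(tmpdir(), "scratch_$$.tmp");

print "report:  $report\n";
print "scratch: $scratch\n";
```

On a Unix-like system the first pathname comes out as data/reports/summary.txt, while on Windows the same code yields data\reports\summary.txt, with no $^O tests needed in your own program.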

The additional resources cited in section 3 discuss many other important portability issues, along with specific recommendations for dealing with them.

Having covered some important theoretical concerns in "programming for OS portability", we'll now discuss some specific recommendations for making Unix-bred Perl programs portable to Windows systems.

2. Running Perl programs on Windows

We'll begin by discussing the basic techniques for running Perl one-liner commands and Perl scripts on Windows systems.

2.1 Running Perl one-liners on Windows

Many programmers find that the use of Perl one-liners increases their productivity greatly, and Windows users are no exception. However, the vast majority of the one-liners shown in most Perl books will not work if typed, for example, to a Windows cmd.exe shell. That's because the single quotes they use to convey the program code to the perl command are not recognized as quoting characters by that shell.

There are two ways to address this problem:

·   convert the quoting techniques used in the one-liner for compatibility with the target OS's shell[5]

·   convert the one-liner to a script that runs on the target OS

The first solution requires modifying the command's quoting in an OS-dependent manner, while the second avoids the code-quoting issue altogether by enclosing the program code in a file, which will only be read by Perl.

Let's look at a specific example of reworking a command's quoting for compatibility with Windows 2000 or XP (which share the same shell). Consider the following one-liner:

perl -wl -e 'print "Crikey, what a little beauty!";'

This command won't run properly on the Windows systems mentioned; here's the error message from Perl:

Can't find string terminator "'" anywhere before EOF at -e line 1.

That message indicates that Perl did not receive the complete program as the argument for -e, which is a by-product of Windows not treating single quotes as quoting characters.

In trivial cases like this one, where either type of quotes will work around the Perl string, simply swapping the internal double quotes with the external single quotes can fix the problem. That's because double quotes, unlike single quotes, are recognized as quoting characters by Windows, permitting this reworked command to work as intended:

perl -wl -e "print 'Crikey, what a little beauty!';"

In cases where double quotes must be used within the Perl program itself, backslashing allows them to coexist with the shell-level outer double quotes:

perl -wl -e "print \"The arguments are: @ARGV\";"

Next we'll discuss techniques for making scripts more portable.

2.2 Running Perl scripts on Windows

Assuming a script has been written with OS portability in mind as described above, it needn't take much work to get it to run on a non-Unix system.

For instance, on a Windows machine that has a working Perl installation,[6] invoking a script as:

C:\> perl myscript

should be sufficient to run it (assuming the shell is properly configured to know how to find perl).

But the more typical approach is to add a Perl-specific file extension to each Perl script, to allow Windows to invoke perl on it automatically when you type its name to the shell prompt:

C:\> myscript.pl

The association between the .pl (or .plx) extension and Perl is generally created at the time Perl is installed, but if you need to set that up yourself, the instructions are provided in perldoc perlwin32.

When you run scripts using either technique shown above, any invocation options provided on the script's Unix-oriented shebang line will be recognized and put into effect by the perl command. For this reason, you should leave your shebang lines in place when you transfer scripts from Unix to other OSs, despite the fact that they won't be used to locate the perl command itself (as they are on Unix).

Although getting your scripts to execute on Windows should not be a problem, obtaining the benefits of certain services provided by the Unix shells, which are far more sophisticated than their Windows counterparts, may not be so easy.

For example, let's say you wanted to supply filename arguments to a script using “wildcard” characters, as in this Unix command:

$ myscript *.txt

Special techniques would have to be used to arrange for *.txt to be processed properly, as detailed in perldoc perlwin32 (search for “Wild.pm”).
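
One such technique is to have the script expand the wildcards itself. The following is only a minimal sketch along those lines, using Perl's built-in glob function rather than the Wild.pm approach described in perlwin32; the helper name expand_wildcards is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: expand any arguments containing wildcard
# characters, leaving all other arguments untouched. Under a Unix shell
# the arguments normally arrive already expanded, so nothing changes there.
sub expand_wildcards {
    my @args = @_;
    return map {
        my @matches = /[*?]/ ? glob($_) : ();
        @matches ? @matches : $_;    # keep the argument if nothing matched
    } @args;
}

@ARGV = expand_wildcards(@ARGV);
print "$_\n" for @ARGV;
```

One caveat: the built-in glob treats whitespace as a separator between patterns, so arguments containing spaces would need different handling (File::Glob's bsd_glob function is one candidate).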

3. Additional resources

For additional information on writing Perl programs with OS portability in mind, and for running Perl commands on non-Unix OSs, you may wish to consult these resources:[7]

  • perldoc perlport                 # General portability issues
  • perldoc perlwin32                # Windows-specific portability issues
  • perldoc perlos2                  # OS/2-specific portability issues
  • perldoc perlmac                  # Mac-specific portability issues
  • perldoc perlvms                  # VMS-specific portability issues
  • perldoc perlmacosx               # Mac OS X-specific portability issues
  • perldoc File::Spec::Functions    # Useful portability functions
  • perldoc File::Spec               # Functions of File::Spec::Functions
  • perldoc File::Spec::Unix         # Unix-specific pathname info
  • perldoc File::Spec::Win32        # Windows-specific pathname info
  • perldoc File::Spec::Mac          # Mac-specific pathname information
  • perldoc File::Spec::OS2          # OS/2-specific pathname information
  • perldoc File::Spec::VMS          # VMS-specific pathname information
  • perldoc perlrun                  # Invocation options and shebang lines


[1] For our purposes, the term “Unix” refers to actual UNIX systems as well as functionally similar OSs such as Linux and the BSD-based Mac OS X.

[2] The "host" OS is the one that the program is running on.

[3] Because the $ character within the double quotes seems to be introducing a request for the interpolation of the variable $/, the $ needs to be backslashed to be treated as a literal character by Perl.

[4] See man perlport for the name strings that Perl uses for other OSs, such as “MSWin32” for 32-bit Microsoft Windows systems.

[5] A "target" OS is one on which the program is intended to run.

[6] At the time of this writing, the ActiveState corporation (see http://activestate.com) was still the undisputed vendor of choice for high quality and freely available versions of Perl for Windows.

[7] When you're on a Unix system, you could also use the man command to access these documents; however, on another OS, which you're likely to be visiting when you refer to this page, perldoc would be the appropriate command to use.

* Tim Maher, CEO, Consultix | tim@consultix-inc.com *

Replies are listed 'Best First'.
Re: RFC: A Primer on Writing Portable Perl Programs
by xdg (Monsignor) on Nov 01, 2006 at 12:35 UTC

    First, thanks for taking on this subject. Working on Vanilla Perl has been an interesting lesson on just how many common Perl modules weren't written for portability. However, the meditation focuses a lot on Windows, so I'd either consider broadening the OSes covered or else changing the title.

    That said, I found version 0 of the tutorial to be substantially less informative than perlport. If you're looking for a gentler introduction, I'd suggest taking perlport as the base and then translating it into something that is easier for a less experienced programmer to understand. There's a lot more to portability than what you've covered above.

    I suggest you look at win32.perl.org -- there's a good deal of information there that will be of use. Search for "Problem Modules" and you can see a large variety of real-world portability problems in CPAN modules, some of which have been fixed and many of which have not.

    Some other random thoughts that occurred to me in reading the tutorial:

    All that said, I look forward to the next version.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      xdg wrote on 1 Nov 2006:

      Modules I've found helpful: IPC::Run3, File::HomeDir, ExtUtils::Command, Probe::Perl ...

      I have to somewhat emphatically add File::Save::Home to this list, as contrasted with File::HomeDir, which arguably does not really do the right thing on MS Windows.

      At the very least people need to read the POD for each module, study the APIs and rationales and make an educated choice.

          Soren A / somian / perlspinr / Intrepid

      -- 
      Words can be slippery, so consider who speaks as well as what is said; know as much as you can about the total context of the speaker's participation in a forum over time, before deciding that you fully comprehend the intention behind those words. If in doubt, ask for clarification before you 'flame'.
Re: RFC: A Primer on Writing Portable Perl Programs
by brian_d_foy (Abbot) on Nov 01, 2006 at 09:03 UTC

    You say to avoid using $^O, but then recommend using File::Spec, which uses $^O to do what you said not to do. It's not a bad technique. At some point you do have to get down to brass tacks and interact with the operating system, or tell Perl how to behave for the particular operating system.

    --
    brian d foy <brian@stonehenge.com>
    Subscribe to The Perl Review

      True, but I think the point he was trying to make is to avoid using $^O whenever possible. First off, avoid using it in your own code when a suitable CPAN module exists. I'd say the second step would be to abstract out your use of it into your own independent module to localize the OS specific-ness to as few places as possible.

      Nothing is worse than having huge if/else trees of OS specific code littered throughout a program.

      Frank Wiles <frank@revsys.com>
      www.revsys.com

        I agree completely! That is the point I was trying to make.
        *=========================================================================*
        | Dr. Tim Maher, CEO, Consultix
        | Email: tim@consultix-inc.com
        *=========================================================================* 
        
      You say to avoid using $^O, but then recommend using File::Spec

      I took the original advice as another "don't reinvent the wheel" caution. I.e. look for a module that already does things portably rather than try to write it again from scratch. It's a thought that applies more for program authors than for module authors, who do need to concern themselves with OS-specific behaviors if they're planning on distributing things to CPAN.

      -xdg

      Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      To extend that, you could always set up a collection of dispatch tables in your own code, and simply select the appropriate mapping using $^O. Then you code using your SKU mapped functions. It might not be the best, but you could always make it a module, and use it where needed.

      Just a thought,
      -v.

      "Perl. There is no substitute."
      The point of avoiding explicit use of $^O was to prevent cluttering up the code, which isn't an issue when accesses to that variable are confined to a module. But I agree that you can't always avoid OS-specific code, so perhaps the word "minimize" might be a better choice than "avoid".
      *=========================================================================*
      | Dr. Tim Maher, CEO, Consultix
      | Email: tim@consultix-inc.com
      *=========================================================================* 
      
      Don't forget that File::Spec is a core module, tested and maintained to run on all platforms on which Perl is supported.
      You say to avoid using $^O, but then recommend using File::Spec, which uses $^O to do what you said not to do.

      His advice was directed at the user of the File::Spec module, not the author of the File::Spec module. If the File::Spec module is any good (as I understand it is), the end user can simply use it without knowing exactly how it works.

      It's only when a module doesn't work (in the way you want it to) that you're reduced to groveling through the tedious little implementation details to fix whatever's broken.

      At some point you do have to get down to brass tacks and interact with the operating system, or tell Perl how to behave for the particular operating system.

      "Avoid <X>" doesn't mean "don't do <X>". It means "don't do <X> if there's another way of doing things". If there's no other way of doing things, his advice (and your criticism) doesn't apply.

Re: RFC: A Primer on Writing Portable Perl Programs
by rinceWind (Monsignor) on Nov 01, 2006 at 11:39 UTC

    Perl portability is a bigger subject than your meditation gives credit for. Although to my mind you have just scratched the surface, I fully approve of your meditation, and any call to arms on portability.

    I have spoken about this subject at several YAPC conferences, workshops and perlmonger tech talks, see http://www.ivorw.com/talks/perlport.ppt. You're welcome to use any material from these slides, suitably acknowledged. You can quote the URL as I have no plans to move it. Also, if you have any feedback or corrections, please let me know.

    Some specific comments

    Section 1.2 is not correct - it's not just a case of the direction of your slashes, but whether your filename is using native syntax or POSIX syntax. Have a look at VMS filenames for a bizarre native syntax.

    When it comes to running perl scripts and one-liners on Windows, pl2bat is worth a mention, as is PAR. Also, on Windows I nearly always turn off the file association between .pl and perl, as usually when browsing directories and websites, I don't want to run a .pl, I want to look at it.

    --

    Oh Lord, won’t you burn me a Knoppix CD ?
    My friends all rate Windows, I must disagree.
    Your powers of persuasion will set them all free,
    So oh Lord, won’t you burn me a Knoppix CD ?
    (Misquoting Janis Joplin)

      Good comments, and lots of good information in the PowerPoint presentation! Thanks for the input.
      * Tim Maher, CEO, Consultix | yumpy@consultix-inc.com *
      on Windows I nearly always turn off the file association between .pl and perl
      For a different approach (motivated by spending more time on the cmd prompt than in explorer windows) I extend the PATHEXT (system) environment variable with a ";.PL" to be able to call my Perl scripts by name without extension.
Re: RFC: A Primer on Writing Portable Perl Programs
by mirod (Canon) on Nov 01, 2006 at 11:37 UTC

    Just one comment: I found that using the alternate quote syntax makes it much easier to write one-liners, portable or not:

    perl -wl -e 'print qq{The arguments are: @ARGV};'

    Then if you want to use this on Windows, change the outer ' to ", et voilà! No extra-escaping, no having to figure out whether there are interpolated variables in the string or not... It Just Works (tm)

      That's a good point, but at the expense of doubling the number of 'quoting' characters that needs to be typed ("x" vs. qq{x}), I don't think it would catch on as a routine practice. However, as a "best practice" for depicting one-liners meant for portability, I like it!
      *=========================================================================*
      | Dr. Tim Maher, CEO, Consultix
      | Email: tim@consultix-inc.com
      *=========================================================================* 
      
        doubling the number of 'quoting' characters…("x" vs. qq{x})

        In the realm of your discussion it's (\"x\" vs. qq{x}) which makes the score even.

        I don't think it would catch on as a routine practice.

        It is routine practise, but that may be only me. For one I don't like "external" characters inside my perl code (the backslash belonging to the cmd.exe, and is not seen by perl), and secondly it's actually easier to parse when read (for my brain at least).

        Update: The original proposition from mirod came from the -ix side. My remarks are coming from the Win command line ;-)

Re: RFC: A Primer on Writing Portable Perl Programs
by Anonymous Monk on Nov 01, 2006 at 10:06 UTC
    In trivial cases like this one, where either type of quotes will work around the Perl string, simply swapping the internal double quotes with the external single quotes can fix the problem. That's because double quotes, unlike single quotes, are recognized as quoting characters by Windows, permitting this reworked command to work as intended:
    perl -wl -e "print 'Crikey, what a little beauty!';"
    First of all, which kind of quotes you should use isn't a matter of the OS. It's a matter of the shell you are using. And there are good reasons to use single quotes in Unix shells. Because, even in trivial cases as your example, using double quotes won't work as you may expect. See, Perl borrowed a lot from Unix and the shell. A whole lot. Including the difference between single and double quotes. Running your example in bash, a not uncommon Unix shell, gives:
    $ perl -wl -e "print 'Crikey, what a little beauty!';"
    -bash: !': event not found
    And in csh:
    $ perl -wl -e "print 'Crikey, what a little beauty!';"
    ';": Event not found.
    And in zsh:
    $ perl -wl -e "print 'Crikey, what a little beauty!';"
    zsh: no such event: 0
    For this example, using double quotes works fine in the Bourne shell, ash, ksh, and tcsh. Hence, the use of double quotes instead of single quotes isn't portable between shells on a Unix system - good enough reason in itself to use single quotes. Had the example contained an actual scalar variable, the use of double quotes in any Unix shell would have caused a problem: just like in Perl, Unix shells interpolate inside double quotes, and don't interpolate inside single quotes.

    So, given 8 shells, sh, ash, bash, ksh, zsh, csh, tcsh and the standard Windows shell, use of single quotes makes my one-liner work on 7 of them. Use of double quotes is going to fail on several of them if the command line contains an exclamation mark, and is going to fail on 7 of them if it contains a scalar variable. Now, if my goal was to maximize portability, I'd pick the solution that works 7 out of 8 times instead of 1 out of 8 times.

    And guess what? Various shells, including bash, have been ported to different operating systems, including Windows. And having a Bourne like shell is a requirement for POSIX compliance anyway.

      Of course, portability over 7 platforms is only half of the story, as the eighth platform accounts for about 90% of all deployed machines ;)

      (and yes, that number is pulled out of the air and just exists to illustrate the different goals of portability)

      Update: McDarren spotted a typo

        Yes, but of the 90% of all deployed machines, over 99.99% will never have a user issuing Perl one-liners.

        To do statistics in a meaningful way, the number of deployed machines with a certain shell isn't relevant. You'd have to look at the number of users (where a single person working on N platforms counts as N different users) that issue Perl one-liners. And I'm pretty sure that more than 10% of them run non-Windows shells.

        Not to mention that Unix shells have been ported to Windows, and I do know people using Unix shells on Windows (myself included). I've never heard of the Windows shell having been ported to Unix, never mind any one actually running the Windows shell on Unix.

        Oh, and MacOS has gone Unixy as well, so that's another platform that will have a Bourne shell compatible shell available.

        As for VMS and other exotic OSses, my knowledge about them is too limited to comment.
