While 99.999% of Perl is completely portable between the various platform implementations (Unix, Win32, Mac, etc), there are a few gimmes and gotchas that always seem to crop up in your application. I've been keeping a running tally in my head, but figured that I should write them down, seeing as a number of posts in the past two months have ended up centering on these issues.

If you know of any that aren't listed here, please either msg me or reply and I'll update. Also, if I've got something wrong or incomplete, tell me as well.

  1. The biggest one that trips people up is newlines. On Unix, a newline is a \n. On Win32, it's \r\n, and on Mac it's \r. chomp only removes what's in your $/ variable (the input record separator, not $\), so if you're on Unix and reading from a file FTP'ed from a Win32 machine, you'll miss the \r. To fix this, do the following:
    my $line = <SOME_FILE>; chomp($line); $line =~ s/\r$//;
  2. On Unix, the four-argument select works on any filehandle, including STDIN and STDOUT, as implied by the perldoc. On Win32, it works only on sockets.
  3. Let's say a SIGINT signal triggers some handler. The handler doesn't do anything and simply ends, returning execution back to where it was before. If another SIGINT is triggered, Unix will handle it by calling the handler again and Win32 will simply die. (Does this mean that Win32 requires the SIGINT handler to be re-set?)
  4. Win32 requires that binmode is called before writing binary data files. Unix does not. (submitted by grinder)
  5. Be careful with case-sensitivity. Unix is case-sensitive, but other OSes, such as DOS and Win32, may not be. (Win32 remembers capitalization, but A.txt and a.txt aren't allowed in the same directory.) (submitted by arhuman)
  6. Be careful with meta-characters in your filenames. Unix and Win32 both accept spaces in their filenames. DOS (for example) does not. Unix accepts most metacharacters, like -$#@^, etc. Win32 may or may not. Other filesystems most definitely will not.
  7. Certain filenames may or may not cause troubles. These include CON, PRN, AUX, and the like. (submitted by arhuman)
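
To make point 1 concrete, here is a minimal sketch of a helper that strips one trailing line ending whichever convention produced it. The name strip_eol is hypothetical, not something from Perl itself, and it only helps once you already have the line in hand: a \r-only file read on Unix with the default $/ still comes back as one long "line".

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing line ending, whether it is
# \r\n (Win32), \n (Unix) or a bare \r (old Mac).
sub strip_eol {
    my ($line) = @_;
    $line =~ s/(?:\r\n|\n|\r)\z//;
    return $line;
}

print join('|', strip_eol("unix\n"), strip_eol("win32\r\n"), strip_eol("mac\r")), "\n";
# prints: unix|win32|mac
```
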
Update: Given that arhuman kindly pointed out perlman:perlport and perlman:perlport2, I'll point people there as well. perlman:perlport is a discussion of the broad differences between the various platforms and perlman:perlport2 is a (rather terse and unenlightening) discussion of the various functions and how they're implemented (or not) on said platforms. I do think that a discussion of just what the differences mean to a given programmer is important.

For example, perlman:perlport2 says that select is implemented only on sockets in Win32. What does that mean to you? Why would you want to select on non-sockets? That's what this Meditation was meant to be about.

Update2: agent00013 remarked that Win32 is not case-sensitive.

------
/me wants to be the brightest bulb in the chandelier!

(tye)Re: Platform Differences
by tye (Sage) on Aug 06, 2001 at 21:19 UTC

    I'll disagree with some of the points provided so far.

    • Use binmode on binary files regardless of platform. The fact that it might be a no-op on some platforms doesn't make it a portability problem. The portability problem is not using it when you should.
    • Avoid signal handlers in Perl. Work has begun on making them reliable, but it isn't available yet.
    • Newlines only cause problems when files are copied between platforms improperly. This again isn't a problem with writing portable scripts. It can be a challenge when writing scripts that tolerate inappropriate newlines no matter what platform they run on.
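
A rough sketch of the binmode advice above: write some bytes that text-mode translation is known to disturb (\x0D, \x0A, \x1A) and read them back, calling binmode on both handles. The sub name roundtrip_bytes and the sample bytes are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write raw bytes to a temp file and read them back, with binmode on
# both handles. On Unix the binmode calls change nothing; on Win32 they
# stop the \n <-> \r\n translation.
sub roundtrip_bytes {
    my ($bytes) = @_;
    my ($out, $path) = tempfile(UNLINK => 1);
    binmode($out);
    print {$out} $bytes;
    close($out) or die "close failed: $!";

    open(my $in, '<', $path) or die "open $path failed: $!";
    binmode($in);
    local $/;                      # slurp the whole file
    my $got = <$in>;
    close($in);
    return $got;
}

my $data = "\x00\x0D\x0A\x1A\xFF";
print roundtrip_bytes($data) eq $data ? "intact\n" : "mangled\n";
```
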

    My suggestions on porting are more along the lines of "use modules" first, Perl second, and avoid everything else. So use IPC::Open3 not fork. Use File::Copy and mkdir not system. Use readdir not qx.
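
As a hedged illustration of "modules first, Perl second": the sketch below makes a directory, copies a file and lists the result without shelling out. The file and directory names are invented, and list_dir is a hypothetical helper; File::Copy, File::Spec and File::Temp are the real core modules.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;
use File::Temp qw(tempdir);

# List a directory with readdir instead of qx(ls) or qx(dir).
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "opendir $dir: $!";
    my @names = sort File::Spec->no_upwards(readdir($dh));   # drop . and ..
    closedir($dh);
    return @names;
}

my $base = tempdir(CLEANUP => 1);
my $sub  = File::Spec->catdir($base, 'backup');
mkdir($sub) or die "mkdir $sub: $!";              # instead of system("mkdir ...")

my $src = File::Spec->catfile($base, 'a.txt');
open(my $fh, '>', $src) or die "open $src: $!";
print {$fh} "hello\n";
close($fh) or die "close $src: $!";

copy($src, File::Spec->catfile($sub, 'a.txt'))    # instead of system("cp ...")
    or die "copy failed: $!";

print join(',', list_dir($sub)), "\n";
```
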

            - tye (but my friends call me "Tye")
      Newlines cause other problems than just cross-platform transfer.
      As read in perlport :

      Due to the "text" mode translation, DOSish perls have limitations of using seek
      and tell when a file is being accessed in "text" mode.
      Specifically, if you stick to seek-ing to locations you got from tell (and no others),
      you are usually free to use seek and tell even in "text" mode.
      In general, using seek or tell or other file operations that count bytes instead of characters,
      without considering the length of \n, may be non-portable.

      If you use binmode on a file, however, you can usually use seek and tell with arbitrary values quite safely.
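
A small sketch of the "only seek to where tell told you" rule quoted above. The sub name reread_second_line and the sample file contents are assumptions for illustration; the point is that the offset is never computed by counting characters.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use File::Temp qw(tempfile);

# Read the second line, then seek back to an offset that tell()
# reported earlier and read it again.
sub reread_second_line {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open $path: $!";
    my $first  = <$fh>;
    my $pos    = tell($fh);            # offset obtained from tell, not calculated
    my $second = <$fh>;
    seek($fh, $pos, SEEK_SET) or die "seek: $!";
    my $again  = <$fh>;
    close($fh);
    return ($second, $again);
}

my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "one\ntwo\nthree\n";
close($out) or die "close: $!";

my ($second, $again) = reread_second_line($path);
print $second eq $again ? "same line\n" : "different\n";
```
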



      "Only Bad Coders Code Badly In Perl" (OBC2BIP)

        For what it's worth, I don't consider this a problem with newlines either. That is just a standard limitation of C's seek() for non-byte-stream files. I never use seek to skip byte offsets. MacOS has a different newline but has byte-stream files. Other operating systems don't even have newlines and don't have byte-stream files and so tell doesn't give you a byte offset anyway (it gives you a record number and an offset within that, for example).

                - tye (but my friends call me "Tye")
Re: Platform Differences
by arhuman (Vicar) on Aug 06, 2001 at 19:52 UTC
    From Memory :
    • Newlines cause trouble (\n differs among platforms and matches 1 or 2 chars...)
    • Case sensitivity for filenames can screw up your uniqueness tests...
    • Some filenames can cause trouble on some OSes
      Avoid CON, PRN, AUX...
      (In fact only one OS was dumb enough to crash when accessing a c:\con\con ;-)

    But probably a lot more in perlport
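
To illustrate the uniqueness point: a minimal sketch that groups names which would collide on a case-insensitive filesystem. case_collisions is a hypothetical helper, and folding with lc is only an approximation of what any particular filesystem does.

```perl
use strict;
use warnings;

# Hypothetical helper: return groups of names that differ only by case.
sub case_collisions {
    my @names = @_;
    my %seen;
    push @{ $seen{ lc $_ } }, $_ for @names;

    my @groups;
    for my $key (sort keys %seen) {
        push @groups, $seen{$key} if @{ $seen{$key} } > 1;
    }
    return @groups;
}

print join(' ', @$_), "\n" for case_collisions(qw(A.txt a.txt b.txt README readme));
# prints:
# A.txt a.txt
# README readme
```
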

    "Only Bad Coders Code Badly In Perl" (OBC2BIP)
Re: Platform Differences
by John M. Dlugosz (Monsignor) on Aug 07, 2001 at 02:18 UTC
    Newlines and chomp are not a big problem, since normally (non-binary) files automatically normalize the newline character to a single "\n" anyway. For most scripts, the difference is invisible.

    And even within a single platform, is a newline character "\n", "\N{LINE SEPARATOR}", or "\N{PARAGRAPH SEPARATOR}"? (It's ironic that the Unicode documentation for the latter two uses the term "unambiguously").

    The ActiveState documentation includes sections on what's different re the built-in functions and semantics.

    Windows has COM/OLE, and the Registry. Porting programs to *nix will have to overcome this lack and approach problems in different ways (e.g. there is a portable Word document reader; is there a Perl interface for it?).

    Re case sensitivity: a.txt and A.txt are allowed in the same directory on NTFS (and possibly network-mounted file systems). But you'll have problems referring to them without using tools compiled with the POSIX subsystem, or special flags passed to CreateFile (the low-level open API) which Perl doesn't do.

    Paths: On Windows, it's a backslash. But slash is taken as an alias most of the time. Perl programs commonly use a slash and ignore this. But what if you have a filename with a backslash in it? Likewise, on a Mac, a file named "notes 12/23/99" is legal--what's Perl going to do with that?
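
One way to sidestep the separator question, sketched with File::Spec: build paths from components and let the module pick the separator. build_path and the component names are made up for illustration, and this does nothing for a component that itself contains a separator character, which is the harder case raised above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical wrapper: join directory components and a file name using
# whatever separator the current platform expects.
sub build_path {
    my (@parts) = @_;
    return File::Spec->catfile(@parts);
}

my $path = build_path('data', 'logs', 'app.log');
my ($volume, $dirs, $file) = File::Spec->splitpath($path);
print "path: $path\nfile: $file\n";
```
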

    Special characters: In Win32, only backslash, slash, and colon are special. No big deal about the rest, just like in *nix.

    Here's a new one: file permissions models differ.

    Details of globbing vary.

    In *nix, globbing is done by the shell. In Windows, the command tail is passed in as a single string. In Mac, what arguments?

    Win32 and Be have multiple data streams in a file. Mac has resource forks. *nix has nada. Beware copying files by simply rewriting the contents!

    *nix can unlink an open file. Win32 can't (not sure if it's inherent or has to do with the flags it uses).

    *nix uses "magic number" at beginning of a file and #! to associate files; Mac has metadata for that; Windows uses extensions, metadata, and/or general content pattern matching.

    Symbolic and hard links.

    Single-rooted file system vs. a forest of individual volumes.

    daemons vs. services (what's on a Mac?)

    Forking as an OS primitive vs. Process model.

    —John

Re: Platform Differences
by marcink (Monk) on Aug 07, 2001 at 16:21 UTC
    Hi,

    Just a quick comment to #3:

    One-shot signal handling (the way you attribute to Win32) is also the traditional way on Unix -- I know it was used on older versions of System V (prior to svr3) and I'm pretty sure that many of the currently used 'nix flavors inherited it. The ugly thing is that even on systems that use reliable signal handlers (like Linux) you can change this behaviour by adding a simple library to an application at linking time.
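
For what it's worth, the usual defensive idiom against one-shot semantics is to have the handler re-install itself first thing. The sketch below assumes a Unix-like system where a process can send itself SIGINT with kill; on a truly one-shot system there is still a window between delivery and re-installation, so this is a mitigation, not a cure. The names on_int and $count are made up.

```perl
use strict;
use warnings;

my $count = 0;

# Re-install the handler before doing anything else, in case the
# platform reset it to the default when the signal was delivered.
sub on_int {
    $SIG{INT} = \&on_int;
    $count++;
}
$SIG{INT} = \&on_int;

kill 'INT', $$ for 1 .. 2;    # send ourselves two SIGINTs (Unix-like systems)
print "handled $count\n";
```
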

    And don't even get me started on SIGCHLD differences :)

    In short, I agree with tye that it's better to avoid using signals, although for a somewhat lower-level reason ;)



    -marcink
Re: Platform Differences
by talexb (Chancellor) on Dec 18, 2001 at 20:48 UTC
    With regard to select ..

    For example, perlman:perlport2 says that select is implemented only on sockets in Win32. What does that mean to you? Why would you want to select on non-sockets? That's what this Meditation was meant to be about.

    Actually, I use select <filehandle> in a Perl script that runs as a CGI on a Windows NT box (ugh) and it works fine. Looking at my code, it appears that it made more sense to open the file (and catch the failure), and then pass the file handle to the subroutine, which saves STDOUT (the web page), selects the new file handle, writes to it, closes it (phew), then restores STDOUT before returning.
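
A rough sketch of the save/select/restore pattern described above. Note that this is the one-argument select, which sets the default output handle, and not the four-argument select that the perlport remark about sockets concerns. write_report and the sample lines are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Send bare print output to $fh, then put the previous default handle back.
sub write_report {
    my ($fh, @lines) = @_;
    my $old = select($fh);       # returns the previously selected handle
    print "$_\n" for @lines;     # bare print now goes to $fh
    select($old);                # restore (normally STDOUT)
    return;
}

my ($fh, $path) = tempfile(UNLINK => 1);
write_report($fh, 'first', 'second');
close($fh) or die "close: $!";
print "back on STDOUT\n";
```
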

    "Excellent. Release the hounds." -- Monty Burns.