in reply to Perl Directive

I'm going to comment with respect to version 2.2 of the Linux kernel here, so some of the following may not apply to your system or even to the modern kernel version. Still, I find it really interesting and I think you will too.

Today Linux's main binary file format is called ELF (Executable and Linking Format). Linux has also supported other binary formats; these are options in your kernel setup.

When you go to run a file as if it were a program, the file is checked to see if you have the execute permission on it. After that check, the kernel tries to match the file to a known format (ELF, a.out, and so on). This matching is done by looking for "magic numbers" somewhere in the file. I think Microsoft DOS .EXE files always start with the first two bytes being the ASCII letters "MZ", for example.
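To make the magic-number idea concrete, here is a minimal sketch in Python. It is not the kernel's code; the table, the function names, and the 128-byte read size are my own choices for illustration, and it only knows the three signatures mentioned in this post.

```python
# Hypothetical table of (offset, magic bytes, format name), for illustration only.
# The "#!" entry comes last, mirroring the order described above.
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF"),
    (0, b"MZ", "DOS/Windows MZ executable"),
    (0, b"#!", "script with interpreter line"),
]


def identify_format(header):
    """Return the name of the first format whose magic bytes match the header."""
    for offset, magic, name in MAGIC_TABLE:
        if header[offset:offset + len(magic)] == magic:
            return name
    return "unknown"


def identify_file(path):
    """Read a small header from the file and classify it (128 bytes is an arbitrary choice)."""
    with open(path, "rb") as f:
        return identify_format(f.read(128))
```

Calling identify_file("/bin/ls") on a typical Linux box should report ELF, while a Perl script with a "#!" first line should be reported as a script.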

The last check in the list of file formats is for a file that starts with "#!" because this has been used for years in Unix-like systems to mean that the kernel should go find the filename specified after the "#!" and run that program. The name of the file you actually tried to execute gets passed as an argument to the process really created from the "#!" filename.
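A rough sketch of that "#!" handling, again in Python and again only an illustration: build_argv is a hypothetical helper, not anything from the kernel. It assumes that everything after the interpreter name is passed along as one single argument, which is how I understand Linux to behave; other Unix-like systems may split it differently.

```python
def build_argv(first_line, script_path):
    """Build the argument list for the interpreter named on a '#!' line.

    Returns None if the line is not a usable '#!' line.
    """
    if not first_line.startswith(b"#!"):
        return None
    # Keep only the first line, minus the "#!" marker and surrounding whitespace.
    rest = first_line[2:].split(b"\n", 1)[0].decode("utf-8", errors="replace").strip()
    if not rest:
        return None
    parts = rest.split(None, 1)
    argv = [parts[0]]                 # the interpreter to run
    if len(parts) > 1:
        argv.append(parts[1].strip()) # assumption: the remainder is one argument
    argv.append(script_path)          # the file you actually tried to execute
    return argv
```

For a script ./hello.pl whose first line is "#!/usr/bin/perl -w", this gives the interpreter /usr/bin/perl, the argument -w, and then ./hello.pl.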

Though it is not often done (that I know of), you can actually tell Linux to match executable file formats based on a filename extension, like DOS and I think Windows do for .EXE files. (I don't know how Windows deals with these things but I think it is based on filename extension. For that matter I don't know how DOS does it either.)

If you're interested in this look for the binfmt_misc.txt file in the Documentation directory in the kernel source tarball. Before you do decide to use it I caution that I don't know of any companies that use it as a standard. The "#!" line is going to be much more portable.
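For the curious, an extension rule for binfmt_misc is a single colon-separated string of the form :name:type:offset:magic:mask:interpreter: where type E means "match by extension" and the magic field then holds the extension without the dot. The sketch below only builds such a string. The helper name and the example rule name are hypothetical, and you should check binfmt_misc.txt for your own kernel version before relying on any of it.

```python
def extension_rule(name, ext, interpreter):
    """Build a binfmt_misc registration string that matches on a filename extension."""
    for field in (name, ext, interpreter):
        if not field or ":" in field:
            raise ValueError("fields must be non-empty and must not contain ':'")
    if ext.startswith("."):
        raise ValueError("give the extension without the leading dot")
    # Layout: :name:type:offset:magic:mask:interpreter:
    # offset and mask are left empty for an extension (type E) rule.
    return ":{}:E::{}::{}:".format(name, ext, interpreter)
```

As I understand the documentation, a rule such as extension_rule("perlscript", "pl", "/usr/bin/perl") would be registered by writing the resulting string, as root, to /proc/sys/fs/binfmt_misc/register.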

Re: Perl Directive
by jonadab (Parson) on Dec 13, 2004 at 13:37 UTC
    to match executable file formats based on a filename extension like DOS and I think Windows does for .EXE files. (I don't know how Windows deals with these things but I think it is based on filename extension. For that matter I don't know how DOS does it either.)

    Matching the filename extension is how it is done in DOS (all versions of which I am aware, not just MS-DOS but also compatibles such as PC-DOS and 4DOS) and in the old 3.x versions of Windows (with the winfile.exe file manager), plus all versions of Win95/98/Me (both command.com and also explorer.exe do it this way). You can test this by renaming an .exe file with a different extension and noting that you can no longer execute it, not even by typing the full path, filename, and new extension, not even with the start command. (There is rundll, but that's another matter.)

    In NT (e.g., WinXP), explorer.exe works exactly the same way in this respect as on Win9x. However, I'm less certain about the cmd.exe command interpreter that comes with NT, since I've barely used it. (By the time WinXP came out, I was pretty well entirely switched over to Mandrake already, so I've only seen it at work, and even there my main workstation is Mandrake.)

    However, as noted, Perl on Win32 still reads the shebang line.


    "In adjectives, with the addition of inflectional endings, a changeable long vowel (Qamets or Tsere) in an open, propretonic syllable will reduce to Vocal Shewa. This type of change occurs when the open, pretonic syllable of the masculine singular adjective becomes propretonic with the addition of inflectional endings."  — Pratico & Van Pelt, BBHG, p68

      In Windows NT/2000/XP, it actually works the same way as Unix, that is, by looking for magic numbers in the file.

      There are actually 3 parts of the system that work together -- the NT kernel, the Windows Shell (aka explorer.exe), and the Command Prompt (aka cmd.exe).

      When you click on a file in the Shell or type its name into the Prompt, the API function ShellExecute() is called, which looks up the file association in the registry and starts the right program to open that file.

      The NT kernel does not care about the file's name. The Kernel is accessed by the API call CreateProcess(), which starts the program by looking for magic numbers (.exe format files start with MZ, .com format files have no magic numbers in them). (The kernel function is actually called NtCreateProcess, but programs should only call CreateProcess because Windows 9x had a different Kernel than the NT family has.)

      You can see this in action by renaming helloworld.exe to helloworld. You will not be able to run it from the Command Prompt, but Perl can still run it with system("helloworld"). If you rename it to helloworld.com, it will still work even though the extension is wrong, because the Kernel does not use the file's name.

        You can see this in action by renaming helloworld.exe to helloworld. You will not be able to run it from the Command Prompt, but Perl can still run it with system("helloworld"). If you rename it to helloworld.com, it will still work even though the extension is wrong, because the Kernel does not use the file's name.

        Interesting. However, if you rename it helloworld.txt and double-click it, Explorer will try to open it in Notepad (by default). So Explorer is still using the old filename mechanism to determine whether it's executable or should open with another program such as Notepad or Perl. But once it determines that it's directly executable (exe, com, or whatever), it then uses another mechanism, different from what Win9x uses. This is actually not surprising, but it's interesting. It's also interesting that Perl's system bypasses the former mechanism, assumes the thing's executable, and jumps straight to the execution system call. One supposes this means that using system in a Perl script to call up a document will not launch the associated application. (One also supposes there's a Win32:: module that will allow you to do that, however, if it's what you want to do. I'm guessing, though.)

        It's also useful in certain edge cases. For example, you can rename an .sxw file to .zip, open it as a Compressed Folder, make a minor change to the XML (e.g., removing an unwanted paragraph, which the OOo UI in certain cases won't let you do, e.g., the only paragraph before or after a table), rename it back, doubleclick it, and it opens in OOo again. On *nix you'd just use the command-line zip/unzip to do this, but that's more cumbersome on Windows, so the ability to manipulate the associations is a nice compensation.

        Out of curiosity, what if you rename an EXE with a .bat extension and double-click it?

