jpfarmer has asked for the wisdom of the Perl Monks concerning the following question:

In the process of doing research for a class assignment, I've been working on self-printing programs (programs that print out their own source code). The most common basic example I've run across looks like this:

$body = '
$body = %c%s%c;
printf($body,34,$body,34);
';
printf($body,34,$body,34);

I've played with this code quite a bit, but I simply don't understand how it works. I understand the basic call to printf(), but I don't understand why $body needs to be printed twice. I also don't really get what the '34' portion does. I also don't understand why, if I add extra code to $body, it gets printed twice.

I'd appreciate any help you can give me to explain how this works.

Re: Self-printing program
by diotalevi (Canon) on Apr 02, 2004 at 17:17 UTC

    You don't need to make a quine to have a program print its own source.

    open my $fh, "<", $0; print <$fh>;

    Or...

    use O 'Deparse'; do something

    Or without going back to the disk...

    seek *DATA, 0, 0;
    print <DATA>;
    __DATA__

      According to the definition at http://www.nyx.net/~gthompso/quine.htm, a quine is a program that generates a copy of its own source text as its complete output. So you do need a quine to have a program print its own source, since that is exactly the definition of a quine :)

      ----
      : () { :|:& };:

      Note: All code is untested, unless otherwise stated

        I was thinking of the other variation where people stick a copy of the program's source in the program source. I regularly write my perlmonks CGIs to include a method for printing the program's source and I've never stopped to consider that my two-line view_source() function makes the CGI a quine.
Re: Self-printing program
by calin (Deacon) on Apr 02, 2004 at 17:10 UTC

    Search Google for quines. Here's the Wikipedia article.

    Update: The key to understanding how your program works is the printf documentation. Here's the content of the $body variable:

    $body = %c%s%c; printf($body,34,$body,34);

    And now let's see what the printf invocation below does:

    printf($body,34,$body,34);

    The first argument, $body, is the format string. The following arguments (34, $body, 34) are interpolated according to the format conventions. The second and fourth arguments (34 and 34) are treated as ASCII character codes, and the corresponding character, a double quote, is interpolated in place of each %c. The third argument is interpolated as a string in place of the %s. Therefore, the output is equivalent to the concatenation of the following strings:

    ' $body = ' . chr(34) . $body . chr(34) . '; printf($body,34,$body,34); '

    Try printing that and you'll see
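
    One way to check the equivalence described above is to build the string both ways and compare them. This is only a minimal sketch, not part of the original reply: it uses sprintf (which returns the formatted string instead of printing it) in place of printf, and the names $via_sprintf and $via_concat are made up for illustration.

```perl
use strict;
use warnings;

# The contents of $body from the question, written on one line.
my $body = ' $body = %c%s%c; printf($body,34,$body,34); ';

# What printf would emit, captured as a string via sprintf.
my $via_sprintf = sprintf($body, 34, $body, 34);

# The same text assembled by hand: prefix, quote, body, quote, suffix.
my $via_concat = ' $body = ' . chr(34) . $body . chr(34)
               . '; printf($body,34,$body,34); ';

print $via_sprintf eq $via_concat ? "identical\n" : "different\n";
```

    Because $body appears both as the format string and as the %s argument, its text shows up once verbatim (as the quoted string) and once "expanded" (as the surrounding code), which is why it can look as if it were printed twice.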

Re: Self-printing program
by hardburn (Abbot) on Apr 02, 2004 at 17:15 UTC

    (Guessing without running the code)

    $body isn't being printed twice. Notice that in the opening line, there is a quote which isn't being closed. You have to scan to the next single-quote mark to see the whole statement.

    P.S., there is a much subtler way to write a self-printing program (more often called a "quine") in Perl using __DATA__. When you put __DATA__ at the end of your program, Perl's compile phase leaves the filehandle to the source code open and makes it available to the program with a filehandle named DATA. This is normally used for putting static data at the end of your source code, but you can seek this filehandle back to the beginning, read it in, and print it out just like any other file. Here's the code to demonstrate:

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    seek DATA, 0, 0;
    while(<DATA>) { print }
    __DATA__

    ----
    : () { :|:& };:

    Note: All code is untested, unless otherwise stated

Re: Self-printing program
by etcshadow (Priest) on Apr 02, 2004 at 17:55 UTC
    What you've got there is a very basic quine, according to the recursion theorem (which for you listeners is the theoretical basis for a lot of cool stuff).

    An interesting extension of that theorem, which makes fun use of Perl's eval operator, is to build an arbitrarily complex quine with the same simple structure as outlined in the recursion theorem. The difference, though, is that the recursion theorem is based on the idea of breaking up your program into two pieces, each of which knows how to print the other half. This basically involves duplicating all of your code, because you have to write it once where it executes, and once where it prints. Using eval allows you to turn that idea inside out (sort of), and divide your program into two pieces: one that prints both itself and the other half but doesn't execute on its own, and another that executes the first piece. That way, the majority of your code only needs to be written once.

    my $program = <<'END';
    # put any arbitrary code here
    print "my \$program = <<'END';\n${program}END\neval \$program\n";
    END
    eval $program

    (Oh, and by the way... would it help you to understand your code if you knew that 34 was the ASCII character code for the double quotation mark, and that %c is the printf marker for printing a character by its ASCII character code? Good luck... the recursion theorem is fun stuff.)
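
    To see what %c does with those numbers, a short sketch along these lines may help. The variable names are invented for this example; only sprintf, printf and ord are standard Perl built-ins.

```perl
use strict;
use warnings;

# %c turns a numeric character code into the character itself.
my $double_quote = sprintf('%c', 34);
my $single_quote = sprintf('%c', 39);

printf "code 34 gives %s and code 39 gives %s\n", $double_quote, $single_quote;

# ord goes the other way, from a character back to its code.
printf "double quote has code %d, single quote has code %d\n", ord('"'), ord("'");
```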

    Update: one too many newlines caused the quine to grow by an empty line with each run. sloppy.

    ------------ :Wq Not an editor command: Wq

      Your example using eval is excellent. I attempted something similar earlier, but my recursive call wasn't correct. Looking at your example, I understand why.

      The other part of this program is to make the program output an arbitrary range of program lines, and trying to duplicate that code via the method I was using earlier was a real pain. Using this method, it should be much easier, although I wish I could execute code that was in an array directly, because it would make my life easier.

      UPDATE: I implemented my existing code using this method, and it works very, very well. Thank you again for your help!
        Well, while this method is neat, note that it is not exactly the same thing as the recursion theorem. You should really understand the recursion theorem, or you'll probably be sorry in the end.

        The simple way to add arbitrary code to a (recursion-theorem-based) quine is like so:

        $body = ' $body = %c%s%c; foo(); bar("baz"); printf($body,39,$body,39); '; foo(); bar("baz"); printf($body,39,$body,39);
        Note that, when doing it this way (as I mentioned), you end up having to duplicate the code (once to print it, and once to do it).
        ------------ :Wq Not an editor command: Wq
Re: Self-printing program
by Roy Johnson (Monsignor) on Apr 02, 2004 at 17:35 UTC
    Surprisingly, no one has actually answered your question.

    The program is a single printf, taking $body as the format string, and three arguments, which will be substituted in for the %c, %s, and %c in the format string. 34 formatted as a char (%c) is a double quote. $body, as you should see, is defined as a string before the printf.

    So the printf will print everything in $body up to the first %c, then a double quote, then the entire text of $body, then the other double quote, then everything after the final %c.


    The PerlMonk tr/// Advocate
Re: Self-printing program
by matija (Priest) on Apr 02, 2004 at 17:44 UTC
    $body isn't being printed twice. The first parameter of printf is the FORMAT string (perldoc -f printf).

    So the printf prints the text of the format up to the first %c, then the character with the ASCII code 34, then the value of $body (which spans several lines - find the closing quote character), then another character with ASCII code 34, then the rest of the format.

    Note that this is a bug. Code 34 is a double quote, and you don't get the same program. The program you get is non-functional after two steps. It should be 39, not 34.
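
    The difference between 34 and 39 can be checked mechanically. The following is a rough sketch under a few assumptions: the quine is written on a single line with no surrounding whitespace so the comparison is exact, and run_and_capture, $with_39 and $with_34 are names invented here, not anything from the original program.

```perl
use strict;
use warnings;

# Evaluate a piece of Perl source and return whatever it prints to the
# default output handle. (Hypothetical helper for this sketch.)
sub run_and_capture {
    my ($source) = @_;
    my $output = '';
    open my $fh, '>', \$output or die "cannot open in-memory handle: $!";
    my $previous = select $fh;      # printf without a handle now writes to $fh
    {
        no strict 'vars';           # the quine assigns the package variable $body
        eval $source;
    }
    my $error = $@;
    select $previous;
    close $fh;
    die $error if $error;
    return $output;
}

# Same program twice, differing only in the character code passed to %c.
my $with_39 = q{$body='$body=%c%s%c;printf($body,39,$body,39);';printf($body,39,$body,39);};
my $with_34 = q{$body='$body=%c%s%c;printf($body,34,$body,34);';printf($body,34,$body,34);};

for my $source ($with_39, $with_34) {
    my $same = run_and_capture($source) eq $source;
    print $same ? "reproduces itself\n" : "output differs from source\n";
}
```

    With 39 the emitted quotes match the single quotes in the source, so the output is the program. With 34 the output wraps the body in double quotes instead, so it is a different program, and one in which $body would be interpolated the next time it runs.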

Re: Self-printing program
by ambrus (Abbot) on Apr 02, 2004 at 21:37 UTC
    print<< x2,v10
    print<< x2,v10

    And delete the control-m's to get a real one.

Re: Self-printing program
by jpfarmer (Pilgrim) on Apr 02, 2004 at 17:33 UTC

    Thank you all for your suggestions.

    I did try operating on the DATA filehandle, but one of the assignment constraints is that the program adhere to the recursion theorem, which precludes reading the source directly. That's the reason why I'm playing around with this kind of code.