It is with trepidation that I propose this old chestnut.

I am aware that the collective wisdom of the likes of Tom Christiansen and the Llama book is that it is not worth the effort to code a straight language conversion between (say) Korn shell and Perl.

The reasons seem to be that a) most of the functionality of a shell script is done by external programs, and b) the languages are so different that a different approach/design is required.

Valid points, although a well-written shell script will not use more external programs than it needs, and language differences have not prevented a2p, s2p, or find2perl.

Well I'm giving it a go anyway – someone should.

I want to take the tedium out of converting a shell script. The interesting and rewarding parts of shell to Perl conversion (for me) are mostly taking the multiple (and often unnecessary) calls to external programs and replacing them with fast Perl constructs. I'm happy doing that manually, or using various CPAN modules.

That's not to say that automatic conversion of many common external program calls is not possible.

What really bores the pants off me is converting [ and [[ to (; 'then', 'fi', 'do', and 'done' to curlies; removing the 'in' from a 'for'; changing -gt to > and > to gt; replacing rm with unlink and -n with 'defined'; putting a $ on the left side of an assignment; and so on, ad nauseam. Just don't ask about the trailing semi-colon.
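
To make the grind concrete, here is a hand-converted fragment (my own illustration, not sh2p output; the variable names are made up):

    # Korn shell
    count=0
    if [ $count -gt 10 ]; then
        rm "$tmpfile"
    fi

    # Perl
    $count = 0;
    if ($count > 10) {
        unlink $tmpfile;
    }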

Right now I can convert and run some scripts without manual intervention, but I still have a long way to go. This is a huge job: I have been playing with a basic language parser and converter for about six months (off and on) and am still at an early stage.

Time to pause again to wonder how many others have trod this path. I can see footprints and traces of torn clothing and blood on the trail.

Cue a monk with a link to someone who did this ten years ago.

Update: After encouragement from other monks, and much gnashing of teeth, I have uploaded an alpha version to CPAN as App::sh2p.

9th October 2008: Now on version 0.04, and it has gone beta.

Replies are listed 'Best First'.
Re: RFC: UNIX shell to Perl converter
by merlyn (Sage) on Oct 09, 2006 at 15:10 UTC
    Well, the most trivial step that would do what you ask was done by me as an April Fool's Joke about a decade ago: the sh2perl translator. However, it's basically a punt, because it puts the entire original script inside a call to system.
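
    In spirit, the whole of the output amounts to no more than this (a paraphrase, not sh2perl's literal output):

        #!/usr/bin/perl
        # punt: hand the entire original script straight back to the shell
        my $script = '...the original shell script text...';
        exec '/bin/sh', '-c', $script or die "exec failed: $!";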

    Doing anything more really doesn't gain you much, because the shell doesn't do much, and you wouldn't be able to eliminate a single fork. So you'd be replacing one launcher with another. Why bother?

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      With respect, the shell can do quite a lot. Korn shell and Bash are more than just program launchers; Bash 3, for example, supports extended regular expressions. Whole applications are written in Korn, however misguided that might be.
      I agree that converting a shell script that just called external programs into a Perl script that called the same external programs would be a waste of time, but I hope to replace common utility calls with Perl equivalents.
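
      For example (hand-written, with made-up variable names), a forked grep -c "$pat" "$file" can become in-process Perl:

          open my $fh, '<', $file or die "cannot open $file: $!";
          my $count = grep { /$pat/ } <$fh>;   # grep in scalar context counts matches
          close $fh;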
        No, you can't really eliminate any forks, because you don't know who needs to know $? at the right time, or depends on a certain number of kids being present.

        It's like any translation problem... there's the surface syntax, which might appear relatively easy to translate, but then there's the deep semantics: all the side effects of the steps. So you either have to emulate the deep semantics precisely, or you have to analyze and understand enough of the rest of the program to know what you can avoid emulating. Ugh, on either side.
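
        (A contrived illustration of the trap: replace grep -q with an in-process match and you still have to emulate grep's exit status for any later test of $?.)

            # shell:  grep -q root /etc/passwd
            #         [ $? -eq 0 ] && echo found
            # the in-process version must fake the status itself:
            open my $fh, '<', '/etc/passwd' or die $!;
            my $status = (grep { /root/ } <$fh>) ? 0 : 1;   # grep's exit convention
            close $fh;
            print "found\n" if $status == 0;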

        -- Randal L. Schwartz, Perl hacker
        Be sure to read my standard disclaimer if this is a reply.

      #!/bin/ksh
      ####### Error Log Script ##############################################################
      #######################################################################################

      function find {
          flag=0
          if [ ! -d ~/logs ]; then
              mkdir ~/logs
          fi
          date=`date +%Y-%m-%d-%s`
          while true
          do
              grep -ni "$keyword" $source | while read line
              do
                  if [ "$(ls -A ~/logs 2> /dev/null)" == "" ]; then
                      echo $line >> ~/logs/sanity-$date.log
                      flag=1
                  else
                      result=`cat ~/logs/* | grep -F "$line"`
                      if [ "$result" != "$line" ]; then
                          echo $line >> ~/logs/sanity-$date.log
                          flag=1
                      fi
                  fi
              done
          done
          if [ $flag == 1 ]; then
              echo "#####################################################################################"
              echo "Errors or Exception found. Extraction is copied in the file ~/logs/sanity-$date.log"
              echo "#####################################################################################"
          else
              echo "###################"
              echo "No new logs Found"
              echo "###################"
          fi
      }

      action=$1  #Start or Stop
      dir=$2
      file=$3    #Source filename
      keyword=$4 #Search String

      case $action in
      start)
          if [ "$dir" == "" ]; then
              echo "Enter the path"
              exit
          fi
          if [ ! -d $dir ]; then
              echo "Directory not found"
              exit
          fi
          if [ "$file" == "" ]; then
              echo "Enter the filename"
              exit
          fi
          if [ ! -f $dir/$file ]; then
              echo "No File found"
              exit
          fi
          source=$dir/$file
          if [ "$keyword" == "" ]; then
              echo "Enter a keyword for search"
              exit
          fi
          find
          exit
          ;;
      stop)
          echo "killed"
          exit
          ;;
      *)
          echo '######################################################################'
          echo 'Usage: sh parser.sh [start/stop] [Directory path] [Filename] [Keyword]'
          echo 'Sample: sh parser.sh start /var/log/ messages "test"'
          echo '######################################################################'
          exit
          ;;
      esac
Re: RFC: UNIX shell to Perl converter
by NovMonk (Chaplain) on Oct 09, 2006 at 17:08 UTC
    Funny you should mention this, because that's just what my company is about to do. Here's why:

    We have a Boatload of shell scripts that control various parts of our processes. No standards, very hard to find what's broken when something changes, etc. We want to replace the functionality of all these scripts with Perl modules and an overarching Perl program to call them, and we want to incorporate testing modules into the process up front so that we can enforce standards, know what's broken when we make changes, and make it easier for new folks to see what's going on.
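
    (As a sketch of what that overarching program might look like — the job names and subs are invented stand-ins for the real modules:)

        #!/usr/bin/perl
        use strict;
        use warnings;

        # stand-ins for the real job modules
        sub run_backup  { print "backup done\n" }
        sub run_cleanup { print "cleanup done\n" }

        my %jobs = (
            backup  => \&run_backup,
            cleanup => \&run_cleanup,
        );

        my $name = shift @ARGV or die "usage: runner.pl <job>\n";
        my $job  = $jobs{$name} or die "unknown job '$name'\n";
        $job->();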

    I'd love to see if anyone has done something to automate some of the process, but I imagine we'll be doing a lot of the conversion manually, simply because there's a lot of accumulated garbage that needs to be cleared away.

    Which is a long way of saying, good luck, and I look forward to other answers.

    NovMonk
      Sounds like a fun project with lots of pain. :-)

      How about patching 'bash' to show what is done? There might even be some debugging stuff in there already that could be used.

      Then you could get a log of:

      Line X: run program "a" and store output into "b".
      Line X+1: Do something if $? == c.
      It should make it easier to generate Perl code and you could get the return codes, etc.

      Hmm... you could generate Perl programs that just did 'system' calls and then rewrite them piece by piece.
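
      Something like this, perhaps (a hand-rolled sketch; the program names and the status tested are invented):

          #!/usr/bin/perl
          use strict;
          use warnings;

          # Line X: run program "a" and store output into "b"
          system('a > b');
          my $status = $? >> 8;

          # Line X+1: do something if the exit status was c
          if ($status == 0) {
              print "a succeeded\n";   # stands in for the real branch
          }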

        Yep, I looked at that. I also considered 'lifting' the parser from Bash rather than writing my own; it seemed daft to write a parser when there was already a perfectly good one there. Unfortunately, I found that it was not so easy to extract as a separate entity: it is integrated into the whole product. Someone like Chet Ramey could probably do it, not me.
        As for the pain, this is working out to be almost as good as whipping myself with nettles - Ahhhhh. (No nettles were harmed in the writing of this post).
      Thanks for the encouragement; yours is exactly the kind of project this is for. Fancy helping with the testing later?
      It seems to me you've got 3 options there:

      1. auto-translate;
      con: GIGO, i.e. cruddy shell input => cruddy Perl output.
      pro: quick/low effort

      2. Hand code new Perl modules as you go.
      con: Takes a while, same cruddy architecture as original
      pro: you'll get good, maintainable Perl that does logging etc. It also means you can replace shell files gradually, instead of taking the big-bang approach.

      3. Design a new/better ctrl system in Perl from scratch.
      pro: Gives you better code and more integrated approach.
      con: more effort upfront & big bang approach

      I once had to do a FORTRAN to C job (Y2K) and we originally tried option (1.), but I ended up doing a fair bit of (2.) to make it work properly.
      A straight option (2.) approach might have been quicker in the long run; otherwise you end up with weird-looking code due to the auto-converter.

      HTH Cheers
      Chris

Re: RFC: UNIX shell to Perl converter
by Anonymous Monk on Oct 10, 2006 at 10:23 UTC
    The interesting and rewarding parts of shell to Perl conversion (for me) are mostly taking the multiple (and often unnecessary) calls to external programs and replacing them with fast Perl constructs.

    Here you are suggesting that using a shell and calling other programs is necessarily (much) slower than doing it all in Perl. I beg to differ. Sure, if all your program is doing is calling a different program hundreds of times, and each time it just takes a fraction of a second to run, the shell loses because of the forking overhead.

    But any Perl program starts at a disadvantage: the start-up time of a Perl program is larger than that of a shell script. And the shell typically calls utilities that have been written in C and optimized for decades. Your system's 'grep' and 'sort' will typically be much faster than the Perl equivalents; the Perl equivalents pay the price of being more powerful, and of having to deal with Perl variables.

      Good point, and I agree, but it's a question of scale. When these programs are called in loops (as they often are), it is not unusual for a script to fork hundreds or thousands of child processes. I agree that a well-written shell script can be more efficient than a Perl program, particularly when the Perl program loads modules. If you have such a script, I doubt it would be a candidate for conversion.
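
      To make the scale point concrete (a contrived pair, not a benchmark; the file name is invented), a shell loop like this forks a subshell and a grep for every line of input:

          while read line; do
              echo "$line" | grep -q ERROR && hits=$((hits + 1))
          done < big.log

      whereas the Perl equivalent does the match in-process, with no forks at all:

          my $hits = 0;
          open my $fh, '<', 'big.log' or die $!;
          while (my $line = <$fh>) {
              $hits++ if $line =~ /ERROR/;
          }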
Re: RFC: UNIX shell to Perl converter
by halley (Prior) on Oct 11, 2006 at 18:59 UTC

    One group where I worked did such a script, to push a bunch of 'csh' scripts into the new Millennium. We wrote a simplistic mechanical translator. The scripts were a bunch of cut-and-paste monstrosities, and the results were just as cruddy. Don't expect human-readable, maintainable results.

    One word of advice: anything that requires punting on the translator side should at least KNOW that it requires punting. Output a comment with '#REVIEW:' next to any surely-incomplete translated statement, for example. Then you have a little less work to do in hunting down the worst problems with the translated scripts.
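
    For instance, the generated file might carry markers like this (an invented example, not any particular translator's output):

        #REVIEW: could not translate ksh co-process on input line 42; punted to the shell
        system('ksh', '-c', q{mycmd |& read -p reply});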

    The rest of the problems will lurk in obscurity forever.

    --
    [ e d @ h a l l e y . c c ]