Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

This code works successfully from the command line:

    sed 's/^ \{2,\}//' -i filename.html

But that exact same code doesn't run when it's in a Perl script. No error message, the script itself runs, but my file doesn't get changed. The only difference from the command line version is that in the script I enclose the code in backticks and add a semicolon at the end.

I did some troubleshooting and found that my sed command would run properly when called from my Perl script as long as the regexp was simple (e.g., s/blue/green/), but it failed when I either used the ^ operator to signal the start of a line, or the \{2,\} notation to specify 2 or more matches.

I know that I could get the result I want with Perl exclusively by opening the file, reading the contents, applying a substitution regexp, saving the file back to the disk, and closing it, but that seems rather cumbersome when I should be able to do it with a single line.

I do really need to run this from a script rather than the command line, because once I get it working I'll actually be processing thousands of files, and using Perl to evaluate filenames and last mod dates to determine whether a particular file needs to be edited.

What am I doing wrong?

Re: sed regexp works on the command line, but not from Perl script
by Corion (Patriarch) on Dec 18, 2006 at 11:37 UTC

    Why don't you just show us your code instead of describing what you did to convert the code shown to the code you claim to use?

    I assume from your description that your code looks like the following:

    `sed 's/^ \{2,\}//' -i filename.html`;

    This is bad of course, as you don't check for the system result, for error messages or for filename.html being in the current directory. But your current problem seems to lie elsewhere because you describe that your sed invocation works properly as long as the regexp is simple. I don't believe you that the inclusion of ^ was the only thing - you are falling into the trap that backticks act like double quotes and all your backslashes get interpreted in the first round of backslash interpretation by Perl and never get seen by sed. Hence one solution could be to double all your backslashes to make sure that sed sees them.

    A saner solution would be to use system instead of backticks:

    system('sed','s/^ \{2,\}//', '-i', 'filename.html') == 0 or die "Couldn't launch sed: $! / $?";
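
    As a rough sketch of what checking the result could look like: the helper below, run_checked, is a hypothetical name and not part of any module. It relies on the list form of system, which hands each argument to the program without a shell in between, so the backslashes reach the command untouched. The demonstration call runs the current perl ($^X) rather than sed so that it does not modify any file.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list (program plus arguments,
# so no shell quoting round is involved) and die with a readable message if
# it fails.
sub run_checked {
    my @cmd = @_;
    my $rc = system(@cmd);
    if ($rc == -1) {
        die "Couldn't launch $cmd[0]: $!\n";
    }
    elsif ($? & 127) {
        die sprintf("%s died with signal %d\n", $cmd[0], $? & 127);
    }
    elsif ($? >> 8) {
        die sprintf("%s exited with status %d\n", $cmd[0], $? >> 8);
    }
    return 1;
}

# Harmless demonstration: the sed expression arrives in the child process as
# one argument, with its backslashes intact.
run_checked($^X, '-e', 'print "got: $ARGV[0]\n"', 's/^ \{2,\}//');
```

    Assuming a sed that supports -i (GNU sed does), the call for the file in question would be run_checked('sed', '-i', 's/^ \{2,\}//', 'filename.html').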

    Even saner would be to remove the need for sed altogether, and it isn't that long either:

    use Tie::File;

    my $filename = 'filename.html';
    tie my @file, 'Tie::File', $filename
        or die "Couldn't open '$filename': $!";
    for (@file) {
        s/^ {2,}//;    # two or more, as in the sed version
    }
    untie @file;

    Especially if you'll be processing thousands of files, an all-Perl solution will be faster than launching a separate instance of sed for every file.
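
    For the many-files case, a minimal all-Perl sketch might look like the following. The sub names, the ".tmp" suffix, and the "modified within N days" rule are assumptions made up for illustration; the original question does not say how files are selected. The substitution uses s/^ +// on the assumption that the goal is to drop every leading space on every line.

```perl
use strict;
use warnings;

# Sketch: strip leading spaces from every line of one file by writing a
# temporary copy and renaming it over the original.
sub strip_leading_spaces {
    my ($path) = @_;
    my $tmp = "$path.tmp";    # hypothetical temp-file naming scheme
    open my $in,  '<', $path or die "Couldn't read '$path': $!";
    open my $out, '>', $tmp  or die "Couldn't write '$tmp': $!";
    while (my $line = <$in>) {
        $line =~ s/^ +//;     # one or more leading spaces on this line
        print {$out} $line;
    }
    close $in;
    close $out or die "Couldn't close '$tmp': $!";
    rename $tmp, $path or die "Couldn't replace '$path': $!";
    return;
}

# Hypothetical selection rule: *.html files modified less than $days days ago.
# Note that glob splits its pattern on whitespace, so this assumes a
# directory name without spaces.
sub files_to_edit {
    my ($dir, $days) = @_;
    return grep { -f $_ && (-M $_) < $days } glob("$dir/*.html");
}

# Example call, commented out so that loading the sketch changes nothing:
# strip_leading_spaces($_) for files_to_edit('.', 7);
```

    Everything happens in one process, so no sed is launched per file.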

      Thank you very much for your help, and I'm sorry for all my mistakes. Please understand that if I were as versed as you all then I probably wouldn't be writing for help about this.

      There is no need to "assume" what my code looks like, since I described it exactly. I'm sorry that troubled you. In any event, you did list it exactly as I described it.

      When you say, "This is bad, of course...", it was not "of course" to me that it was bad, and if it were then I wouldn't have done it that way.

      On checking whether filename.html is in the current directory, my full program first scans the directory for a list of files to work on, and then gets to work on each one, so I'm reasonably certain that the files are there, as no one else has access to the system besides me.

      As for "I don't believe you that the inclusion of ^ was the only thing," I'm sorry that you think I am lying to you, but I am not.

      When you said, "A saner solution would be to use system instead of backticks," I do not doubt you, but since I am a novice, I do not understand what is "insane" about using backticks.

      I used the Tie::File suggestion and it works for me just fine. After reformatting I was able to get it down to two lines, which I prefer. Thank you very much for that solution. I am still interested in getting the sed version to work too, if for no other reason than that on occasion I will need to process only one file and I am attracted to a one-line solution.

      I don't think I was clear about my goal: It is to remove all leading spaces on *every line* of various files.

      In the following examples, I collected the changed text into an array and printed the array so I didn't actually change the file, although in my final solution I will change it to actually write the file back to the disk.

      This successfully removes all the spaces in the file, as a test, though that is not my ultimate goal:

          @output = `sed 's/ //g' $filename`;

      This removes the very first character, which is a space, but doesn't affect any other lines besides the very first line. This is a problem because I'm trying to process every line.

          @output = `sed 's/^ //g' $filename`;

      In my initial testing I did not realize that sed was indeed processing the first line, because the first character in the file used to be a nonspace, so I didn't see any change after I ran the script. I added an initial space to the file for testing.

      The sed documentation seems to indicate that sed does not allow the "m" flag from Perl-style regexps which I understand to make the ^ operator work on every line, not just the first line. Then again, my understanding is that sed operates line-by-line by default, and indeed it works that way when I run it from the command line. However I can't get it to work properly when I call it from a Perl script using backticks.
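
      On the "m" flag: a small sketch of when it matters in Perl, using a made-up three-line string as sample data. When input is read one line at a time (as sed and perl -p do), ^ is the start of each line and no flag is needed; /m only matters when the whole text sits in a single string. Note also that s/^ //g can remove at most one space per line, since ^ matches only once per line; a pattern such as ^ + is needed to remove them all.

```perl
use strict;
use warnings;

my $text = "  one\n   two\nthree\n";    # made-up sample data

# Line by line: read from an in-memory filehandle, one line per iteration.
my $by_line = '';
open my $fh, '<', \$text or die "Couldn't open in-memory file: $!";
while (my $line = <$fh>) {
    $line =~ s/^ +//;       # ^ is the start of the current line
    $by_line .= $line;
}
close $fh;

# Whole text in one string: without /m, ^ matches only at the very start.
(my $no_m   = $text) =~ s/^ +//g;
(my $with_m = $text) =~ s/^ +//mg;

print "line by line:\n$by_line";
print "slurped, no /m:\n$no_m";
print "slurped, with /m:\n$with_m";
```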

        I'm sorry that my reply put you in such a defensive mindset about your code that you skipped the part where I explained the cause of your problem. I pointed out several other problems as well, and I now see that every sentence explaining the actual problem began with a pointer to yet another problem.

        Let me repeat the essential part of my above reply:

        ... you are falling into the trap that backticks act like double quotes and all your backslashes get interpreted in the first round of backslash interpretation by Perl and never get seen by sed. Hence one solution could be to double all your backslashes to make sure that sed sees them.
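
        A minimal sketch of that effect, using ordinary double-quoted strings as a stand-in for backticks so that nothing is actually executed. The variable names are only illustrative. It also suggests why s/blue/green/ worked: that expression contains no backslashes to lose. Once the backslashes are gone, sed sees {2,} and, in its basic regular expression syntax, treats the braces as literal characters.

```perl
use strict;
use warnings;

# Interpolated, as backticks would do it: \{ and \} lose their backslashes.
my $naive   = "sed 's/^ \{2,\}//' -i filename.html";

# Doubling each backslash leaves one for sed to see.
my $doubled = "sed 's/^ \\{2,\\}//' -i filename.html";

# A non-interpolating q() string keeps the text exactly as typed.
my $literal = q(sed 's/^ \{2,\}//' -i filename.html);

print "naive:   $naive\n";
print "doubled: $doubled\n";
print "literal: $literal\n";
```

        The "doubled" and "literal" lines print the same command as the one typed at the shell prompt; the "naive" line shows the braces without their backslashes.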
Re: sed regexp works on the command line, but not from Perl script
by Jasper (Chaplain) on Dec 18, 2006 at 12:06 UTC
    It seems to me that the escaping backslashes are almost certainly the issue here, no?

    But like the other guy says, why not rewrite it in Perl?
Re: sed regexp works on the command line, but not from Perl script
by swampyankee (Parson) on Dec 18, 2006 at 23:21 UTC

    To add to the groundswell of support for rewriting your sed script into Perl: check out s2p, which will translate sed to Perl.

Re: sed regexp works on the command line, but not from Perl script
by blazar (Canon) on Dec 19, 2006 at 14:08 UTC
       sed 's/^ \{2,\}//' -i filename.html
    [snip]
    What am I doing wrong?

    As others pointed out, you managed to be both verbose and uninformative, so I don't know. But Perl's closest equivalent to the above would be

    perl -pi -e 's/^ {2,}//' filename.html

    Hope it will help you to get started.