elwoodblues has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

this is probably a simple one, but like most simple things, it has me stumped

when I run this from the command line, it works fine

cat text.txt|wc -l| sed 's/^[ \t]*//'|sed 's/[ \t]*$//'
but when I call it from perl with a system call
#!/usr/bin/perl
system( "cat text.txt|wc -l| sed 's/^[ \t]*//'|sed 's/[ \t]*$//'" );
I get an error sed: 1: "s/ \t*\n/": unterminated substitute pattern

Yeah, I know there is an obvious mistake, but I just can't see it. Could someone please point out the obvious?

Re: works on command line, but not from perl
by graff (Chancellor) on Feb 12, 2010 at 02:19 UTC
    When you put that string into the system call, you have to escape the "$" and the backslashes in the sed args:
    system( "cat text.txt|wc -l| sed 's/^[ \\t]*//'|sed 's/[ \\t]*\$//'" );
    (not tested). The point is that perl will interpolate "\t" and "$/" before they go to the shell, unless you put the necessary escapes in the perl script.
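
To see the interpolation for yourself, a minimal sketch like the following prints what Perl would actually hand to the shell under each quoting style. The variable names are made up for illustration, and only the second sed command is shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Double quotes: Perl expands \t to a tab and $/ to the input record
# separator (a newline by default) before the shell ever sees the string.
my $interpolated = "sed 's/[ \t]*$//'";

# Escaping the backslash and the dollar sign keeps both literal.
my $escaped = "sed 's/[ \\t]*\$//'";

# q{} (like single quotes) does no variable or \t interpolation.
my $single = q{sed 's/[ \t]*$//'};

print "interpolated: $interpolated\n";
print "escaped:      $escaped\n";
print "q{}:          $single\n";
```

The first string comes out with a real tab and a line break in the middle of the sed expression, which is what sed is complaining about. The other two should be identical to what you typed on the command line.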

    Curiously, I don't see anything being done with the results of that command line. If you were intending to use the output in your perl script, you should be using backticks. Otherwise, you should be redirecting the output to some file or something.

    Anyway, why not save yourself all that shell overhead, and do those operations with perl code instead of sed?
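
As a rough sketch of both suggestions together (capture the output with backticks, then trim it in Perl instead of sed), something like this might do. The helper name shell_line_count is hypothetical, and it assumes wc is on your PATH and that the file name contains nothing the shell would treat specially.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run wc through the shell and tidy its output in Perl.
sub shell_line_count {
    my ($path) = @_;

    # Backticks return the command's standard output; system() would only
    # return the exit status. The path is interpolated into the shell
    # command, so this is only safe for simple file names.
    my $count = `wc -l < "$path"`;
    die "wc failed with status $?" if $? != 0;

    # Trim leading and trailing whitespace; this replaces both sed calls.
    $count =~ s/^\s+|\s+$//g;
    return $count;
}

# Usage: perl script.pl text.txt
print shell_line_count($ARGV[0]), "\n" if @ARGV;
```

This still starts a shell and a wc process for every call, so it is a halfway step rather than a replacement for doing the counting in Perl.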

    (updated to remove extraneous code tag and fix spelling errors)

      Thank you. I cut the problem down to the minimum before posting, taking out the other stuff I was doing with the return value, etc.
Re: works on command line, but not from perl
by roboticus (Chancellor) on Feb 12, 2010 at 09:41 UTC

    elwoodblues:

    This is a time when spending a little bit of time thinking about the error messages can be helpful. The error message is telling you that sed doesn't like the pattern you gave it. Specifically, sed is complaining that it doesn't understand "s/[ \t]*\n/". But in your code you're attempting to tell sed the pattern "s/[ \t]*$//". So somewhere, the character string "$/" is being translated to "\n". So you need to read about variable interpolation and the special variables.

    Variable interpolation is when perl replaces a variable's name with the variable's value in certain locations (such as a double-quoted string). You can turn off the interpolation by changing the quoting method you use. In this case, changing your command to the following would do the trick:

    system(q{cat text.txt|wc -l| sed 's/^[ \t]*//'|sed 's/[ \t]*$//'});

    If you read about the special variables (perldoc perlvar), you'll find that $/ is the input record separator, and you might notice that the transformation from $/ to \n makes a little more sense.

    Anyway as graff mentioned, you generally want to understand what the original is doing rather than blindly trying to convert the script. In this case, the original script performs several steps:

    1. cat text.txt| copies the file text.txt to the standard output stream. (See Useless use of cat award)
    2. wc -l| counts the number of lines in the input stream. (The reason cat is useless is that wc will happily accept the filename on the command line, saving the overhead of an entire process.)
    3. sed 's/^[ \t]*//'| uses sed to trim leading spaces and tabs from the beginning of each line, and
    4. sed 's/[ \t]*$//' uses sed to trim trailing spaces and tabs from the end of each line. Another process could be saved here by combining the sed commands.

    So, overall, the entire command string simply prints the number of lines in the file text.txt. Now that you know what the command string does, you could then try to convert that to perl. It's relatively easy. You first need to open the file:

    open my $INPUTFILE, '<', 'text.txt' or die "Can't open file! $!";

    Then read each line in the file, incrementing a counter for each line you read:

    my $line_count = 0;
    while (<$INPUTFILE>) {    # read a line
        ++$line_count;        # increment counter
    }

    Then close the file to clean up after yourself:

    close $INPUTFILE or die "close error: $!";

    And now you're left with the line count in $line_count, so you can proceed with the next step in your script.
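
Put together, the three steps above could be wrapped up roughly like this. The subroutine name count_lines and the lexical $in are just illustrative choices, not anything from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the open / read / close steps.
sub count_lines {
    my ($path) = @_;

    open my $in, '<', $path or die "Can't open $path: $!";

    my $line_count = 0;
    while (<$in>) {       # read a line
        ++$line_count;    # increment counter
    }

    close $in or die "close error: $!";
    return $line_count;
}

# Usage: perl script.pl text.txt
print count_lines($ARGV[0]), "\n" if @ARGV;
```

One difference to be aware of: wc -l counts newline characters, while this loop counts a final line even if it has no trailing newline, so the two can differ by one on such files.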

    So by taking graff's advice, you'd learn more about what your original script is doing, and you'd learn more about perl faster as well when you write perl code to solve the intermediate steps rather than copying a shell script without understanding it.

    While your way may be a bit faster to get a working script, it'll be a longer road to truly learning perl.

    ...roboticus

      Then read each line in the file, incrementing a counter for each line you read

      Perl already does that for you:

      1 while <$INPUTFILE>;
      my $line_count = $.;

        jwkrahn:

        Yeah, I was aware of that. I just didn't want to introduce him to too much magic at once. My response was already long enough. ;^)

        ...roboticus

Re: works on command line, but not from perl
by ahmad (Hermit) on Feb 12, 2010 at 02:20 UTC

    Why don't you do it in perl instead of on the command line?

    There is more than one way to do it

    my $file = 'text.txt';
    open my $fh, '<', $file or die $!;
    my @data = <$fh>;
    close $fh;
    print scalar @data;

    If you still insist on doing it through the command line, then this might work

    system( 'cat text.txt|wc -l| sed \'s/^[ \t]*//\'|sed \'s/[ \t]*$//\'' );
      Thank you. This is part of a much larger bash script that I'm porting. Yes, doing it entirely in perl was the long-term goal, but in the first instance, I just wanted to get it running, even using system calls, as it saved me time (or so I thought).
        This is part of a much larger bash script that I'm porting.

        Then one of the first things you should be doing is understanding what the shell script was trying to do, instead of just wrapping "system()" calls around things.

        The idea is to get the same things done with code that is shorter, more efficient, easier to understand, and more resilient to a wider range of edge and error conditions. Using four independent processes in a system call just to get a line count on a file (or things to that effect) seems like the wrong way to go about it.