cmenser has asked for the wisdom of the Perl Monks concerning the following question:

xcommand | awk '{print $1;}' returns just the first string of each line it is offered.

I need to do the same thing in perl. majikperlcommand(@cleanarray = "xcommand");

I'm sure this can be done with a regex, which I am just starting to delve into. And I don't know how to read out the single string that matches a regex. Or is it better to say "a matching regex that is followed by whitespace"?

Here is an example of the output:

M8-W5 RUN N955 MHO DEVEL mfaden etc....

Replies are listed 'Best First'.
Re: Help for awk/regex/newbie
by merlyn (Sage) on Aug 14, 2000 at 15:08 UTC
    To do it literally as awk does it, use:
    @tokens = split " ", $line;
    That enables "awk emulation mode", causing leading whitespace to be ignored. Without that (for example, with split /\s+/), leading whitespace generates an empty first element, and the first non-whitespace stuff becomes the second element.

    -- Randal L. Schwartz, Perl hacker
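
The difference between the two forms of split can be seen in a minimal sketch like the one below. The sample line is an assumption: the example output from the question with leading spaces added.

```perl
use strict;
use warnings;

# Sample line with leading whitespace (assumed for illustration).
my $line = "   M8-W5 RUN N955";

# A single-space string is the special "awk mode": leading whitespace is skipped.
my @awk_style = split " ", $line;

# An ordinary pattern keeps the empty field that precedes the first separator.
my @regex = split /\s+/, $line;

print "awk style: [$awk_style[0]]\n";               # awk style: [M8-W5]
print "regex:     [$regex[0]] then [$regex[1]]\n";  # regex:     [] then [M8-W5]
```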

RE: Help for awk/regex/newbie
by Yaakov (Novice) on Aug 14, 2000 at 18:03 UTC
    As you use pipes on the command line, I guess you will like the command line switches -e, -p and -n: They allow you to write a short perl program on the commandline, right into your pipes.

    Here are three short solutions for your question. They differ in that the first one prints a final space at the end of the results while the other solutions don't:

    xcommand|perl -wne '/(\S+)/ and print "$1 "'
    xcommand|perl -we 'print join " ", map{/(\S+)/; $1} <>'
    xcommand|perl -we 'print join " ", map{(split" ", $_)[0]}<>'

    Let's explain the tools we used in these three solutions:

    The -w switch does the bulk of the work: It tells me when I did something wrong.

    The -n switch in the first example builds a loop around our main program to process the input line by line: while(<>){ ... the code goes here ... }.

    The -e switch reads the next command line argument and executes it as the perl program.

    In the first solution, the program simply says /(\S+)/;: "Search {//} for one or more {+} non-white-space characters {\S} and remember them all {()}". print "$1 ": print what you have remembered and a space. Remember, this is done for every line of the input.

    The second solution does almost the same thing. Instead of the -n switch, we use map on the list of all input lines <> and join the results with spaces. Thus, we do not print an extra space after the last field.

    The third solution uses the split function instead of a regular expression.

    Note: In a Dos-Window (UUUHHHH-OOOOHHH-EEEEEKS), the examples will not work as given because there the "shell" messes the quotation marks up. You have to use double quotes (") around your code (and you can't use them inside)!
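
As a rough sketch of the map-and-join approach in script form, the snippet below works on an in-memory list instead of <>. The sample lines are assumptions standing in for the output of xcommand, and the conditional inside map is a variation on the one-liner above: it makes a line with no match contribute nothing.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the output of xcommand.
my @lines = ("M8-W5 RUN N955\n", "\n", "  MHO DEVEL mfaden\n");

# Keep the first run of non-whitespace from each line;
# a line without any match returns the empty list and is dropped.
my @first = map { /(\S+)/ ? $1 : () } @lines;

# Joining afterwards avoids a trailing space after the last field.
print join(" ", @first), "\n";   # M8-W5 MHO
```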

RE: Help for awk/regex/newbie
by t0mas (Priest) on Aug 14, 2000 at 15:12 UTC
    split(' ') can be used to emulate awk's default behavior (I think..)
    You can try something like:
    my @cleanarray;
    open(PH,"xcommand|") or die "Can't open xcommand: $!";
    while (<PH>) {
        push @cleanarray, (split(' '))[0];
    }
    close (PH);


    /brother t0mas
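
A variation on the pipe-reading loop, sketched with a lexical filehandle and the list form of open so that no shell is involved. The command here is only a stand-in (an assumption): it re-runs the current perl to print two sample lines, where a real script would name xcommand. The list form of a pipe open is assumed to be available, as it is on Unix-like systems.

```perl
use strict;
use warnings;

# Stand-in for xcommand (an assumption): the running perl prints two sample lines.
my @command = ($^X, '-e', 'print "  M8-W5 RUN\nMHO DEVEL\n"');

# '-|' opens a pipe for reading the command's standard output.
open(my $ph, '-|', @command) or die "Can't run $command[0]: $!";

my @cleanarray;
while (my $line = <$ph>) {
    my ($first) = split ' ', $line;            # first whitespace-separated field
    push @cleanarray, $first if defined $first; # skip blank lines
}
close($ph) or warn "command exited with status $?";

print "$_\n" for @cleanarray;
```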
Re: Help for awk/regex/newbie
by ColtsFoot (Chaplain) on Aug 14, 2000 at 14:51 UTC
    The following snippet will print out the first token of each
    line in the file try.asc.
    The first parameter to split contains the token separators, in this
    case just a space (ASCII 040).
    $file = "try.asc";
    open(MYFILE, $file) or die qq(Cannot open $file\n);
    $line = <MYFILE>;
    while ($line ne "") {
        @tokens = split / /, $line;
        print qq($tokens[0]\n);
        $line = <MYFILE>
    }
    Hope this helps
      First off, thank you for your quick response, but this is what I am currently using. Is there a method of pulling out the matching string from a regex?
        You can use the special variable $& to get the whole string that matched. Like this:
        if (/^\S+/) { print "$&\n"; }
        Or you can put parentheses around the part that you want to extract, and reference them by $1, $2, etc. Like this:
        if (/^(\S+)\s+/) { print "$1\n"; }
        Both of the cases above extract any non-whitespace characters at the beginning of the line. Although if that's all you want to do, you are probably better off using split as suggested by others in this thread. It's simpler and probably more efficient. Use regular expressions if you only want to match certain lines that satisfy certain conditions.

        So to strictly emulate the shell/awk command line you gave, you can use this:

        xcommand | perl -nae 'print $F[0],"\n"'
        The -a flag causes it to automatically split each line into whitespace-separated fields, leaving the result in the @F array. The -n option puts a while(<>) { ... } loop around the code. The code itself is specified by the -e option.

        From your question, it seems as if you want to provide the command to execute as input to the program. In that case you could do something like this:

        my $command="xcommand";
        open(CMD, "$command |") or die "Error: $!\n";
        while(<CMD>) {
            # The leading + stops print from taking the parentheses as its argument list
            print +(split(" "))[0], "\n";
            # or whatever else
        }
        close(CMD);

        --ZZamboni
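
The two ways of reading back a match described above ($& for the whole match, numbered variables for parenthesized groups) can be compared side by side. This is only an illustrative sketch; the two-group pattern is an assumption, and the line is the sample output from the question.

```perl
use strict;
use warnings;

my $line = "M8-W5 RUN N955 MHO DEVEL mfaden";   # sample line from the question

my ($whole, $first, $second);
if ($line =~ /^(\S+)\s+(\S+)/) {
    $whole  = $&;   # everything the pattern matched
    $first  = $1;   # first parenthesized group
    $second = $2;   # second parenthesized group
}

print "whole=[$whole] first=[$first] second=[$second]\n";
# whole=[M8-W5 RUN] first=[M8-W5] second=[RUN]
```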

        #!/usr/bin/perl -w
        use strict;
        my @cleanarray = <>;
        foreach (@cleanarray) {
            print "$1\n" if m/(\S+)/;
        }
        Update: Fixed the loop to only print if the match succeeds, per merlyn

        Now that I think about it though, a mock awk really ought to look like this:

        #!/usr/bin/perl -w
        use strict;
        my @cleanarray = <>;
        foreach (@cleanarray) {
            # The regex always matches so there's no need
            # for a conditional.
            m/(\S*)/;
            print "$1\n";
        }
        That matches awk's behavior on blank lines better than my first stab.

        -Matt
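
One subtlety may be worth checking before relying on either pattern: \S* is allowed to match the empty string, so on a line that begins with whitespace it matches nothing at position 0, whereas awk skips leading whitespace when it splits fields. The sketch below compares the two regexes with an awk-style split; the sample lines are assumptions chosen to cover a normal line, a blank line, and an indented line.

```perl
use strict;
use warnings;

# Hypothetical sample input: a normal line, a blank line, an indented line.
my @lines = ("M8-W5 RUN\n", "\n", "   MHO DEVEL\n");

my (@star, @plus, @via_split);
for my $line (@lines) {
    $line =~ m/(\S*)/;                          # always matches, possibly the empty string
    push @star, $1;

    push @plus, $line =~ m/(\S+)/ ? $1 : '';    # first non-blank run, '' on a blank line

    my ($field) = split ' ', $line;             # awk-style split
    push @via_split, defined $field ? $field : '';
}

print "star:  [", join("][", @star), "]\n";       # star:  [M8-W5][][]
print "plus:  [", join("][", @plus), "]\n";       # plus:  [M8-W5][][MHO]
print "split: [", join("][", @via_split), "]\n";  # split: [M8-W5][][MHO]
```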

Re: Help for awk/regex/newbie
by lindex (Friar) on Aug 14, 2000 at 18:05 UTC
    Just my 2 cents (ignore if you like), but this is how I did it.
    #!/usr/bin/perl -wn
    use strict;
    print((split(' '))[0],"\n") unless($_ =~ /^\s+$/);



    lindex
    /****************************/ jason@gost.net, wh@ckz.org http://jason.gost.net /*****************************/