in reply to Help for awk/regex/newbie

The following snippet prints the first token of each
line in the file try.asc.
The first parameter to split contains the token separators, in this
case just a space (ASCII 040).
$file = "try.asc";
open(MYFILE, $file) or die qq(Cannot open $file\n);
while (defined($line = <MYFILE>)) {
    chomp $line;                  # drop the newline so a one-token line prints cleanly
    @tokens = split / /, $line;   # split on a single space (ASCII 040)
    print qq($tokens[0]\n);
}
close(MYFILE);
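One thing to watch for with split / / is that it splits on every single space, so a line that starts with a space yields an empty first token, and tabs are not treated as separators at all. The sketch below compares it with the awk-style split ' '. The two helper names are made up for illustration and are not part of the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helpers, named here for illustration only.

# First field when splitting on a single literal space, as in split / /.
sub first_field_single_space {
    my ($line) = @_;
    chomp $line;
    my @tokens = split / /, $line;
    return @tokens ? $tokens[0] : '';
}

# First field when splitting awk-style: split ' ' skips leading
# whitespace and treats any run of whitespace as one separator.
sub first_field_awk_style {
    my ($line) = @_;
    my @tokens = split ' ', $line;
    return @tokens ? $tokens[0] : '';
}

# Made-up sample lines: plain, indented, and tab-separated.
for my $line ("one two\n", "  indented line\n", "tab\tseparated\n") {
    printf "[%s] [%s]\n",
        first_field_single_space($line),
        first_field_awk_style($line);
}
```

For the indented sample the single-space version returns an empty string, while the awk-style version returns the first word, which is usually what you want when imitating awk's $1.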
Hope this helps

RE: Re: Help for awk/regex/newbie
by cmenser (Initiate) on Aug 14, 2000 at 15:10 UTC
    first off thank you for your quick response, but

    this is what I am currently using. Is there a method of pulling out the matching string from a regex???
      You can use the special variable $& to get the whole string that matched. Like this:
      if (/^\S+/) { print "$&\n"; }
      Or you can put parentheses around the part that you want to extract, and reference the captured parts by $1, $2, etc. Like this:
      if (/^(\S+)\s+/) { print "$1\n"; }
      Both of the cases above extract any non-whitespace characters at the beginning of the line. Although if that's all you want to do, you are probably better off using split as suggested by others in this thread. It's simpler and probably more efficient. Use regular expressions if you only want to match certain lines that satisfy certain conditions.
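A third option, sketched here as an aside, is to evaluate the match in list context: a successful match returns its captured groups, and a failed match returns an empty list, so you never have to touch $& or $1 directly. The sub name first_word is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading run of non-whitespace
# characters, or undef if the line does not start with one.
sub first_word {
    my ($line) = @_;
    # List context: the captures on success, an empty list on failure.
    my ($word) = $line =~ /^(\S+)/;
    return $word;
}

# Made-up sample lines.
for my $line ("alpha beta\n", "  indented\n") {
    my $word = first_word($line);
    print defined $word ? "matched: $word\n" : "no match\n";
}
```

Because the pattern is anchored with ^, the indented sample does not match and first_word returns undef for it.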

      So to strictly emulate the shell/awk command line you gave, you can use this:

      xcommand | perl -nae 'print $F[0],"\n"'
      The -a flag causes it to automatically split each line into whitespace-separated fields, leaving the result in the @F array. The -n option puts a while(<>) { ... } loop around the code. The code itself is specified by the -e option.
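If it helps to see the switches spelled out, here is roughly what that one-liner does, written as an ordinary sub. This is a sketch and not the exact code perl generates: the real -a fills the package array @F and -n loops over $_, while this version uses lexicals and takes filehandles as arguments so it can be tried on sample data. The sub name and the sample input are made up.

```perl
use strict;
use warnings;

# Hypothetical sub: an explicit version of the loop that -n and -a set up.
sub print_first_fields {
    my ($in, $out) = @_;
    while (my $line = <$in>) {                    # -n: loop over every input line
        my @F = split ' ', $line;                 # -a: autosplit on whitespace
        my $first = defined $F[0] ? $F[0] : '';   # a blank line has no fields
        print {$out} $first, "\n";                # the code given with -e
    }
}

# Made-up input, read through an in-memory filehandle.
my $sample = "alpha beta\n\n  gamma delta\n";
open(my $in, '<', \$sample) or die "Cannot open in-memory handle: $!\n";
print_first_fields($in, \*STDOUT);
close($in);
```

To read from a real command instead, pass a handle opened on a pipe in place of the in-memory one.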

      From your question, it seems as if you want to provide the command to execute as input to the program. In that case you could do something like this:

      my $command = "xcommand";
      open(CMD, "$command |") or die "Error: $!\n";
      while (<CMD>) {
          # The leading + stops print from treating the parentheses
          # as its own argument list.
          print +(split(" "))[0], "\n";   # or whatever else
      }
      close(CMD);

      --ZZamboni

      #!/usr/bin/perl -w
      use strict;
      my @cleanarray = <>;
      foreach (@cleanarray) {
          print "$1\n" if m/(\S+)/;
      }
      Update: Fixed the loop to only print if the match succeeds, per merlyn

      Now that I think about it though, a mock awk really ought to look like this:

      #!/usr/bin/perl -w
      use strict;
      my @cleanarray = <>;
      foreach (@cleanarray) {
          # The regex always matches so there's no need
          # for a conditional.
          m/(\S*)/;
          print "$1\n";
      }
      That matches awk's behavior on blank lines better than my first stab.

      -Matt

        Awk! (pun intended).

        Never use $1 unless it's within the context of a conditional based on the success of the match.

        Otherwise, you get the previous match. In this case, an empty line would give you a duplicate "first" word from the previous entry!
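A small sketch of that failure mode, with made-up sub names: a failed match leaves $1 untouched, so reading it without checking the match hands back whatever the last successful match captured.

```perl
use strict;
use warnings;

# Hypothetical demo: match two strings in a row and read $1 without
# checking whether the second match succeeded.
sub unguarded_second_word {
    my ($first, $second) = @_;
    $first  =~ /(\S+)/;    # succeeds and sets $1
    $second =~ /(\S+)/;    # fails on an empty line, so $1 is left as it was
    my $seen = $1;
    return $seen;
}

# Guarded version: only read $1 when the match succeeded.
sub guarded_first_word {
    my ($line) = @_;
    return $line =~ /(\S+)/ ? $1 : undef;
}

my $stale = unguarded_second_word("alpha beta", "");
print "unguarded: $stale\n";

my $safe = guarded_first_word("");
print "guarded: ", (defined $safe ? $safe : "(no match)"), "\n";
```

The unguarded call reports the word from the first string even though the second string was empty, which is exactly the duplicate that the conditional form avoids.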

        -- Randal L. Schwartz, Perl hacker