bwgoudey has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,
I'm reading a file line by line and splitting each line into words. I then want to take each word, run it through a filter program like tr or sed, and receive the output of that command as another string which I can then continue to process.

I tried implementing this in C with multiple pipes and it worked, but it all got a little complicated. I am not really great with Perl, but I was wondering if there is a nice way of doing it that anyone knows about.

Cheers, Benjamin


Replies are listed 'Best First'.
Re: Running a string through a filter program and receiving the output as a string
by davidrw (Prior) on Sep 30, 2005 at 02:39 UTC
    Here's a basic outline -- you might want to modify the split line.
    open FILE, "<", $file;
    while (<FILE>) {
        chomp;
        my @words = split;
        foreach my $word ( @words ){
            # my $newword = `sed $word`;         # UPDATE: original lazy typing
            my $newword = `echo $word | sed`;    # UPDATE: more realistic
            print $newword, "\n";
        }
    }
    close FILE;
    What exactly is the tr/sed command doing? I suspect that there's a native perl way to do it (e.g. see perldoc -f tr).

    As for using pipes instead of backticks, see the Tutorials and perlipc ...
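    For instance, a one-way pipe opened with open's '-|' mode avoids backticks entirely. This is only a minimal sketch: the `tr a-z A-Z` filter and the literal word are stand-ins for whatever command and data you actually have.

    # Read the filter's output through a pipe rather than backticks.
    my $word = 'hello';
    open my $pipe, '-|', "echo '$word' | tr a-z A-Z"
        or die "cannot open pipe: $!";
    my $newword = <$pipe>;
    close $pipe;
    chomp $newword;
    print "$newword\n";    # prints HELLO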

    Update: changed the backtick command .. i was just lazy the first time :)
      That's almost it. By saying "filter program", I think the OP means the words have to be piped to the filter program; filters usually take their input from STDIN.

      Also, from the description of the OP I assume his program will deal with different filters, set by (e.g.) commandline parameters.

      So
      my $newword = `sed $word`;
      should become
      my $newword = `echo '$word' | $filterprogram`;
      Where $filterprogram has to be set somewhere else in the script.
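      Putting it together, a minimal sketch could look like the following. The fallback `tr a-z A-Z` filter is just a placeholder; $filterprogram would normally come from a command-line parameter as described above.

      my $filterprogram = shift @ARGV || q{tr a-z A-Z};   # e.g. passed on the command line
      while ( my $line = <STDIN> ) {
          chomp $line;
          for my $word ( split ' ', $line ) {
              my $newword = `echo '$word' | $filterprogram`;
              chomp $newword;
              print "$newword\n";
          }
      }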

      $\=~s;s*.*;q^|D9JYJ^^qq^\//\\\///^;ex;print
      "What exactly is the tr/sed command doing?"

      From the Unix man pages, in a nutshell:

      "Sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed's ability to filter text in a pipeline which particularly distinguishes it from other types of editors."
      "The tr utility copies the standard input to the standard output with substitution or deletion of selected characters. The options specified and the string1 and string2 operands control translations that occur while copying characters and single-character collating elements."
        I think he meant in this case specifically, since Perl can do tr and sed tasks quite easily.
Re: Running a string through a filter program and receiving the output as a string
by graff (Chancellor) on Sep 30, 2005 at 03:04 UTC
    Perl can be viewed as an implementation of sed, tr and awk, along with many functions of the shell and of C. You won't need to run command lines from a perl script in order to do tr and sed work, because that sort of thing is an integral part of (and much more capably implemented in) Perl.

    In fact, you won't want to run commands for that kind of stuff, especially on a "one command per word" basis, because the overhead of launching and closing down a new shell to run such a command for every word will slow you down tremendously.
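    If you want to see the difference for yourself, a quick hypothetical comparison with the core Benchmark module makes the per-word shell overhead obvious (the `tr a-z A-Z` filter and the literal word are only stand-ins):

    use Benchmark qw(cmpthese);

    cmpthese( -2, {
        shell_out => sub {                        # spawn a shell for every word
            my $w = `echo 'hello' | tr a-z A-Z`;
        },
        native_tr => sub {                        # stay inside perl
            ( my $w = 'hello' ) =~ tr/a-z/A-Z/;
        },
    });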

    Just use the "tr///" and "s///" operators in Perl, along with the standard IO operations to read/write lines, and split to divide each line into individual words.
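    Something along these lines, for example. This is only a rough sketch: the file name is assumed to be in $file, and the tr/// and s/// patterns are placeholders, since we don't know what the real filters are supposed to do.

    open my $fh, '<', $file or die "cannot open $file: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        for my $word ( split ' ', $line ) {
            $word =~ tr/a-z/A-Z/;           # what `tr a-z A-Z` would do
            $word =~ s/colour/color/g;      # what a simple sed s/// would do
            print "$word\n";
        }
    }
    close $fh;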

Re: Running a string through a filter program and receiving the output as a string
by ioannis (Abbot) on Sep 30, 2005 at 06:03 UTC
    You can always use open2 if you want a bidirectional pipe. This program pipes the first word of each line to tr(1) and reads back the modified text:
    use IPC::Open2;
    open2( REA, WRI, 'tr a-z A-Z' ) or die $!;
    for (<DATA>) {
        print {\*WRI} [ (my $cmd, my @rest) = split ]->[0];
    }
    close WRI;
    print <REA>;
    __END__
    an apple a day
    boat for bloat
      Although, when using System V (SunOS 5.9) today, I found that a defunct process was left behind even after closing both the read and write channels. This was in a subroutine that handled repeated communications with a time-series database language interpreter, and at some point there was one defunct process for each time I had called the subroutine. To fix this, I had to modify the code to tell the parent explicitly to wait. It seems silly that I should have to do that, but on the other hand it is indeed partially documented in perlipc. Here is what I had to do to fix it, applied to the above piece of code instead of mine:
      use IPC::Open2;
      use POSIX ":sys_wait_h";
      my $pid = open2( REA, WRI, 'tr a-z A-Z' ) or die $!;
      for (<DATA>) {
          print {\*WRI} [ (my $cmd, my @rest) = split ]->[0];
      }
      close WRI;
      print <REA>;
      close REA;         # but believe it or not, under SysV the zombie is still there!
      waitpid($pid, 0);  # this will finally kill it cleanly!
      __END__
      # although exit from the program will eventually kill it of course
      an apple a day
      boat for bloat

      -M

      Free your mind

Re: Running a string through a filter program and receiving the output as a string
by TomDLux (Vicar) on Sep 30, 2005 at 13:41 UTC

    Piping things to external programs is slow. Starting up sed costs many microseconds or even milliseconds to spawn an external process, and if you do this in an inner loop, once for each of a few million words, it soon adds up to a long time. It's also unnecessary, since Perl can perform such transformations itself:

    my $old = 'hat|coat|shoe';
    my %new = (
        hat  => 'cap',
        coat => 'jacket',
        shoe => 'boot',
    );
    for my $word ( @words ) {
        $word =~ tr/A-Za-z/N-ZA-Mn-za-m/;   # decode ROT13
        $word =~ s/($old)/$new{$1}/ei;      # convert cheap clothes into expensive ones
    }

    --
    TTTATCGGTCGTTATATAGATGTTTGCA

Re: Running a string through a filter program and receiving the output as a string
by Moron (Curate) on Oct 03, 2005 at 14:43 UTC
    It is almost essential to use a bidirectional pipe (as already described by ioannis) when a step in the processing has to be done outside Perl before continuing. What IS too slow is closing and reopening the pipe more than once, in cases where you could instead pipe a complete dataset in (using print), read the processed results back out, and process them exhaustively in Perl before closing the read pipe. See the sketch below.
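    In other words, open the pipe once, push the whole word list through it, and only then read everything back. This is a rough sketch along the lines of ioannis's IPC::Open2 example above; it assumes the words are already in @words and that `tr a-z A-Z` stands in for the real filter.

    use IPC::Open2;

    my $pid = open2( my $rea, my $wri, 'tr a-z A-Z' );
    print {$wri} "$_\n" for @words;   # pipe the complete dataset in
    close $wri;                       # send EOF so the filter flushes its output
    my @filtered = <$rea>;            # read all the processed results back
    close $rea;
    waitpid( $pid, 0 );               # reap the child, as noted above
    chomp @filtered;

    For very large datasets you would still need to interleave reads and writes (or use a higher-level module) to avoid filling the pipe buffer; perlipc warns about this kind of deadlock with bidirectional pipes.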

    -M

    Free your mind