in reply to Filters within a filter...

As a former ksh-(ab)user with many utilities now in Perl ... let's just say that, in and of itself, there is little reason to port working programs. The only reasons to do so, IMO, are to add new features that are becoming prohibitively expensive (in time) to add in shell, or to learn Perl. I'm going to assume one of these is the case. I still have many shell scripts floating around alongside my Perl scripts (and Perl scripts that generate shell code which I can eval), and I still write new shell code from time to time. Right tool for the right job.

Personally, I would do one of the following - depending on the rest of the code. Remember: TMTOWTDI. Not all of them are great, but often more than one is sufficient or even desirable.

Option One

    sub remove_cruft {
        my @lines = @_;
        # do stuff to @lines;
        return @lines;
    }

    # use as:
    my @no_cruft = remove_cruft(@lines);
    # or:
    my @no_cruft = remove_cruft(<FH>);

Option Two

    # Using a reference:
    sub remove_cruft {
        my $lines = shift;
        # do stuff to @$lines;
    }

    # use as:
    remove_cruft(\@lines);
    # Note that the following won't work:
    # remove_cruft(<FH>);
    # you need to use a temporary variable

Option Three

    # Using prototypes (which are considered "evil" by some).
    # The (\@) prototype makes perl pass a reference to the caller's array.
    sub remove_cruft(\@) {
        my $lines = shift;
        # do stuff to @$lines
    }

    # use as:
    remove_cruft(@lines);
    # this still doesn't work:
    # remove_cruft(<FH>)

Option Four

    use IO::File;

    sub remove_cruft {
        my @lines;
        if (@_ > 1) {
            # lines were passed in.
            @lines = @_;
        }
        # otherwise we got either a filename or a filehandle.
        elsif (ref $_[0]) {
            # assume a reference is a filehandle.
            # <$_[0]> would be parsed as a glob, so use a plain scalar.
            my $fh = shift;
            @lines = <$fh>;
        }
        else {
            # must be a filename.
            my $fh = IO::File->new(shift, 'r');
            @lines = <$fh>;
        }
        # do stuff to @lines;
        @lines;
    }

I realise that's not the most efficient way to do that last one, but I'm too lazy to do all of the work in this little textarea box... :-)
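
To make the list-in, list-out style of Option One a bit more concrete, here is a minimal sketch of two filters chained together. The filter bodies (dropping blank lines and one-line HTML comments, trimming trailing whitespace) and the name remove_trailing_space are purely my assumptions for illustration; substitute whatever your own cruft looks like.

```perl
use strict;
use warnings;

# Hypothetical filter: drops blank lines and one-line HTML comments.
sub remove_cruft {
    my @lines = @_;
    return grep { !/^\s*$/ && !/^\s*<!--.*-->\s*$/ } @lines;
}

# Hypothetical second filter: strips trailing spaces/tabs, keeps the newline.
sub remove_trailing_space {
    my @lines = @_;
    s/[ \t]+$// for @lines;
    return @lines;
}

# Sample input standing in for lines slurped from a file.
my @lines = (
    "<html>\n",
    "\n",
    "<!-- generated by some tool -->\n",
    "<p>Hello</p>   \n",
);

# Each filter takes a list and returns a list, so they nest naturally.
my @no_cruft = remove_trailing_space(remove_cruft(@lines));
print @no_cruft;
```

Because every filter copies its input, this is easy to reason about but not the cheapest option for very large files; the reference-based variants avoid the copies.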

Re^2: Filters within a filter...
by mw (Sexton) on Feb 04, 2005 at 16:54 UTC
    * mw nods

    Well, there are two reasons why I'm porting these scripts over: first, as an exercise to learn Perl, and second, because I'm hoping for increased speed. I suppose most of the HTML files I'm likely to see here will fit in memory without too many problems, so I think I'll standardise on arrays of lines slurped from them.

    Thanks!

      I knew I missed an option or so ...

      Option Five

          sub remove_cruft {
              my $line = shift;
              # do stuff to one $line here.
              $line;
          }

          # use as:
          while (my $l = <$fh>) {
              $l = remove_cruft($l);
              $l = remove_other_cruft($l);
              $l = remove_yet_more_cruft($l);
              # or ...
              $l = $_->($l) foreach (\&remove_cruft, \&remove_other_cruft, \&remove_yet_more_cruft);
              # or ...
              $l = remove_yet_more_cruft(remove_other_cruft(remove_cruft($l)));
              # on second thought, don't do that last one :-)
          }

      Option Six

          sub remove_cruft {
              # do stuff to the single line in $_[0];
              # @_ aliases the caller's variable, so changes happen in place.
          }

          # use as:
          while (my $l = <$fh>) {
              remove_cruft($l);
              remove_other_cruft($l);
              remove_yet_more_cruft($l);
              # or ...
              $_->($l) foreach (\&remove_cruft, \&remove_other_cruft, \&remove_yet_more_cruft);
          }

      The options are endless. What I strongly discourage is writing shell script in Perl. I've seen it so many times that it makes me cringe. Whether it's my $data = `grep blah $filename` rather than open my $fh, '<', $filename; my $data = join '', grep { /blah/ } <$fh>; (and that is just the least perlish of the not-shell-script options), or system("mkdir $dir"); rather than mkdir $dir ... there are some really nifty Perl idioms that take care of these things for you. They say you can write Fortran in any language. The same is true of shell scripts :-)
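
      For what it's worth, here is a small self-contained sketch of those two replacements side by side. The scratch directory, the sample file contents and the name newdir are made up for the demo (assumptions, not anything from the original scripts); the point is only that open/grep and the built-in mkdir do the job without spawning a shell.

      ```perl
      use strict;
      use warnings;
      use File::Temp qw(tempdir);

      # Hypothetical scratch area so the sketch does not touch real files.
      my $base     = tempdir(CLEANUP => 1);
      my $filename = "$base/sample.txt";

      # Write a few sample lines to search through.
      open my $out, '>', $filename or die "Cannot write $filename: $!";
      print {$out} "foo\n", "blah one\n", "bar\n", "two blah\n";
      close $out or die "Cannot close $filename: $!";

      # Instead of: my $data = `grep blah $filename`;
      open my $fh, '<', $filename or die "Cannot read $filename: $!";
      my $data = join '', grep { /blah/ } <$fh>;
      close $fh;

      # Instead of: system("mkdir $dir");
      my $dir = "$base/newdir";
      mkdir $dir or die "Cannot create $dir: $!";

      print $data;
      ```

      Besides saving a fork, you get $! for error reporting and no surprises when a filename contains spaces or shell metacharacters.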