bagyi has asked for the wisdom of the Perl Monks concerning the following question:

Hi perlmonks,

I'm having some problems with program design. I have a stream of data (an array in my case), and I would like to run multiple processing steps on each element.

e.g.

stream --> | processor 1 | ---> | processor 2 | ----> etc.

sub processor_1 (@) { }
sub processor_2 (@) { }

Let's say processor 1 is a counter that counts element types, and processor 2 looks for the min and max elements in the stream. I already have separate functions for processors 1 and 2 that work on an array. Is there any way to compose these two functions?

I don't want to use a big for loop. map won't work either. I think what I want is called "one producer, multiple consumers".

Thanks! Bagyi

Replies are listed 'Best First'.
Re: multiple consumer of an array
by AppleFritter (Vicar) on Oct 18, 2015 at 10:29 UTC

    Sure, this is possible. Here's a quick-and-dirty solution where each processor_ subroutine is called on the same data sequentially:

    #!/usr/bin/perl
    use Modern::Perl '2014';
    use List::MoreUtils qw/minmax/;
    use Data::Dumper;

    # count odd/even elements
    sub process_1(@) {
        my %result = ();
        foreach my $element (@_) {
            my $type = ($element % 2) ? "odd" : "even";
            $result{$type}++;
        }
        return \%result;
    }

    # find minimum/maximum value
    sub process_2(@) {
        my @minmax = minmax(@_);
        return \@minmax;
    }

    sub compose {
        my @subs = @_;
        return sub {
            my @results = ();
            # apply each sub in turn and save the result
            foreach my $sub (@subs) {
                push @results, &$sub(@_);
            }
            return \@results;
        }
    }

    # sample data
    my @data = (1, 6, 813, 472, 134, 48, 99, 1398, 47);

    # compose subroutines
    my $process_all = compose(\&process_1, \&process_2);

    # apply all subroutines
    my $result = &$process_all(@data);
    say Dumper $result;

    Is this what you're looking for?

    BTW, you may also be interested in Dominus's Higher-Order Perl (also available as a PDF here).

Re: multiple consumer of an array
by Laurent_R (Canon) on Oct 18, 2015 at 11:16 UTC
    I definitely agree with AppleFritter: take a look at Mark-Jason Dominus's absolutely brilliant book, Higher-Order Perl: Transforming Programs with Programs (available online: http://hop.perl.plover.com/book/).

    Almost the whole book is about composing functions the way you want, but Chapter 4, on iterators, is probably the most relevant to what you are doing (although you will probably need to read at least part of the preceding chapters to fully understand Chapter 4).
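To give a flavour of the iterator idea, here is a minimal sketch of one producer feeding several consumers in a single pass. It is not taken from the book: the names `make_iterator`, `make_counter` and `make_minmax` are hypothetical, the sample data is borrowed from the reply above, and the iterator assumes the stream contains no undefined elements (undef marks the end).

```perl
use strict;
use warnings;

# Producer: an iterator that hands out one element per call, undef when done.
# (Assumes the stream itself never contains undef.)
sub make_iterator {
    my @items = @_;
    my $i = 0;
    return sub { return $i < @items ? $items[$i++] : undef };
}

# Consumer 1: counts odd/even elements.
sub make_counter {
    my %count;
    return {
        feed   => sub { $count{ $_[0] % 2 ? 'odd' : 'even' }++ },
        result => sub { return {%count} },
    };
}

# Consumer 2: tracks the minimum and maximum seen so far.
sub make_minmax {
    my ($min, $max);
    return {
        feed => sub {
            my ($x) = @_;
            $min = $x if !defined $min || $x < $min;
            $max = $x if !defined $max || $x > $max;
        },
        result => sub { return [ $min, $max ] },
    };
}

my $next      = make_iterator(1, 6, 813, 472, 134, 48, 99, 1398, 47);
my @consumers = ( make_counter(), make_minmax() );

# Single pass: every element is offered to every consumer.
while ( defined( my $item = $next->() ) ) {
    $_->{feed}->($item) for @consumers;
}

my ($counts, $minmax) = map { $_->{result}->() } @consumers;
print "odd=$counts->{odd} even=$counts->{even} min=$minmax->[0] max=$minmax->[1]\n";
```

Each consumer keeps its own state in a closure, so adding a third processor only means pushing another `feed`/`result` pair onto `@consumers`.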

Re: multiple consumer of an array
by johngg (Canon) on Oct 18, 2015 at 13:58 UTC

    This may not help you as it requires modification of your subroutines and assumes they will always be used together. It makes use of the fact that calling a subroutine from inside another in the form &sub_name without any parentheses or arguments passes the arguments (@_) of the calling routine to the called one. Obviously, modification of arguments before calling subsequent routines will break this model and careful consideration of order is required, e.g. we can't calculate the average before we've got the sum.

    $ perl -Mstrict -Mwarnings -E '
        my $rsSum = \ do { my $dummy };
        my $rsAvg = \ do { my $dummy };

        sub sum (@) {
            ${ $rsSum } += $_ for @_;
            &avg;
        }

        sub avg (@) {
            ${ $rsAvg } = ${ $rsSum } / scalar @_;
        }

        sum( 3, 4, 5, 6, 7 );

        say qq{Sum - ${ $rsSum }\nAvg - ${ $rsAvg }};'
    Sum - 25
    Avg - 5
    $

    I hope this is of interest.

    Update: Regarding order, calling &sum from sub avg before calculating the average also works.

    $ perl -Mstrict -Mwarnings -E '
        my $rsSum = \ do { my $dummy };
        my $rsAvg = \ do { my $dummy };

        sub avg (@) {
            &sum;
            ${ $rsAvg } = ${ $rsSum } / scalar @_;
        }

        sub sum (@) {
            ${ $rsSum } += $_ for @_;
        }

        avg( 3, 4, 5, 6, 7 );

        say qq{Sum - ${ $rsSum }\nAvg - ${ $rsAvg }};'
    Sum - 25
    Avg - 5
    $

    Cheers,

    JohnGG

Re: multiple consumer of an array
by shmem (Chancellor) on Oct 18, 2015 at 22:24 UTC

    Of course that is possible. Perl offers all that is needed; but what is your desired output? I'm guessing you want a pipeline, a.k.a. garbage in, garbage out: each subroutine returns the (possibly modified) input as its output, which is then used as input for the next pipeline subroutine.

    If you make your stream into an object which is returned by each processor, you can chain your processors like methods:

    my $result = $obj->processor_1->processor_2($argument)->processor_3;

    If you want to accumulate results for each pass, you could include a result container in the object itself. This is one way to do it, TIMTOWTDI of course:

    #!/usr/bin/perl
    use List::Util qw(min max);
    use Data::Dumper;

    my $ary = [12,3,4,5,7,15,8];
    my $obj = bless [ $ary, {} ], __PACKAGE__;

    sub count {
        $_[0]->[1]->{count} = @{$_[0]->[0]};
        $_[0];
    }
    sub min_max {
        $_[0]->[1]->{min} = min @{$_[0]->[0]};
        $_[0]->[1]->{max} = max @{$_[0]->[0]};
        $_[0];
    }
    sub esrever {
        $_[0]->[1]->{esrever} = [ reverse @{$_[0]->[0]} ];
        $_[0];
    }
    sub sort_inline {
        @{$_[0]->[0]} = sort {$a<=>$b} @{$_[0]->[0]};
        $_[0];
    }
    sub append_stuff_inline {
        $_ .= $_[1] for @{$_[0]->[0]};
        $_[0];
    }

    my $result = $obj->min_max->append_stuff_inline(".foo")->count->esrever->sort_inline;
    print Dumper($result);
    __END__
    $VAR1 = bless( [
        [
            '3.foo',
            '4.foo',
            '5.foo',
            '7.foo',
            '8.foo',
            '12.foo',
            '15.foo'
        ],
        {
            'min' => 3,
            'max' => 15,
            'esrever' => [
                '8.foo',
                '15.foo',
                '7.foo',
                '5.foo',
                '4.foo',
                '3.foo',
                '12.foo'
            ],
            'count' => 7
        }
    ], 'main' );

    The first element of the anonymous array $result is the modified array, the second is an anonymous hash containing the output of each pipeline subroutine.

    Of course, you could give the first argument to each method (i.e. $_[0]) a proper name such as $self; that makes no difference to the behaviour, but it might make it easier to tell what is going on.
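As a minimal sketch of that renaming, here is `min_max` written with a named `$self`. It keeps the same data layout as the code above (element 0 is the array, element 1 the result hash); the package name `Stream` and its `new` constructor are hypothetical and only there to make the example self-contained.

```perl
use strict;
use warnings;

package Stream;   # hypothetical package name, for illustration only

use List::Util qw(min max);

sub new {
    my ($class, @items) = @_;
    # element 0: the data, element 1: accumulated results
    return bless [ [@items], {} ], $class;
}

sub min_max {
    my ($self) = @_;
    my ($data, $results) = @$self;
    $results->{min} = min @$data;
    $results->{max} = max @$data;
    return $self;   # return the object so calls can be chained
}

package main;

my $stream = Stream->new(12, 3, 4, 5, 7, 15, 8)->min_max;
print "min=$stream->[1]{min} max=$stream->[1]{max}\n";
```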

    With the above subs, you could also say

    my @chain = qw(min_max count sort_inline esrever);
    my ($result) = map { $obj->$_ } @chain;

    which of course doesn't let you pass additional arguments to the methods called. That is why I didn't include append_stuff_inline(): without an argument it would be a no-op.
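One possible way around that limitation is to let a chain entry be either a plain method name or an array reference of the form `[ method, args... ]`. That convention is an assumption made for this sketch, not something the code above already supports; the two methods are trimmed-down copies with named arguments.

```perl
use strict;
use warnings;

# Same layout as above: [ data array, results hash ].
my $obj = bless [ [12, 3, 4], {} ], __PACKAGE__;

sub count {
    my ($self) = @_;
    $self->[1]{count} = @{ $self->[0] };
    return $self;
}

sub append_stuff_inline {
    my ($self, $suffix) = @_;
    $_ .= $suffix for @{ $self->[0] };
    return $self;
}

# Each step is a method name, or [ name, args... ] when arguments are needed.
my @chain = ( 'count', [ append_stuff_inline => '.foo' ] );

for my $step (@chain) {
    my ($method, @args) = ref $step ? @$step : ($step);
    $obj = $obj->$method(@args);
}

print join(',', @{ $obj->[0] }), " count=$obj->[1]{count}\n";
```

Because every method returns the object, the loop simply reassigns `$obj` at each step, which mirrors what the arrow chain does.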

    Perl allows you to

    • set up arbitrary data structures
    • modify subroutine arguments in place
    • return scalars, arrays, or whatever was passed in
    • ...

    It is up to you how you organize your data, what you pass into your subs, what you modify and what you return.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: multiple consumer of an array
by GotToBTru (Prior) on Oct 18, 2015 at 04:57 UTC

    Your question doesn't make much sense to me. Are the processors operating in parallel or sequentially? Will one processor potentially modify the data?

    Dum Spiro Spero