wardy3 has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks

I'm having trouble with the seek function - it's failing with "Illegal seek".

What I want to do is examine the first few lines of command output to determine which subroutine to call next. Then call that subroutine with the entire file.

This seems to work ok for an actual file:
seek $fh, 0, 0 or die $!;

But it doesn't work if the file handle is something like:
open my $fh, 'ls |';

I'm guessing this is working as designed, but is there a clever way I can do this? Please keep in mind the output is pretty big and can sometimes be massive - around 100 MB for an average run, and even more is easily possible. (It's XML output, and I pass the file handle to XML::Parser.)

Thanks :)

Re: Can I seek in a command's piped output?
by graff (Chancellor) on Feb 12, 2009 at 02:42 UTC
    The "seek" function only works on disk files (no matter what programming language you're using), and cannot work on pipes.

    If you need to make a routing decision after reading several lines from a pipe, you'll need to hold everything you read in a scalar or array until you are able to make your decision, and then send the accumulated stuff, along with whatever follows, to the appropriate recipient.

    Update: So for example, something like:

    my @holdem;
    my %dispatch = ( one => \&subone, two => \&subtwo );
    my $target;

    open( $pipe, "-|", "ls" );
    while ( <$pipe> ) {
        if ( $target ) {
            $target->( $_ );
        }
        else {
            push @holdem, $_;
            my $decision = look_for_evidence( $_ );
            if ( $decision =~ /one|two/ ) {
                $target = $dispatch{$decision};
                $target->( $_ ) for ( @holdem );
            }
        }
    }
    (not tested)
      Thanks graff

      I'm not sure I follow your suggestion fully though.

      As I said, I'm trying to pre-check some XML before I let XML::Parser do its magic.

      my $parser = XML::Parser->new(
          Handlers => {
              Start => \&s_start,
              End   => \&s_end,
              Final => \&s_final,
          }
      );
      my %device_info = $parser->parse($fh);

      It's the call to parse with a parameter of $fh that I'm not sure how to change. I need to peek inside $fh beforehand and set up different parser handler routines, depending on what I see. If I sneak a look before calling the parser, the first few lines of XML are lost and it dies with a malformed error, which is fair enough.

      So, I need to remember the first 5 or so lines (no problem) and then somehow insert these in the front of the file that parse is going to read? But I don't understand how ...

      Thanks

        Ah -- sorry, I didn't catch the issue about passing the pipe file handle to XML::Parser.

        If the idea of using IO::Unread doesn't work out (though I expect it will do fine), there's also the possibility of using a script like the one I suggested, but with two different command lines in the dispatch table instead of two different subs: one is your (separate) XML parser script invoked in a way that handles one type of stream, and the other(s) handle the other type(s) of stream. You just launch the right one as a subprocess:

        my @holdem;
        my %dispatch = (
            one => [ qw/my_parser --this_way/ ],
            two => [ qw/my_parser --that_way/ ],
        );
        my $target;

        open( $pipe, "-|", "some_xml_generator" );   # not "ls", obviously
        while ( <$pipe> ) {
            if ( $target ) {
                print $target $_;
            }
            else {
                push @holdem, $_;
                my $decision = look_for_evidence( $_ );
                if ( $decision =~ /one|two/ ) {
                    open( $target, "|-", @{$dispatch{$decision}} )
                        or die "failed to launch: @{$dispatch{$decision}}: $!\n";
                    print $target $_ for ( @holdem );
                }
            }
        }
        Your separate parser script uses the command line option to control how to set up event handlers for parsing.
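
        Roughly, the parser script itself only needs to look at that option before wiring up its handlers. A sketch (not tested; the option names and the handler subs are just placeholders for your real s_start/s_end/s_final routines):

        #!/usr/bin/perl
        # my_parser -- reads XML on STDIN; the option picks the handler set
        use strict;
        use warnings;
        use Getopt::Long;
        use XML::Parser;

        # placeholder handlers
        sub this_start {}   sub this_end {}
        sub that_start {}   sub that_end {}

        GetOptions( 'this_way' => \my $this_way, 'that_way' => \my $that_way )
            or die "usage: my_parser --this_way | --that_way\n";

        my $handlers = $this_way
            ? { Start => \&this_start, End => \&this_end }
            : { Start => \&that_start, End => \&that_end };

        my $parser = XML::Parser->new( Handlers => $handlers );
        $parser->parse( \*STDIN );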

        (updated snippet to fix stupid mistakes, like forgetting to use a list for "command", "option" when opening the pipe to the parser script.)
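
        For completeness, here is roughly what the IO::Unread route mentioned above could look like, keeping everything in one script: read a few lines, decide, push them back onto the handle, then hand the same handle to XML::Parser. Again a sketch (not tested; the command, the peek test and the handler subs are placeholders):

        use strict;
        use warnings;
        use IO::Unread qw(unread);
        use XML::Parser;

        # placeholder handler subs and dispatch table
        sub a_start {}   sub a_end {}
        sub b_start {}   sub b_end {}
        my %handlers_for = (
            type_a => { Start => \&a_start, End => \&a_end },
            type_b => { Start => \&b_start, End => \&b_end },
        );

        open my $fh, '-|', 'some_xml_generator' or die "can't run command: $!";

        # peek at the first few lines to decide which handler set to use
        my @peek;
        for ( 1 .. 5 ) {
            my $line = <$fh>;
            last unless defined $line;
            push @peek, $line;
        }
        # this test is just a stand-in for the real decision logic
        my $style = ( grep { /<device_list/ } @peek ) ? 'type_a' : 'type_b';

        # push the peeked lines back so XML::Parser sees the stream from the start
        unread $fh, @peek;

        my $parser = XML::Parser->new( Handlers => $handlers_for{$style} );
        $parser->parse($fh);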