Re: peek at STDIN, to determine data type and then pass STDIN to a parser

Re^2: peek at STDIN, to determine data type and then pass STDIN to a parser by aral (Acolyte) on Jan 06, 2015 at 14:43 UTC
Thank you for the suggestion. Are you still talking about possibilities for STDIN? For normal filehandles I would be able to use a seek operation anyways. My problem seems to be limited to pipes.
Re^3: peek at STDIN, to determine data type and then pass STDIN to a parser by MidLifeXis (Monsignor) on Jan 06, 2015 at 15:01 UTC
Yes. It is an option. It may not be the best option for your uses.

I use iterators when schlepping event logs through my monitoring system, whether they come from a real-time event queue, stored log files, or current state of a system. To my consumer software, all of the data looks the same.

The reason I suggested this technique is that it does not significantly increase the memory or filesystem requirements (as reading files fully into memory or storing in a temp file and processing would^Wcould do). It also allows the consumer (your XML processing in this case) to treat it as just a file handle.

    # UNTESTED
    #
    # This is for line-by-line reading, not block-by-block reading.
    # Adjust as necessary.
    sub create_iterator {
        my $original_fh  = \*STDIN;
        my @cached_data  = $original_fh->getline;    # enough to id the file
        my $data_type_id = identify_data_type( \@cached_data );    # Remove from @cached if provided

        my $iterator = iter( sub {
            my $retval;
            if ( $data_type_id ) {
                $retval       = $data_type_id;
                $data_type_id = undef;
            }
            elsif ( @cached_data ) {
                $retval = shift( @cached_data );
            }
            else {
                $retval = $original_fh->getline;
            }
            return $retval;
        } );
        return $iterator;
    }

--MidLifeXis
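For readers who want something they can run, here is a minimal, self-contained sketch of the same peek-and-replay idea using only core Perl. It is not the code from the post above: it uses a plain closure instead of `iter`, returns the detected type separately rather than as the first item of the stream, and reads from an in-memory handle standing in for a pipe. The names `peek_and_replay` and the one-line `identify_data_type` classifier are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical classifier: looks only at the first cached line.
sub identify_data_type {
    my ($lines) = @_;
    return 'unknown' unless @$lines;
    return $lines->[0] =~ /^\s*</ ? 'xml' : 'text';
}

# Returns ($type, $next). $next->() replays the peeked line first, then
# continues reading from the handle; it returns undef at end of input.
sub peek_and_replay {
    my ($fh) = @_;
    my $first  = readline($fh);
    my @cached = defined $first ? ($first) : ();
    my $type   = identify_data_type( \@cached );
    my $next   = sub {
        return shift @cached if @cached;
        return scalar readline($fh);
    };
    return ( $type, $next );
}

# Demo: an in-memory handle stands in for a non-seekable pipe on STDIN.
my $input = "<?xml version=\"1.0\"?>\n<root/>\n";
open my $fh, '<', \$input or die "cannot open in-memory handle: $!";

my ( $type, $next ) = peek_and_replay($fh);
my @lines;
while ( defined( my $line = $next->() ) ) {
    push @lines, $line;
}
print "type=$type, lines=", scalar(@lines), "\n";
```

Nothing here needs `seek`, so the same closure should behave the same way on a pipe; only the peeked line is held in memory.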
Re^4: peek at STDIN, to determine data type and then pass STDIN to a parser by aral (Acolyte) on Jan 08, 2015 at 08:57 UTC
I haven't tested that yet, but if I understand what you are doing here correctly, then that is a great idea! Thank you very much, very elegant solution for my problem! I'll get started right away on implementing / testing that. Never mind my workaround, elegant beats workaround!
Re^4: peek at STDIN, to determine data type and then pass STDIN to a parser by aral (Acolyte) on Jan 08, 2015 at 10:12 UTC
Okay - I have gotten your code to work, and to do what I want. Now this may be a beginner's question, but: how on earth do I get XML::Twig's parse function to use the iterator instead of a filehandle? `my $inputHandle = create_iterator(); $t->parse (<$inputHandle>);` exits with the error message "Not a GLOB reference at ./script.pl line xy." And `$t->parse ($inputHandle);` spits out: "not well-formed (invalid token) at line 1, column 4, byte 4 at /usr/lib/x86_64-linux-gnu/perl5/5.20/XML/Parser.pm line 187. at ./script.pl line xy." So how do I typecast the iterator in order to treat it like a file handle?
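One general Perl way to make an iterator look like a file handle is a tied handle, where `<$fh>` and `read` are routed to methods you supply. The sketch below is only an illustration of that mechanism with core Perl: `IteratorHandle` is a hypothetical class name, the iterator is assumed to be callable as `$next->()` and to return undef at end of input, and whether XML::Twig's `parse` actually reads through the tie interface has not been tested here.

```perl
use strict;
use warnings;

package IteratorHandle;    # hypothetical helper, not part of any CPAN module

sub TIEHANDLE {
    my ( $class, $next ) = @_;
    return bless { next => $next, buf => '' }, $class;
}

# Next chunk of data: leftover from a block read first, then the iterator.
sub _pull {
    my ($self) = @_;
    if ( length $self->{buf} ) {
        my $chunk = $self->{buf};
        $self->{buf} = '';
        return $chunk;
    }
    return $self->{next}->();
}

# Invoked for <$fh>: one item in scalar context, the rest in list context.
sub READLINE {
    my ($self) = @_;
    return $self->_pull unless wantarray;
    my @rest;
    while ( defined( my $line = $self->_pull ) ) { push @rest, $line }
    return @rest;
}

# Invoked for read($fh, $buf, $len, $offset); $_[0] aliases the caller's buffer.
sub READ {
    my $self = shift;
    my ( undef, $len, $offset ) = @_;
    $offset //= 0;
    while ( length( $self->{buf} ) < $len ) {
        my $line = $self->{next}->();
        last unless defined $line;
        $self->{buf} .= $line;
    }
    my $chunk = substr( $self->{buf}, 0, $len, '' );
    $_[0] = '' unless defined $_[0];
    substr( $_[0], $offset ) = $chunk;
    return length $chunk;
}

package main;
use Symbol qw(gensym);

# Stand-in for the iterator returned by create_iterator().
my @pending  = ( "<root>\n", "<item/>\n", "</root>\n" );
my $iterator = sub { return shift @pending };

my $fh = gensym();    # anonymous glob, usable as a handle argument
tie *$fh, 'IteratorHandle', $iterator;

my $first = <$fh>;               # routed to READLINE
read( $fh, my $block, 4 );       # routed to READ
my $rest = join '', <$fh>;       # READLINE in list context
print $first, $block, $rest;
```

The "Not a GLOB reference" error is what Perl reports when `<...>` is applied to something that is not a handle, which is why some glob-backed wrapper like this is needed before the diamond operator can be used on the iterator.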
Re^5: peek at STDIN, to determine data type and then pass STDIN to a parser by MidLifeXis (Monsignor) on Jan 09, 2015 at 17:48 UTC
Re^6: peek at STDIN, to determine data type and then pass STDIN to a parser by aral (Acolyte) on Jan 12, 2015 at 13:54 UTC
Re^6: peek at STDIN, to determine data type and then pass STDIN to a parser by aral (Acolyte) on Feb 17, 2015 at 11:44 UTC
Re^2: peek at STDIN, to determine data type and then pass STDIN to a parser by Anonymous Monk on Jan 06, 2015 at 19:41 UTC
What is the difference between reading line by line using the filehandle with the diamond operator and using an iterator?
Re^3: peek at STDIN, to determine data type and then pass STDIN to a parser by MidLifeXis (Monsignor) on Jan 06, 2015 at 21:07 UTC
Nothing if you are just reading. The benefit can arise if you want to rearrange, inject, or modify the incoming data on the file handle and make the resulting stream look like a plain old file handle. I understand the OP to want to maybe inject a proper doctype into the data stream if needed. Perhaps not the best tool for this particular case, but a tool for the generic case. --MidLifeXis
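As a rough illustration of the "inject" case, the sketch below wraps one line source in another closure that adds a doctype line when the stream does not already carry one. It is a simplified assumption of what such a filter might look like: `with_doctype` and the `log.dtd` identifier are invented for the example, and the check only looks at the first line or two of the stream.

```perl
use strict;
use warnings;

# Wraps a line source ($next returns one line per call, undef at end).
# If the first line after an optional XML declaration is not a DOCTYPE,
# $doctype is injected in front of it.
sub with_doctype {
    my ( $next, $doctype ) = @_;
    my @queue;
    my $started = 0;
    return sub {
        unless ($started) {
            $started = 1;
            my $first = $next->();
            if ( defined $first && $first =~ /^<\?xml\b/ ) {
                push @queue, $first;      # keep the declaration first
                $first = $next->();
            }
            push @queue, $doctype
                if defined $first && $first !~ /^<!DOCTYPE\b/;
            push @queue, $first if defined $first;
        }
        return @queue ? shift @queue : $next->();
    };
}

# Demo with an array-backed source standing in for the real input stream.
my @lines  = ( "<?xml version=\"1.0\"?>\n", "<log>\n", "</log>\n" );
my $source = sub { return shift @lines };
my $stream = with_doctype( $source, "<!DOCTYPE log SYSTEM \"log.dtd\">\n" );

my @out;
while ( defined( my $line = $stream->() ) ) {
    push @out, $line;
}
print @out;
```

The consumer only ever calls `$stream->()`, so it cannot tell injected lines from original ones, which is the point being made about iterators here.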