Re^2: Avoid memory error while extracting text block

Re^3: Avoid memory error while extracting text block by BrowserUk (Patriarch) on Apr 14, 2016 at 04:53 UTC
If the intent is to split this compound file into separate files, I think I'd do something like this:

```perl
#! perl -sw
use strict;

# assuming the name of the compound file is supplied on the command line
until ( eof() ) {
    my @buffer = scalar( <> );                  ## Put the <DOCUMENT> tag into a clean buffer
    push @buffer, scalar( <> ), scalar( <> );   ## Ditto <TYPE> & <SEQUENCE>
    my( $filename ) = ( my $line = <> ) =~ m[<FILENAME>(\S+)];
    push @buffer, $line;
    open OUT, '>', $filename or die $!;
    print OUT for @buffer;
    ## Copy the body line by line, stopping at the closing tag
    print OUT until ( $_ = <> ) =~ m[</DOCUMENT>];
    close OUT;
}
```

Note: That is untested code and will probably need tweaks. E.g., I'm not convinced that it will print the final tag to the output files; but then maybe you'd want to strip those anyway.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :) In the absence of evidence, opinion is indistinguishable from prejudice.
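Since the snippet above is explicitly untested, here is a minimal, self-contained sketch of the same line-by-line idea that can be run without the real data. Everything specific in it is an assumption for illustration: the sample documents, the sub name `split_documents`, and writing into a temporary directory via `File::Temp`. Unlike the snippet above, it looks for the `<FILENAME>` line wherever it occurs in the header and it also writes the closing `</DOCUMENT>` tag.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical sample: two documents in the layout the thread describes.
my $compound = <<'END';
<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<FILENAME>a.txt
body of a
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-1
<SEQUENCE>2
<FILENAME>b.txt
body of b
</DOCUMENT>
END

# Read $in line by line and write each document to its own file under $dir.
# Header lines are buffered only until the file name is known; after that
# each line is written as soon as it is read.
sub split_documents {
    my ( $in, $dir ) = @_;
    my ( $out, @header, @written );
    while ( my $line = <$in> ) {
        if ( !$out ) {
            push @header, $line;
            next unless $line =~ m[<FILENAME>(\S+)];
            my $path = File::Spec->catfile( $dir, $1 );
            open $out, '>', $path or die "Cannot write $path: $!";
            print {$out} @header;
            push @written, $path;
            @header = ();
            next;
        }
        print {$out} $line;
        if ( $line =~ m[</DOCUMENT>] ) {    # the closing tag is kept
            close $out or die "Cannot close output file: $!";
            undef $out;
        }
    }
    return @written;
}

my $dir = tempdir( CLEANUP => 1 );
open my $in, '<', \$compound or die "Cannot open sample: $!";
my @files = split_documents( $in, $dir );
close $in;
print "wrote $_\n" for @files;
```

To run it against a real file, open that file instead of the in-memory string and pass a real output directory. Note that the file names come straight from the data, so a production version would want to sanitize them first.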
Re^4: Avoid memory error while extracting text block by wrkrbeee (Scribe) on Apr 14, 2016 at 12:59 UTC
Thank you!!
Re^3: Avoid memory error while extracting text block by NetWallah (Canon) on Apr 14, 2016 at 03:48 UTC
If the "end of block" identifier cannot occur inside the document, you could read the file "one document at a time":

```perl
open my $fh, "<", "File/name" or die "Could not open file: $!";
$/ = "</DOCUMENT>";    # each read now returns everything up to and including this tag
while ( my $doc = <$fh> ) {
    # Process, or search through contents of $doc....
}
close $fh;
```

This is not an optical illusion, it just looks like one.
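As a rough illustration of the record-separator approach, the sketch below reads one document per iteration and pulls out each `<FILENAME>` value. The sample data and the sub name `document_names` are assumptions made for the example; it reads from an in-memory string so that it runs as-is, and it uses `local $/` so the changed separator does not leak into other code.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real compound file would be opened by name.
my $sample = <<'END';
<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<FILENAME>a.txt
body of a
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-1
<SEQUENCE>2
<FILENAME>b.txt
body of b
</DOCUMENT>
END

# Return the <FILENAME> value of every document read from $fh.
sub document_names {
    my ($fh) = @_;
    local $/ = "</DOCUMENT>";    # one record per document, restored on return
    my @names;
    while ( my $doc = <$fh> ) {
        # The text after the last closing tag (here a lone newline) comes
        # back as a final record with no <FILENAME>; skip it.
        next unless $doc =~ m[<FILENAME>(\S+)];
        push @names, $1;
    }
    return @names;
}

open my $fh, '<', \$sample or die "Could not open in-memory file: $!";
my @names = document_names($fh);
close $fh;
print "$_\n" for @names;
```

Memory use is bounded by the largest single document rather than by the whole file, which is the point of the technique; it only helps if no individual document is itself too large to hold.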
Re^3: Avoid memory error while extracting text block by wrkrbeee (Scribe) on Apr 14, 2016 at 01:55 UTC
Question: at the beginning of your code, you have a WHILE statement like so: `while ( $_ = <$FH_IN> ) !~ /START TOKEN/;` I'm having difficulty interpreting this statement. So, here's my take, which I know is incorrect: while the Perl special operator has the same value as the current file handle/name, then I see the complement to the binding operator (i.e., does not bind) to my start token. I can't get there. What is the statement telling me?
Re^4: Avoid memory error while extracting text block by GotToBTru (Prior) on Apr 14, 2016 at 03:41 UTC
Common usage has conditioned us to believe that the test condition of the while loop is what's inside the parentheses. In this case, the negative match operator is part of it as well. The body of the while loop is the `1`.

`1 while ($_ = <$FH_IN>) !~ /START TOKEN/;`

is the equivalent of:

`while ( ($_ = <$FH_IN>) !~ /START TOKEN/ ) { 1 }`

That is: read a line into `$_`, test it against the pattern, and keep going for as long as it does not match. You can get Perl to show you how it interprets a statement like that using the B::Deparse module.

```
$: perl -MO=Deparse -e '1 while ($_ = <>) !~ /START TOKEN/'
'???' until ($_ = <ARGV>) =~ /START TOKEN/;
-e syntax OK
```

Perl turns the while with a negative match operator into an until with the positive operator.

But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)
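To try the idiom on something concrete, the sketch below runs the short statement-modifier form and a spelled-out loop over the same in-memory text and shows that both stop on the line containing the token. The sample text and variable names are assumptions for the example. The spelled-out form adds a `defined` check, because at end of file the read returns `undef`, which never matches, so the short form would loop forever on input that lacks the token.

```perl
use strict;
use warnings;

# Hypothetical input; $text stands in for the real file.
my $text = "junk 1\njunk 2\nSTART TOKEN here\npayload\n";

# Short form from the thread: the constant 1 is the do-nothing loop body.
open my $fh1, '<', \$text or die "Cannot open sample: $!";
1 while ( $_ = <$fh1> ) !~ /START TOKEN/;
my $short = $_;
close $fh1;

# Spelled-out form with an end-of-file guard.
open my $fh2, '<', \$text or die "Cannot open sample: $!";
my $long;
while ( defined( $long = <$fh2> ) ) {
    last if $long =~ /START TOKEN/;
}
my $next = <$fh2>;    # reading continues right after the token line
close $fh2;

print "short form stopped on: $short";
print "long form stopped on:  $long";
print "next line:             $next";
```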